StanceVis Prime: visual analysis of sentiment and stance in social media texts

Kostiantyn Kucher1, Rafael Messias Martins1, Carita Paradis2, Andreas Kerren1
1Department of Computer Science and Media Technology, Linnaeus University, Växjö, Sweden
2Centre for Languages and Literature, Lund University, Lund, Sweden

Abstract

Text visualization and visual text analytics methods have been successfully applied to various tasks related to the analysis of individual text documents and large document collections, such as summarization of main topics or identification of events in discourse. Visualization of sentiments and emotions detected in textual data has also become an important topic of interest, especially with regard to data originating from social media. Despite the growing interest in this topic, the research problem of detecting and visualizing various stances, such as rudeness or uncertainty, has not been adequately addressed by existing approaches. The challenges associated with this problem include the development of the underlying computational methods and the visualization of the corresponding multi-label stance classification results. In this paper, we describe our work on a visual analytics platform, called StanceVis Prime, which has been designed for the analysis of sentiment and stance in temporal text data from various social media data sources. The use case scenarios intended for StanceVis Prime include social media monitoring and research in sociolinguistics. The design was motivated by the requirements of collaborating domain experts in linguistics as part of a larger research project on stance analysis. Our approach involves consuming documents from several text stream sources and applying sentiment and stance classification, resulting in multiple data series associated with the source texts. StanceVis Prime provides end users with an overview of similarities between the data series based on dynamic time warping analysis, as well as detailed visualizations of data series values. Users can also retrieve the documents corresponding to the data series and conduct both distant and close reading of them. We demonstrate our approach with case studies involving political targets of interest and several social media data sources, and report preliminary user feedback received from a domain expert.
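
To illustrate the kind of similarity overview mentioned above, the following is a minimal sketch of the textbook dynamic time warping (DTW) distance applied to pairs of data series. It is not the StanceVis Prime implementation: the function names, the absolute-difference local cost, and the example series labels and values are all assumptions made for illustration.

```python
from itertools import combinations
from math import inf


def dtw_distance(a, b):
    """Classic O(n*m) DTW distance between two numeric sequences.

    Assumption: the local cost is the absolute difference of values.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] also aligned with an earlier b
                cost[i][j - 1],      # b[j-1] also aligned with an earlier a
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


def pairwise_dtw(series):
    """Return {(name_a, name_b): distance} for every unordered pair."""
    return {
        (p, q): dtw_distance(series[p], series[q])
        for p, q in combinations(sorted(series), 2)
    }


# Hypothetical per-interval counts for three classification categories.
example_series = {
    "negativity": [0, 1, 2, 1, 0],
    "positivity": [2, 1, 0, 1, 2],
    "uncertainty": [0, 0, 1, 2, 1, 0],
}

for pair, dist in pairwise_dtw(example_series).items():
    print(pair, dist)
```

A small distance indicates that two series follow a similar shape even when one is shifted or stretched in time. A pairwise distance table of this kind could then be fed into a projection or clustering step to produce an overview.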
