Amplifying the music listening experience through song comments on music streaming platforms

Longfei Chen1,2, Qianyu Liu2,1, Chenyang Zhang3, Yangkun Huang4, Zhenhui Peng5, Haipeng Zeng6, Zhida Sun7, Xiaojuan Ma8, Quan Li1,2
1Shanghai Engineering Research Center of Intelligent Vision and Imaging China, Shanghai, China
2School of Information Science and Technology, ShanghaiTech University, Shanghai, China
3Department of Computer Science, University of Illinois at Urbana-Champaign, Champaign, USA
4Tandon School of Engineering, New York University, New York, USA
5School of Artificial Intelligence, Sun Yat-sen University, Guangzhou, China
6School of Intelligent Systems Engineering, Sun Yat-sen University, Guangzhou, China
7College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China
8Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong, China

Abstract

Music streaming services are increasingly popular among younger listeners, who seek social experiences through personal expression and by sharing subjective feelings in comments. However, current platforms often overlook these emotional aspects, which limits listeners' ability to find music that evokes specific personal feelings. To address this gap, this study proposes a novel approach that uses deep learning methods to capture contextual keywords, sentiments, and emotion-induction mechanisms from song comments. The study augments an existing music app with two features: a presentation of the tags that best represent a song's comments, and a novel map metaphor that reorganizes song comments by chronological order, content, and sentiment. The effectiveness of the proposed approach is validated through a usage scenario and a user study, which show that it can improve users' experience of exploring songs and browsing the comments that interest them. This study contributes to the advancement of music streaming services by offering a more personalized and emotionally rich music experience for younger generations.

Keywords

#music streaming #deep learning #sentiment #song comments #user experience
