Guiding the Training of Distributed Text Representation with Supervised Weighting Scheme for Sentiment Analysis

Data Science and Engineering - Tập 2 - Trang 178-186 - 2017

Zhe Zhao¹, Tao Liu¹, Shen Li², Bofang Li¹, Xiaoyong Du¹

¹School of Information, Renmin University of China, Beijing, China

²Institute of Chinese Information Processing, Beijing Normal University, Beijing, China

Tóm tắt

With the rapid growth of social media, sentiment analysis has received growing attention from both academic and industrial fields. One line of researches for sentiment analysis is to feed bag-of-words (BOW) text representation into classifiers. Usually, raw BOW requires weighting schemes to obtain better performance, where important words are given more weights while unimportant ones are given less weights. Another line of researches focuses on neural models, where distributed text representations are learned from raw texts automatically. In this paper, we take advantages of techniques in both lines of researches. We use words’ weights to guide neural models to focus on important words. Various supervised weighting schemes are explored in this work. We discover that better text features are learned for sentiment analysis when suitable weighting schemes are applied upon neural models.

Tài liệu tham khảo

Church KW, Hanks P (1990) Word association norms, mutual information, and lexicography. Comput Linguist 16(1):22–29 Dai AM, Le QV (2015) Semi-supervised sequence learning. In: Advances in neural information processing systems 28: annual conference on neural information processing systems 2015, December 7–12, Montreal, Quebec, Canada, pp 3079–3087 Deng ZH, Luo KH, Yu HL (2014) A study of supervised term weighting scheme for sentiment analysis. Expert Syst Appl 41(7):3506–3513 Denil M, Demiraj A, Kalchbrenner N, Blunsom P, de Freitas N (2014) Modelling, visualising and summarising documents with a single convolutional neural network. arXiv preprint arXiv:1406.3830 Fan RE, Chang KW, Hsieh CJ, Wang XR, Lin CJ (2008) Liblinear: a library for large linear classification. J Mach Learn Res 9:1871–1874 Goldberg Y (2016) A primer on neural network models for natural language processing. J Artif Intell Res 57:345–420 Iyyer M, Manjunatha V, Boyd-Graber JL, Daumé III H (2015) Deep unordered composition rivals syntactic methods for text classification. In: Proceedings of the 53rd annual meeting of the association for computational linguistics and the 7th international joint conference on natural language processing of the asian federation of natural language processing, ACL 2015, July 26–31, Beijing, China, volume 1: long papers, pp 1681–1691 Johnson R, Zhang T (2015) Semi-supervised convolutional neural networks for text categorization via region embedding. In: Advances in neural information processing systems 28: annual conference on neural information processing systems 2015, December 7–12, Montreal, Quebec, Canada, pp 919–927 Kim Y (2014) Convolutional neural networks for sentence classification. arXiv preprint arXiv:1408.5882 Kim Y, Zhang O (2014) Credibility adjusted term frequency: A supervised term weighting scheme for sentiment analysis and text classification. arXiv preprint arXiv:1405.3518 Le QV, Mikolov T (2014) Distributed representations of sentences and documents. ICML 14:1188–1196 Levy O, Goldberg Y (2014) Neural word embedding as implicit matrix factorization. In: Advances in neural information processing systems 27: annual conference on neural information processing systems 2014, December 8–13, Montreal, Quebec, Canada, pp 2177–2185 Levy O, Goldberg Y, Dagan I (2015) Improving distributional similarity with lessons learned from word embeddings. Trans Assoc Comput Linguist 3:211–225 Li B, Liu T, Du X, Zhang D, Zhao Z (2015) Learning document embeddings by predicting n-grams for sentiment classification of long movie reviews. arXiv preprint arXiv:1512.08183 Li J (2014) Feature weight tuning for recursive neural networks. arXiv preprint arXiv:1412.3714 Maas AL, Daly RE, Pham PT, Huang D, Ng AY, Potts C (2011) Learning word vectors for sentiment analysis. In: Proceedings of the 49th annual meeting of the association for computational linguistics: human language technologies, Vol 1. pp 142–150. Association for Computational Linguistics Martineau J, Finin T (2009) Delta tfidf: an improved feature space for sentiment analysis. Icwsm 9:106 Mesnil G, Mikolov T, Ranzato M, Bengio Y (2014) Ensemble of generative and discriminative techniques for sentiment analysis of movie reviews. arXiv preprint arXiv:1412.5335 Mikolov T, Sutskever I, Chen K, Corrado GS, Dean J (2013) Distributed representations of words and phrases and their compositionality. In: Advances in neural information processing systems 26: 27th annual conference on neural information processing systems 2013, December 5–8, Lake Tahoe, NV, USA, pp 3111–3119 Paltoglou G, Thelwall M (2010) A study of information retrieval weighting schemes for sentiment analysis. In: Proceedings of the 48th annual meeting of the association for computational linguistics, pp 1386–1395. Association for Computational Linguistics Pang B, Lee L, Vaithyanathan S (2002) Thumbs up?: sentiment classification using machine learning techniques. In: Proceedings of the ACL-02 conference on Empirical methods in natural language processing, Vol 10, pp 79–86. Association for Computational Linguistics Pennington J, Socher R, Manning CD (2014) Glove: global vectors for word representation. EMNLP 14:1532–1543 Socher R, Huval B, Manning CD, Ng AY (2012) Semantic compositionality through recursive matrix-vector spaces. In: Proceedings of the 2012 Joint Conference on empirical methods in natural language processing and computational natural language learning, pp 1201–1211. Association for Computational Linguistics Wang S, Manning CD (2012) Baselines and bigrams: simple, good sentiment and topic classification. In: Proceedings of the 50th annual meeting of the asssociation for computational linguistics: short papers, Vol 2, pp 90–94. Association for Computational Linguistics Zhao Z, Liu T, Hou X, Li B, Du X (2016) Distributed text representation with weighting scheme guidance for sentiment analysis. In: Asia-Pacific web conference, Springer, pp 41–52

Scholar Hub - Công cụ hỗ trợ trích dẫn và phân tích khoa học Việt Nam

Về chúng tôi

Scholar Hub là công cụ hỗ trợ trích dẫn và phân tích các bài báo, công bố khoa học Việt Nam. Công cụ trợ giúp người nghiên cứu, tạp chí, đơn vị nghiên cứu tra cứu, phân tích và thống kê dữ liệu nghiên cứu khoa học tại Việt Nam và quốc tế.
ScholarHub KHÔNG đăng thông tin tổng hợp, KHÔNG đăng lại nội dung từ các trang báo chí Việt Nam hoặc trang thông tin điện tử khác tại Việt Nam.

Thông tin, cập nhật

Đăng ký Tạp chí tham gia vào Scholar Hub

Phản hồi ý kiến về Scholar Hub

Bài viết, nội dung cập nhật

Chủ đề khoa học

Website liên kết

Hệ thống CSDL Khoa học & Công nghệ

Phần mềm kiểm tra trùng lặp Kiểm Tra Tài Liệu

Phần mềm xuất bản tạp chí điện tử VOJS

Nền tảng trắc nghiệm và đề thi đa lĩnh vực LetQA