Deep learning approaches for Arabic sentiment analysis

Social Network Analysis and Mining - Volume 9 - Pages 1-12 - 2019
Ammar Mohammed¹, Rania Kora¹
¹Department of Computer Science, Faculty of Graduate Studies for Statistical Research, Cairo University, Cairo, Egypt

Abstract

Social media are an excellent source of information, offering opinions, thoughts and insights on a wide range of important topics. Sentiment analysis has become a hot research topic because of its importance in making decisions based on opinions derived from analyzing users' content on social media. Although Arabic is one of the widely spoken languages used for content sharing on social media, sentiment analysis of Arabic content is limited by several challenges, including the morphological structure of the language, the variety of dialects and the lack of appropriate corpora. Hence, research in Arabic sentiment analysis has grown slowly compared with other languages such as English. The contribution of this paper is twofold. First, we introduce a corpus of forty thousand labeled Arabic tweets spanning several topics. Second, we present three deep learning models, namely CNN, LSTM and RCNN, for Arabic sentiment analysis. Using word embeddings, we validate the performance of the three models on the proposed corpus. The experimental results indicate that LSTM, with an average accuracy of 81.31%, outperforms CNN and RCNN. In addition, applying data augmentation to the corpus increases LSTM accuracy by 8.3%.
