A review: preprocessing techniques and data augmentation for sentiment analysis
Abstract
Keywords
References
Hussein DME-DM. A survey on sentiment analysis challenges. J King Saud Univ Eng Sci. 2018;30(4):330–8.
Medhat W, Hassan A, Korashy H. Sentiment analysis algorithms and applications: a survey. Ain Shams Eng J. 2014;5(4):1093–113.
Soleymani M, Garcia D, Jou B, Schuller B, Chang S-F, Pantic M. A survey of multimodal sentiment analysis. Image Vis Comput. 2017;65:3–14.
Symeonidis S, Effrosynidis D, Arampatzis A. A comparative evaluation of pre-processing techniques and their interactions for twitter sentiment analysis. Expert Syst Appl. 2018;110:298–310.
Effrosynidis D, Symeonidis S, Arampatzis A. A Comparison of Pre-processing Techniques for Twitter Sentiment Analysis. In: Kamps J, Tsakonas G, Manolopoulos Y, Iliadis L, Karydis I, editors. Research and Advanced Technology for Digital Libraries. TPDL. Lecture Notes in Computer Science, vol. 10450. Cham: Springer; 2017.
Fernández-Gavilanes M, Álvarez-López T, Juncal-Martínez J, Costa-Montenegro E, González-Castaño FJ. GTI: An Unsupervised Approach for Sentiment Analysis in Twitter. In: Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015), Denver; 2015. pp. 533–538.
Singh T, Kumari M. Role of text pre-processing in Twitter sentiment analysis. Procedia Comp Sci. 2016;89:549–54. https://doi.org/10.1016/j.procs.2016.06.095.
Jianqiang Z, Xiaolin G. Comparison research on text pre-processing methods on Twitter sentiment analysis. IEEE Access. 2017;5:2870–9. https://doi.org/10.1109/ACCESS.2017.2672677.
AL-Sharuee MT, Liu F, Pratama M. Sentiment analysis: an automatic contextual analysis and ensemble clustering approach and comparison. Data Knowl Eng. 2018;115:194–213.
Fernández-Gavilanes M, Juncal-Martínez J, García-Méndez S, Costa-Montenegro E, González-Castaño FJ. Creating emoji lexica from unsupervised sentiment analysis of their descriptions. Expert Syst Appl. 2018;103:74–91.
Wang H, Castanon JA. Sentiment expression via emoticons on social media. In: 2015 IEEE International Conference on Big Data (Big Data), Santa Clara; 2015. pp. 2404–2408. https://doi.org/10.1109/BigData.2015.7364034.
Sennrich R, Haddow B, Birch A. Improving Neural Machine Translation Models with Monolingual Data. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, vol. 1: Long Papers, Berlin; 2016. pp. 86–96. https://doi.org/10.18653/v1/P16-1009.
Sugiyama A, Yoshinaga N. Data augmentation using back-translation for context-aware neural machine translation. In: Proceedings of the Fourth Workshop on Discourse in Machine Translation (DiscoMT 2019), Hong Kong; 2019. pp. 35–44. https://doi.org/10.18653/v1/D19-6504.
Fadaee M, Bisazza A, Monz C. Data Augmentation for Low-Resource Neural Machine Translation. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, vol. 2: Short Papers, Vancouver; 2017. pp. 567–573. https://doi.org/10.18653/v1/P17-2090.
Kobayashi S. Contextual Augmentation: Data Augmentation by Words with Paradigmatic Relations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers), New Orleans; 2018. pp. 452–457.
Azad HK, Deepak A. Query expansion techniques for information retrieval: a survey. Inf Process Manage. 2019;56(5):1698–735.
Şahin GG, Steedman M. Data Augmentation via Dependency Tree Morphing for Low-Resource Languages. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels; 2018. pp. 5004–5009. https://doi.org/10.18653/v1/D18-1545.
Wei J, Zou K. EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification. In: ICLR 2019 - 7th International Conference on Learning Representations; 2019.
Kim K. An improved semi-supervised dimensionality reduction using feature weighting: application to sentiment analysis. Expert Syst Appl. 2018;109:49–65.
Nguyen-Thi BT, Duong HT. A Vietnamese sentiment analysis system based on multiple classifiers with enhancing lexicon features. In: Duong T, Vo NS, Nguyen L, Vien QT, Nguyen VD, editors. Industrial networks and intelligent systems, INISCOM, vol. 293. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering. Cham: Springer; 2019.
Nguyen-Nhat D-K, Duong H-T. One-Document Training for Vietnamese Sentiment Analysis. In: Tagarelli A, Tong H, editors. Computational Data and Social Networks, vol. 11917. Cham: Springer International Publishing; 2019. p. 189–200.
Xia R, Xu F, Zong C, Li Q, Qi Y, Li T. Dual sentiment analysis: considering two sides of one review. IEEE Trans Knowl Data Eng. 2015;27(8):2120–33. https://doi.org/10.1109/TKDE.2015.2407371.
Xia M, Kong X, Anastasopoulos A, Neubig G. Generalized Data Augmentation for Low-Resource Translation. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence; 2019. pp. 5786–5796. https://doi.org/10.18653/v1/P19-1579.