An empirical study of sentiment analysis utilizing machine learning and deep learning algorithms

Journal of Computational Social Science - Pages 1-17 - 2023

Betul Erkantarci¹, Gokhan Bakal¹

¹Department of Computer Engineering, Abdullah Gul University, Kayseri, Turkey

Abstract

Text classification is one of the most studied tasks in text mining, with applications in domains including medicine, social media, and academia. Sentiment analysis, a sub-problem of text classification, has been widely investigated as a way to classify textual content that is often opinion-based. In particular, user reviews and experiential feedback on products or services have served as fundamental data sources for sentiment analysis efforts. With rapid technological advances, social media platforms such as Twitter, Facebook, and Reddit have become central opinion-sharing media since the early 2000s. In this work, we build various machine learning models to address the sentiment analysis problem on a Reddit comments dataset. The experimental models we constructed achieve F1 scores in the range of 73–76%. We present comparative performance scores obtained by traditional machine learning and deep learning models and discuss the results.
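Since the results are reported as F1 scores, a minimal sketch of how such a score can be computed may help readers interpret the 73–76% range. The abstract does not say which averaging scheme was used, so the macro-average below is an assumption, and the function names, label names, and toy predictions are hypothetical illustrations rather than the authors' code or data.

```python
from collections import Counter


def f1_per_class(y_true, y_pred):
    """Return {label: F1} from paired gold and predicted labels."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for gold, pred in zip(y_true, y_pred):
        if gold == pred:
            tp[gold] += 1   # correct prediction for this class
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[gold] += 1   # gold class gains a false negative
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        # F1 = 2TP / (2TP + FP + FN), the harmonic mean of precision and recall
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (assumed averaging scheme)."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Hypothetical toy labels for a three-class sentiment task.
y_true = ["pos", "pos", "neg", "neg", "neu", "neu"]
y_pred = ["pos", "neg", "neg", "neg", "neu", "pos"]

print(f1_per_class(y_true, y_pred))
print(round(macro_f1(y_true, y_pred), 4))
```

Macro-averaging weights every class equally regardless of its frequency. A weighted or micro average would give different numbers on an imbalanced dataset, so the averaging scheme matters when comparing scores across studies.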

