BiGRU-ANN based hybrid architecture for intensified classification tasks with explainable AI

International Journal of Information Technology - Volume 15 - Pages 4211-4221 - 2023
Sovon Chakraborty¹, Muhammad Borhan Uddin Talukder², Mohammad Mehadi Hasan³, Jannatun Noor⁴, Jia Uddin⁵
¹ Department of Computer Science and Engineering, University of Liberal Arts Bangladesh, Dhaka, Bangladesh
² Department of Computer Science and Engineering, Daffodil International University, Dhaka, Bangladesh
³ Department of Computer Science and Engineering, European University of Bangladesh, Dhaka, Bangladesh
⁴ Department of Computer Science and Engineering, Brac University, Dhaka, Bangladesh
⁵ AI and Big Data Department, Woosong University, Daejeon, Korea

Abstract

Artificial Intelligence (AI) is increasingly employed in critical decision-making processes such as medical diagnosis, credit approval, and criminal justice. However, many AI models rely on complex algorithms that are difficult for humans to interpret, which raises concerns about accountability, bias, and whether the outcomes can be trusted. With the growing demand for AI systems that are transparent, interpretable, and reliable, the field of Explainable AI (XAI) has gained the attention of researchers. This paper presents a robust hybrid architecture that combines Bidirectional Gated Recurrent Units (BiGRU) and Artificial Neural Networks (ANN) for text classification and sentiment analysis. Local Interpretable Model-agnostic Explanations (LIME) is applied to the proposed model to increase confidence in its outcomes. The proposed architecture is found to be effective both for sentiment analysis of texts and for classifying images of handwritten characters. It uses the BiGRU to model the sequential dependencies in the data, while the ANN performs the final classification. Evaluations on both Bengali and English datasets show that the proposed architecture outperforms state-of-the-art models on various performance metrics while providing meaningful and interpretable explanations for its predictions. The model is suited to systems that need a computationally less demanding architecture while still achieving reasonable accuracy.
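To make the division of labour between the two components concrete, the following is a minimal NumPy sketch of a forward pass through a BiGRU encoder followed by a small dense (ANN) classifier. It is not the authors' implementation: the layer sizes, the use of the final hidden state from each direction, the ReLU hidden layer, the softmax output, and all function names are assumptions chosen for illustration, and the weights are random rather than trained.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_gru(rng, input_dim, hidden_dim):
    """Random weights for one GRU direction (hypothetical helper)."""
    def mat(rows, cols):
        return rng.normal(0.0, 0.1, size=(rows, cols))
    params = {}
    for gate in ("z", "r", "h"):  # update gate, reset gate, candidate state
        params["W" + gate] = mat(input_dim, hidden_dim)
        params["U" + gate] = mat(hidden_dim, hidden_dim)
        params["b" + gate] = np.zeros(hidden_dim)
    return params

def gru_run(p, xs):
    """Run a GRU over xs of shape (T, input_dim); return the last hidden state."""
    h = np.zeros(p["bz"].shape[0])
    for x in xs:
        z = sigmoid(x @ p["Wz"] + h @ p["Uz"] + p["bz"])
        r = sigmoid(x @ p["Wr"] + h @ p["Ur"] + p["br"])
        h_tilde = np.tanh(x @ p["Wh"] + (r * h) @ p["Uh"] + p["bh"])
        # One common convention for the update; some libraries swap z and 1 - z.
        h = (1.0 - z) * h + z * h_tilde
    return h

def bigru_ann_forward(xs, fwd, bwd, W1, b1, W2, b2):
    """BiGRU encodes the sequence; a dense head produces class probabilities."""
    h_forward = gru_run(fwd, xs)         # reads the sequence left to right
    h_backward = gru_run(bwd, xs[::-1])  # reads the sequence right to left
    features = np.concatenate([h_forward, h_backward])
    hidden = np.maximum(0.0, features @ W1 + b1)  # assumed ReLU hidden layer
    logits = hidden @ W2 + b2
    exp = np.exp(logits - logits.max())           # numerically stable softmax
    return exp / exp.sum()

# Illustrative sizes only: 6 time steps, 8-dim embeddings, 3 classes.
rng = np.random.default_rng(0)
seq_len, embed_dim, hidden_dim, dense_dim, n_classes = 6, 8, 5, 16, 3
xs = rng.normal(size=(seq_len, embed_dim))  # stand-in for embedded tokens
fwd = init_gru(rng, embed_dim, hidden_dim)
bwd = init_gru(rng, embed_dim, hidden_dim)
W1 = rng.normal(0.0, 0.1, size=(2 * hidden_dim, dense_dim))
b1 = np.zeros(dense_dim)
W2 = rng.normal(0.0, 0.1, size=(dense_dim, n_classes))
b2 = np.zeros(n_classes)

probs = bigru_ann_forward(xs, fwd, bwd, W1, b1, W2, b2)
print(probs)
```

In this sketch the recurrent layers only summarise the sequence, and the dense layers alone map that summary to class probabilities. A model-agnostic explainer such as LIME would treat the whole `bigru_ann_forward` function as a black box that maps inputs to probabilities.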
