BiGRU-ANN based hybrid architecture for intensified classification tasks with explainable AI
Abstract
Artificial Intelligence (AI) is increasingly employed in critical decision-making processes such as medical diagnosis, credit approval, and criminal justice. However, many AI models rely on complex algorithms that are difficult for humans to interpret, which raises concerns about accountability, bias, and whether the outcomes can be trusted. With growing demand for AI systems that are transparent, interpretable, and reliable, the field of Explainable AI (XAI) has attracted researchers' attention. This paper presents a robust hybrid architecture that combines Bidirectional Gated Recurrent Units (BiGRU) and Artificial Neural Networks (ANN) for text classification and sentiment analysis. Local Interpretable Model-agnostic Explanations (LIME) is applied to the proposed model to increase confidence in its outcomes. The proposed architecture is found to be effective for sentiment analysis of text and for classifying images of handwritten characters. It uses the BiGRU to model sequential dependencies in the data, while the ANN performs the final classification. Evaluations on both Bengali and English datasets show that the proposed architecture outperforms state-of-the-art models on various performance metrics while providing meaningful and interpretable explanations for its predictions. The model is suited to systems that need a computationally less demanding architecture while still securing decent accuracy.
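The division of labor described above (a BiGRU that reads the sequence in both directions, followed by an ANN that classifies) can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation: the class names `GRUCell` and `BiGRUANN`, the layer sizes, the random untrained weights, the omission of bias terms, and the choice to concatenate the two final hidden states before a single ReLU layer and a softmax output are all assumptions made for illustration. It uses only the Python standard library so that it runs anywhere.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Multiply matrix W (list of rows) by vector v.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def softmax(v):
    m = max(v)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in v]
    total = sum(exps)
    return [e / total for e in exps]


class GRUCell:
    """Hypothetical minimal GRU cell; biases omitted for brevity."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        # One input matrix W and one recurrent matrix U per gate:
        # z = update gate, r = reset gate, n = candidate state.
        self.W = {g: rand_matrix(hidden_size, input_size, rng) for g in "zrn"}
        self.U = {g: rand_matrix(hidden_size, hidden_size, rng) for g in "zrn"}

    def step(self, x, h):
        z = [sigmoid(a + b) for a, b in zip(matvec(self.W["z"], x), matvec(self.U["z"], h))]
        r = [sigmoid(a + b) for a, b in zip(matvec(self.W["r"], x), matvec(self.U["r"], h))]
        rh = [ri * hi for ri, hi in zip(r, h)]
        n = [math.tanh(a + b) for a, b in zip(matvec(self.W["n"], x), matvec(self.U["n"], rh))]
        # Blend the previous state and the candidate using the update gate.
        return [(1 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]

    def run(self, sequence):
        # Return the final hidden state after reading the whole sequence.
        h = [0.0] * self.hidden_size
        for x in sequence:
            h = self.step(x, h)
        return h


class BiGRUANN:
    """Hypothetical BiGRU encoder followed by a small dense classifier."""

    def __init__(self, input_size, gru_hidden, dense_hidden, num_classes, seed=0):
        rng = random.Random(seed)
        self.fwd = GRUCell(input_size, gru_hidden, rng)
        self.bwd = GRUCell(input_size, gru_hidden, rng)
        self.W1 = rand_matrix(dense_hidden, 2 * gru_hidden, rng)
        self.W2 = rand_matrix(num_classes, dense_hidden, rng)

    def predict_proba(self, sequence):
        # BiGRU: one pass left-to-right, one right-to-left, states concatenated.
        features = self.fwd.run(sequence) + self.bwd.run(sequence[::-1])
        # ANN: one ReLU hidden layer, then a softmax over the classes.
        hidden = [max(0.0, a) for a in matvec(self.W1, features)]
        return softmax(matvec(self.W2, hidden))


# Toy input: 5 time steps of 4-dimensional embeddings (random stand-ins).
data_rng = random.Random(1)
sequence = [[data_rng.uniform(-1, 1) for _ in range(4)] for _ in range(5)]
model = BiGRUANN(input_size=4, gru_hidden=8, dense_hidden=16, num_classes=3)
probs = model.predict_proba(sequence)
print(probs)
```

Because the weights are random, the printed probabilities carry no meaning; the sketch only shows how data flows from the recurrent encoder into the dense classifier. In practice the same structure would be built and trained with a deep learning framework.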