Mutual character dialogue generation with semi-supervised multitask learners and awareness

Ayesheh Ahrari Khalaf1, Aisha Hassan Abdalla Hashim1, Akeem Olowolayemo2
1Department of Electrical and Computer Engineering, Faculty of Engineering, International Islamic University Malaysia (IIUM), Kuala Lumpur, Malaysia
2Department of Computer Science, Faculty of Information and Communication Technology, International Islamic University Malaysia (IIUM), Kuala Lumpur, Malaysia

Abstract

Efforts to improve the friendliness and reliability of informal dialogue systems are ongoing. However, most research focuses solely on mimicking human-like answers. As a result, the dialogue system's awareness of its interlocutors remains largely unexplored. Meanwhile, cognitive science research reveals that awareness is a crucial indicator of an effective, high-quality informal conversation. This research aims to improve the quality of conversational generation by factoring awareness of the interlocutors into the design and training of the dialogue system model. To achieve this, the Generative Pre-Trained Transformer-2 (GPT-2) model was integrated into the Persona Perception (P2) Bot. To develop the model's understanding precisely, the P2 Bot was implemented using a transmitter–receiver-based structure. The P2 Bot leverages mutual persona awareness to improve the quality of customized dialogue generation. GPT-2 is a 1.5B-parameter transformer model that achieves state-of-the-art results in a zero-shot setting on seven of the eight language modeling datasets on which it was evaluated. Experiments with the proposed model on a sizable open-source dataset, PERSONA-CHAT, show improvements over state-of-the-art baselines in both automatic measures and human assessments. The model achieved a Hits@1 score of 82.2% on the original data and 68.8% on the revised data. In the human evaluation, the model scored an average of 2.66, indicating that its responses were coherent and informative. The result is a dialogue generation model with character and awareness that can communicate like an informative human expert. This study presents the integration of the GPT-2 model into a mutual persona perception dialogue generation model.
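
For readers unfamiliar with the Hits@1 metric reported above, the following is a minimal sketch of how it is commonly computed: the model scores a set of candidate responses for each dialogue context, and a hit is counted when the top-scored candidate is the gold response. The function name `hits_at_1` and the toy scores are illustrative assumptions, not the authors' evaluation code or data.

```python
from typing import Sequence


def hits_at_1(scores: Sequence[Sequence[float]], gold_indices: Sequence[int]) -> float:
    """Return the fraction of examples whose top-scored candidate is the gold one.

    scores[i][j] is the model score for candidate j of example i (hypothetical
    values here); gold_indices[i] is the index of the gold response.
    """
    if len(scores) != len(gold_indices):
        raise ValueError("scores and gold_indices must have the same length")
    if not scores:
        raise ValueError("at least one example is required")

    hits = 0
    for candidate_scores, gold in zip(scores, gold_indices):
        # Index of the highest-scoring candidate (first one wins on ties).
        best = max(range(len(candidate_scores)), key=candidate_scores.__getitem__)
        hits += int(best == gold)
    return hits / len(scores)


# Toy data: four contexts with three candidates each (made-up scores).
example_scores = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.2, 0.6],
    [0.4, 0.5, 0.1],
]
example_gold = [1, 0, 2, 0]

print(hits_at_1(example_scores, example_gold))  # 0.75
```

In this toy case three of the four top-ranked candidates match the gold response, so the score is 0.75. A reported value such as 82.2% would correspond to the same ratio computed over the full evaluation set.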
