Semantic-aware conditional variational autoencoder for one-to-many dialogue generation

Neural Computing and Applications - Volume 34 - Pages 13683-13695 - 2022
Ye Wang1, Jingbo Liao1, Hong Yu1, Jiaxu Leng1
1Chongqing Key Laboratory of Computational Intelligence (Chongqing University of Posts and Telecommunications), Chongqing, China

Abstract

Due to the pervasive semantic ambiguity of open-domain conversation, current deep dialogue models fail to detect potential emotional and action response features in the latent space, which leads to a general tendency to produce inaccurate and irrelevant sentences. To address this problem, we propose a semantic-aware conditional variational autoencoder that discriminates sentiment and action response features in the latent space for one-to-many open-domain dialogue generation. Specifically, explicit controllable variables produced by the proposed module are leveraged to create diverse conversational texts. These controllable variables constrain the distribution of the latent space, disentangling the latent features during training. Furthermore, this feature disentanglement improves dialogue generation in terms of deep learning interpretability and text quality, and also reveals how the latent features of different emotions shape the logic of text generation.
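The core mechanism the abstract describes — a controllable variable that conditions the prior over the latent space so that different emotion/action labels occupy separable regions — can be sketched in minimal form. This is an illustrative sketch only, not the paper's implementation: the function names, latent dimensionality, and the simple label-to-mean mapping are all assumptions made for clarity.

```python
import math
import random

# Hypothetical sketch: a discrete controllable variable (e.g. an emotion
# label) selects a label-specific Gaussian prior, so the KL term of the
# CVAE objective pushes each label's responses into its own latent region.

def conditional_prior(label_id, dim=2, shift=2.0):
    """Each discrete label gets its own prior mean (illustrative mapping)."""
    mu = [shift * label_id] * dim   # label-dependent mean
    logvar = [0.0] * dim            # unit variance
    return mu, logvar

def reparameterize(mu, logvar, rng):
    """z = mu + sigma * eps: the standard reparameterization trick."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, logvar)]

def kl_divergence(mu_q, logvar_q, mu_p, logvar_p):
    """KL(q || p) between diagonal Gaussians, summed over dimensions."""
    kl = 0.0
    for mq, lq, mp, lp in zip(mu_q, logvar_q, mu_p, logvar_p):
        kl += 0.5 * (lp - lq + (math.exp(lq) + (mq - mp) ** 2)
                     / math.exp(lp) - 1.0)
    return kl

rng = random.Random(0)
# A posterior inferred for a label-1 ("happy", say) response sits near
# the label-1 prior, so its KL is small there and large under label 0.
mu_q, lv_q = [1.9, 2.1], [0.0, 0.0]
print(kl_divergence(mu_q, lv_q, *conditional_prior(1)))  # small
print(kl_divergence(mu_q, lv_q, *conditional_prior(0)))  # large
z = reparameterize(mu_q, lv_q, rng)  # latent sample fed to the decoder
```

During training, minimizing this KL against the label-matched prior is what "constrains the distribution of the latent space" and yields the disentanglement the abstract refers to; at generation time, sampling z from a chosen label's prior steers the decoder toward that response style.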

References

Turing AM (1990) Computing machinery and intelligence. In: The philosophy of artificial intelligence, pp 40–66
Sutskever I, Vinyals O, Le QV (2014) Sequence to sequence learning with neural networks. In: Advances in neural information processing systems 27: annual conference on neural information processing systems 2014, pp 3104–3112
Yao K, Zhang L, Luo T, Du D, Wu Y (2021) Non-deterministic and emotional chatting machine: learning emotional conversation generation using conditional variational autoencoders. Neural Comput Appl 33(11):5581–5589
Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser L, Polosukhin I (2017) Attention is all you need. In: Advances in neural information processing systems 30: annual conference on neural information processing systems 2017, pp 5998–6008
Potamias RA, Siolas G, Stafylopatis A-G (2020) A transformer-based approach to irony and sarcasm detection. Neural Comput Appl 32(23):17309–17320
Kingma DP, Welling M (2014) Auto-encoding variational Bayes. In: 2nd international conference on learning representations
Chen M-Y, Chiang H-S, Sangaiah AK, Hsieh T-C (2020) Recurrent neural network with attention mechanism for language model. Neural Comput Appl 32(12):7915–7923
Weizenbaum J (1966) ELIZA: a computer program for the study of natural language communication between man and machine. Commun ACM 9(1):36–45
Young S, Gašić M, Thomson B, Williams JD (2013) POMDP-based statistical spoken dialog systems: a review. Proc IEEE 101(5):1160–1179
Vinyals O, Le Q (2015) A neural conversational model. Computer Science
Serban I, Sordoni A, Bengio Y, Courville A, Pineau J (2016) Building end-to-end dialogue systems using generative hierarchical neural network models. In: Proceedings of the AAAI conference on artificial intelligence, vol 30
Wang Y, Wang H, Zhang X, Chaspari T, Choe Y, Lu M (2019) An attention-aware bidirectional multi-residual recurrent neural network (ABMRNN): a study about better short-term text classification. In: ICASSP 2019-2019 IEEE international conference on acoustics, speech and signal processing (ICASSP), pp 3582–3586. IEEE
Li J, Galley M, Brockett C, Gao J, Dolan B (2015) A diversity-promoting objective function for neural conversation models. Computer Science
Galetzka F, Rose J, Schlangen D, Lehmann J (2021) Space efficient context encoding for non-task-oriented dialogue generation with graph attention transformer. In: Proceedings of the 59th annual meeting of the Association for Computational Linguistics and the 11th international joint conference on natural language processing (volume 1: long papers), pp 7028–7041
Ribeiro MT, Singh S, Guestrin C (2016) "Why should I trust you?" Explaining the predictions of any classifier. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, pp 1135–1144
Lundberg SM, Lee S-I (2017) A unified approach to interpreting model predictions. In: Proceedings of the 31st international conference on neural information processing systems, pp 4768–4777
Zhang Q, Yang Y, Ma H, Wu YN (2019) Interpreting CNNs via decision trees. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 6261–6270
Chen X, Duan Y, Houthooft R, Schulman J, Sutskever I, Abbeel P (2016) InfoGAN: interpretable representation learning by information maximizing generative adversarial nets. In: Proceedings of the 30th international conference on neural information processing systems, pp 2180–2188
Karras T, Laine S, Aila T (2019) A style-based generator architecture for generative adversarial networks. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 4401–4410
Chen H, Liu X, Yin D, Tang J (2017) A survey on dialogue systems: recent advances and new frontiers. ACM SIGKDD Explor Newsl 19(2):25–35
Li J, Monroe W, Ritter A, Galley M, Gao J, Jurafsky D (2016) Deep reinforcement learning for dialogue generation. In: Proceedings of EMNLP
Serban I, Sordoni A, Lowe R, Charlin L, Pineau J, Courville A, Bengio Y (2017) A hierarchical latent variable encoder–decoder model for generating dialogues. In: Proceedings of the AAAI conference on artificial intelligence, vol 31
Shang L, Lu Z, Li H (2015) Neural responding machine for short-text conversation. IEEE
Cho K, Merrienboer BV, Gulcehre C, Bahdanau D, Bougares F, Schwenk H, Bengio Y (2014) Learning phrase representations using RNN encoder–decoder for statistical machine translation. Computer Science
Higgins I, Matthey L, Pal A, Burgess C, Glorot X, Botvinick M, Mohamed S, Lerchner A (2017) beta-VAE: learning basic visual concepts with a constrained variational framework. In: 5th international conference on learning representations
Hu Z, Yang Z, Liang X, Salakhutdinov R, Xing EP (2017) Toward controlled generation of text. In: 34th international conference on machine learning, pp 1587–1596
Wiseman S, Shieber S, Rush A (2018) Learning neural templates for text generation. In: Proceedings of the 2018 conference on empirical methods in natural language processing. Association for Computational Linguistics, Brussels, Belgium, pp 3174–3187
Zhao T, Lee K, Eskenazi M (2018) Unsupervised discrete sentence representation learning for interpretable neural dialog generation. In: Proceedings of the 56th annual meeting of the Association for Computational Linguistics, pp 1098–1107
See A, Roller S, Kiela D, Weston J (2019) What makes a good conversation? How controllable attributes affect human judgments. In: Proceedings of the 2019 conference of the North American chapter of the Association for Computational Linguistics: human language technologies, volume 1 (long and short papers). Association for Computational Linguistics, Minneapolis, Minnesota, pp 1702–1723
Ficler J, Goldberg Y (2017) Controlling linguistic style aspects in neural language generation. In: Proceedings of the workshop on stylistic variation. Association for Computational Linguistics, Copenhagen, Denmark, pp 94–104
Li Z, Jiang X, Shang L, Liu Q (2019) Decomposable neural paraphrase generation. In: Proceedings of the 57th annual meeting of the Association for Computational Linguistics. Association for Computational Linguistics, Florence, Italy, pp 3403–3414
Sato M, Suzuki J, Shindo H, Matsumoto Y (2018) Interpretable adversarial perturbation in input embedding space for text. In: Proceedings of the 27th international joint conference on artificial intelligence and the 23rd European conference on artificial intelligence
Pang B, Wu YN (2021) Latent space energy-based model of symbol-vector coupling for text generation and classification. In: Proceedings of the 38th international conference on machine learning, vol 139, pp 8359–8370
Shi W, Zhou H, Miao N, Li L (2020) Dispersed exponential family mixture VAEs for interpretable text generation. In: Proceedings of the 37th international conference on machine learning, vol 119, pp 8840–8851
Chen C, Peng J, Wang F, Xu J, Wu H (2019) Generating multiple diverse responses with multi-mapping and posterior mapping selection. In: Proceedings of the twenty-eighth international joint conference on artificial intelligence, IJCAI 2019, Macao, China, August 10–16, 2019, pp 4918–4924
Bao S, He H, Wang F, Wu H, Wang H (2020) PLATO: pre-trained dialogue generation model with discrete latent variable. In: Proceedings of the 58th annual meeting of the Association for Computational Linguistics, ACL 2020, Online, July 5–10, 2020, pp 85–96
Cui Z, Li Y, Zhang J, Cui J, Wei C, Wang B (2020) Focus-constrained attention mechanism for CVAE-based response generation. In: Findings of the Association for Computational Linguistics: EMNLP 2020, Online Event, 16–20 November 2020, pp 2021–2030
Sohn K, Lee H, Yan X (2015) Learning structured output representation using deep conditional generative models. In: Advances in neural information processing systems 28: annual conference on neural information processing systems 2015, pp 3483–3491
Wang Y, Zhang X, Lu M, Wang H, Choe Y (2020) Attention augmentation with multi-residual in bidirectional LSTM. Neurocomputing 385:340–347
Li Y, Su H, Shen X, Li W, Cao Z, Niu S (2017) DailyDialog: a manually labelled multi-turn dialogue dataset. In: Proceedings of the eighth international joint conference on natural language processing, pp 986–995
Papineni K, Roukos S, Ward T, Zhu W (2002) BLEU: a method for automatic evaluation of machine translation. In: Proceedings of the 40th annual meeting of the Association for Computational Linguistics, pp 311–318
Lin CY (2004) ROUGE: a package for automatic evaluation of summaries. In: Text summarization branches out, pp 74–81
Banerjee S, Lavie A (2005) METEOR: an automatic metric for MT evaluation with improved correlation with human judgments. In: Proceedings of the workshop on intrinsic and extrinsic evaluation measures for machine translation and/or summarization@ACL 2005, pp 65–72
Li J, Galley M, Brockett C, Gao J, Dolan B (2016) A diversity-promoting objective function for neural conversation models. In: The 2016 conference of the North American chapter of the Association for Computational Linguistics: human language technologies, pp 110–119
Luong T, Pham H, Manning CD (2015) Effective approaches to attention-based neural machine translation. In: Proceedings of the 2015 conference on empirical methods in natural language processing, pp 1412–1421