Overview of the seventh Dialog System Technology Challenge: DSTC7

Computer Speech & Language - Volume 62 - Page 101068 - 2020
Luis Fernando D’Haro1, Koichiro Yoshino2, Chiori Hori3, Tim K. Marks3, Lazaros Polymenakos4, Jonathan K. Kummerfeld5, Alan Yuille6, Xiang Gao6
1Speech Technology Group. Information Processing and Telecommunications Center (IPTC), ETSI Telecomunicación Universidad Politécnica de Madrid, Ciudad Universitaria, Av. Complutense, 30, Madrid 28040, Spain
2Nara Institute of Science and Technology, Ikoma, Nara 6300192, Japan
3Mitsubishi Electric Research Laboratories (MERL), 201 Broadway, Cambridge, MA, 02139, USA
4Alexa Dialog Science, 101 Main Street, Cambridge, MA, 02142, USA
5University of Michigan, 2260 Hayward Street, Ann Arbor, MI 48109, USA
6Microsoft Research, One Microsoft Way, Redmond, WA 98052, USA

References

Alamri, 2019, Audio visual scene-aware dialog

Alamri, H., Cartillier, V., Lopes, R.G., Das, A., Wang, J., Essa, I., et al., 2018a. Audio visual scene-aware dialog (AVSD) challenge at DSTC7. arXiv:1806.00525.

Alamri, 2018, Audio visual scene-aware dialog (AVSD) track for natural language generation in DSTC7

Antol, 2015, VQA: visual question answering, 2425

Bahdanau, 2015, Neural machine translation by jointly learning to align and translate

Banchs, 2015, Adequacy–fluency metrics: evaluating MT in the continuous space model framework, IEEE/ACM Trans. Audio Speech Lang. Process., 23, 472, 10.1109/TASLP.2015.2405751

Carreira, 2017, Quo vadis, action recognition? A new model and the kinetics dataset, 6299

Chen, 2017, Enhanced LSTM for natural language inference, 1, 1657

Chen, 2019, Sequential attention-based network for noetic end-to-end response selection

Das, 2016, Visual dialog, CoRR

Das, 2017, Learning cooperative visual dialog agents with deep reinforcement learning, 2951

D’Haro, 2019, Automatic evaluation of end-to-end dialog systems with adequacy-fluency metrics, Comput. Speech Lang., 55, 200, 10.1016/j.csl.2018.12.004

Devlin, J., Chang, M.-W., Lee, K., Toutanova, K., 2019. BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pp. 4171-4186. doi: 10.18653/v1/N19-1423.

Dinan, 2020, The Second Conversational Intelligence Challenge (ConvAI2), 10.1007/978-3-030-29135-8_7

Doddington, 2002, Automatic evaluation of machine translation quality using n-gram co-occurrence statistics, 138

Ganhotra, 2019, Knowledge-incorporating ESIM models for response selection in retrieval-based dialog systems

Gao, 2019, Jointly optimizing diversity and relevance in neural response generation

Ghazvininejad, 2018, A knowledge-grounded neural conversation model, AAAI, 10.1609/aaai.v32i1.11977

Gu, 2016, Incorporating copying mechanism in sequence-to-sequence learning, 1631

He, 2017, Generating natural answers by incorporating copying and retrieving mechanisms in sequence-to-sequence learning, 1, 199

Henderson, 2014, The second dialog state tracking challenge, 263

Henderson, 2014, The third dialog state tracking challenge, 324

Hershey, 2017, CNN architectures for large-scale audio classification, 131

Higashinaka, 2019, Overview of the dialogue breakdown detection challenge 4

Hori, 2018, End-to-end audio visual scene-aware dialog using multimodal attention-based video features, 2352

Hori, 2017, End-to-end conversation modeling track in DSTC6

Hori, 2019, Joint student-teacher learning for audio-visual scene-aware dialog, 1886

Hori, 2017, Attention-based multimodal fusion for video description, 4193

Hori, 2019, Overview of the sixth dialog system technology challenge: DSTC6, Comput. Speech Lang., 55, 1, 10.1016/j.csl.2018.09.004

Jiang, 2017, Understanding task design trade-offs in crowdsourced paraphrase collection, 103

Khatri, C., Hedayatnia, B., Venkatesh, A., Nunn, J., Pan, Y., Liu, Q., et al., 2018. Advancing the state of the art in open domain dialog systems through the Alexa prize. arXiv:1812.10757.

Kim, 2017, The fourth dialog state tracking challenge, 435

Kim, 2016, The fifth dialog state tracking challenge, 511

Kingma, 2013, Auto-encoding variational bayes, CoRR, abs/1312.6114

Kumar, 2019, Context, attention and audio feature explorations for audio visual scene-aware dialog

Kummerfeld, 2019, Slate: a super-lightweight annotation tool for experts

Kummerfeld, J.K., Gouravajhala, S.R., Peper, J., Athreya, V., Gunasekara, C., Ganhotra, J., et al., 2018. Analyzing assumptions in conversation disentanglement research through the lens of a new dataset and model. arXiv:1810.11118.

Lavie, 2007, METEOR: an automatic metric for MT evaluation with high levels of correlation with human judgments, 228

Le, 2019, End-to-end multimodal dialog systems with hierarchical multimodal attention on video features

Li, 2016, A diversity-promoting objective function for neural conversation models, 110

Lin, 2019, Entropy-enhanced multimodal attention model for scene-aware dialogue generation

Lowe, 2015, The UBUNTU dialogue corpus: a large dataset for research in unstructured multi-turn dialogue systems, 285

Nguyen, 2019, From film to video: multi-turn question answering with multi-modal context

Papineni, 2002, Bleu: a method for automatic evaluation of machine translation, 311

Pasunuru, 2019, DSTC7-AVSD: scene-aware video-dialogue systems with dual attention

Perez, 2017, Dialog system technology challenge 6 overview of track 1 - end-to-end goal-oriented dialog learning

Peters, 2018, Deep contextualized word representations, 2227

Qin, 2019, Conversing by reading: contentful neural conversation with on-demand machine reading, 5427

Ritter, 2011, Data-driven response generation in social media, 583

Ruder, 2019, Transfer learning in natural language processing, 15

Sanabria, 2019, CMU sinbad submission for the DSTC7 AVSD challenge

See, 2017, Get to the point: summarization with pointer-generator networks, 1073

Serban, 2018, A survey of available corpora for building data-driven dialogue systems: the journal version, Dialogue Discourse, 9, 1, 10.5087/dad.2018.101

Serban, 2016, Building end-to-end dialogue systems using generative hierarchical neural network models, 3776

Shang, 2015, Neural responding machine for short-text conversation, 1577

Sharma, 2017, Relevance of unsupervised metrics in task-oriented dialogue for evaluating natural language generation, CoRR, abs/1706.09799

Sigurdsson, G.A., Varol, G., Wang, X., Laptev, I., Farhadi, A., Gupta, A., 2016. Hollywood in homes: crowdsourcing data collection for activity understanding. European Conference on Computer Vision. arXiv:1604.01753.

Sordoni, 2015, A neural network approach to context-sensitive generation of conversational responses, 196

Sukhbaatar, 2015, End-to-end memory networks, 2440

Vedantam, 2015, CIDEr: consensus-based image description evaluation, 4566

Vinyals, 2015, A neural conversational model, ICML

Weston, 2015, Memory networks, ICLR

Williams, 2013, The dialog state tracking challenge, 404

Yeh, 2019, Reactive multi-stage feature fusion for multimodal dialogue modeling

Zhuang, 2019, Investigation of attention-based multimodal fusion and maximum mutual information objective for DSTC7 track3