The statistical approach to spoken language translation

H. Ney¹
¹Lehrstuhl für Informatik VI, Computer Science Department, RWTH-Aachen-University of Technology, Aachen, Germany

Abstract

This paper gives an overview of our work on statistical machine translation of spoken dialogues, in particular in the framework of the VERBMOBIL project. The goal of the VERBMOBIL project is the translation of spoken dialogues in the domains of appointment scheduling and travel planning. Starting with the Bayes decision rule, as in speech recognition, we show how the required probability distributions can be structured into three parts: the language model, the alignment model and the lexicon model. We describe the components of the system and report results on the VERBMOBIL task. The experience obtained in the VERBMOBIL project, in particular a large-scale end-to-end evaluation, showed that the statistical approach resulted in significantly lower error rates than three competing translation approaches: the sentence error rate was 29%, compared with 52% to 62% for the other translation approaches. Finally, we discuss the integrated approach to speech translation as opposed to the serial approach that is widely used nowadays.
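
The three-part structure named in the abstract can be written out as a short derivation. This is a sketch in the notation commonly used for source-channel translation models, not necessarily the paper's exact formulation: the symbols $f_1^J$ (source sentence of $J$ words), $e_1^I$ (target sentence of $I$ words) and $a_1^J$ (alignment, mapping each source position $j$ to a target position $a_j$) are assumptions introduced here, as is the first-order, HMM-style dependence of the alignment model.

```latex
% Bayes decision rule: choose the target sentence with the highest
% posterior probability given the source sentence.
\begin{align}
\hat{e}_1^I
  &= \operatorname*{arg\,max}_{e_1^I} \; Pr(e_1^I \mid f_1^J) \\
  &= \operatorname*{arg\,max}_{e_1^I} \;
     \underbrace{Pr(e_1^I)}_{\text{language model}} \cdot
     \underbrace{Pr(f_1^J \mid e_1^I)}_{\text{translation model}}
\end{align}

% The translation model is structured by a hidden alignment a_1^J
% (assumed first-order dependence) into alignment and lexicon models.
\begin{align}
Pr(f_1^J \mid e_1^I)
  &= \sum_{a_1^J} Pr(f_1^J, a_1^J \mid e_1^I) \\
  &\approx \sum_{a_1^J} \prod_{j=1}^{J}
     \underbrace{p(a_j \mid a_{j-1}, I)}_{\text{alignment model}} \cdot
     \underbrace{p(f_j \mid e_{a_j})}_{\text{lexicon model}}
\end{align}
```

The second step of the decision rule follows because $Pr(f_1^J)$ does not depend on $e_1^I$ and can be dropped from the maximization. Under these assumptions, the search problem is the maximization over $e_1^I$, and the three factors are the distributions that have to be estimated from training data.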

Keywords

#Natural languages #Hidden Markov models #Probability distribution #Search problems #Performance loss
