The IBM speech-to-speech translation system for smartphone: Improvements for resource-constrained tasks

Computer Speech & Language - Volume 27 - Pages 592-618 - 2013
Bowen Zhou, Xiaodong Cui, Songfang Huang, Martin Cmejrek, Wei Zhang, Jian Xue, Jia Cui, Bing Xiang, Gregg Daggett, Upendra Chaudhari, Sameer Maskey, Etienne Marcheret
IBM T. J. Watson Research Center, Yorktown Heights, NY 10598, United States
