Web-based tools and methods for rapid pronunciation dictionary creation

Speech Communication - Volume 56 - Pages 101-118 - 2014
Tim Schlippe (1), Sebastian Ochs (1), Tanja Schultz (1)
(1) Institute for Anthropomatics, Cognitive Systems Lab (CSL), Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany

References

Schultz, T., Black, A.W., Badaskar, S., Hornyak, M., Kominek, J., 2007. SPICE: web-based tools for rapid language adaptation in speech processing systems. In: Interspeech.
Martirosian, O., Davel, M., 2007. Error analysis of a public domain pronunciation dictionary. In: PRASA.
Vu, N.T., Schlippe, T., Kraus, F., Schultz, T., 2010. Rapid bootstrapping of five Eastern European languages using the rapid language adaptation toolkit. In: Interspeech.
IPA, 1999. Handbook of the International Phonetic Association: a guide to the use of the International Phonetic Alphabet. Cambridge University Press.
Zhu, X., Rosenfeld, R., 2001. Improving trigram language modeling with the World Wide Web. In: ICASSP.
Black, A.W., Lenzo, K., Pagel, V., 1998. Issues in building general letter to sound rules. In: ESCA Workshop on Speech Synthesis.
Kominek, J., Black, A.W., 2006. Learning pronunciation dictionaries: language complexity and word selection strategies. In: HLT Conference of the NAACL.
Davel, M., Barnard, E., 2004. The efficient generation of pronunciation dictionaries: human factors during bootstrapping. In: ICSLP.
Ghoshal, A., Jansche, M., Khudanpur, S., Riley, M., Ulinski, M., 2009. Web-derived pronunciations. In: ICASSP.
Can, D., Cooper, E., Ghoshal, A., Jansche, M., Khudanpur, S., Ramabhadran, B., Riley, M., Saraclar, M., Sethy, A., Ulinski, M., White, C., 2009. Web derived pronunciations for spoken term detection. In: 32nd Annual International ACM SIGIR Conference.
Schlippe, T., Ochs, S., Schultz, T., 2010. Wiktionary as a source for automatic pronunciation extraction. In: Interspeech.
Schlippe, T., Ochs, S., Schultz, T., 2012. Grapheme-to-phoneme model generation for Indo-European languages. In: ICASSP.
Kaplan, R.M., Kay, M., 1994. Regular models of phonological rule systems. In: Computational Linguistics.
Besling, S., 1994. Heuristical and statistical methods for grapheme-to-phoneme conversion. In: Konvens.
Kneser, R., 2000. Grapheme-to-phoneme study. Tech. Rep. WYT-P4091/00002, Philips Speech Processing, Germany.
Chen, S.F., 2003. Conditional and joint models for grapheme-to-phoneme conversion. In: Eurospeech.
Vozila, P., Adams, J., Lobacheva, Y., Ryan, T., 2003. Grapheme to phoneme conversion and dictionary verification using graphonemes. In: Eurospeech.
Jiampojamarn, S., Kondrak, G., Sherif, T., 2007. Applying many-to-many alignments and hidden Markov models to letter-to-phoneme conversion. In: HLT.
Novak, J., 2011. Phonetisaurus: a WFST-driven phoneticizer. <http://code.google.com/p/phonetisaurus/>.
Novak, J., Minematsu, N., Hirose, K., 2012. WFST-based grapheme-to-phoneme conversion: open source tools for alignment, model-building and decoding. In: International Workshop on Finite State Methods and Natural Language Processing.
Gerosa, M., Federico, M., 2009. Coping with out-of-vocabulary words: open versus huge vocabulary ASR. In: Proceedings of the 2009 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP ’09, ISBN 978-1-4244-2353-8. http://dx.doi.org/10.1109/ICASSP.2009.4960583.
Laurent, A., Deléglise, P., Meignier, S., 2009. Grapheme to phoneme conversion using an SMT system. In: Interspeech.
Karanasou, P., Lamel, L., 2010. Comparing SMT methods for automatic generation of pronunciation variants. In: Proceedings of the 7th International Conference on Advances in Natural Language Processing, IceTAL’10, ISBN 3-642-14769-0, 978-3-642-14769-2.
Bisani, M., Ney, H., 2008. Joint-sequence models for grapheme-to-phoneme conversion. Speech Communication.
Hahn, S., Vozila, P., Bisani, M., 2012. Comparison of grapheme-to-phoneme methods on large pronunciation dictionaries and LVCSR tasks. In: Interspeech.
Davel, M., Martirosian, O., 2009. Pronunciation dictionary development in resource-scarce environments. In: Interspeech.
Davel, M., de Wet, F., 2010. Verifying pronunciation dictionaries using conflict analysis. In: Interspeech.
Wolff, M., Eichner, M., Hoffmann, R., 2002. Measuring the quality of pronunciation dictionaries. In: PMLA.
Davel, M., Barnard, E., 2006. Developing consistent pronunciation models for phonemic variants. In: Interspeech.
Kominek, J., 2009. TTS from zero – building synthetic voices for new languages. Doctoral Thesis.
Schultz, T., 2002. GlobalPhone: a multilingual speech and text database developed at Karlsruhe University. In: ICSLP.
Schlippe, T., Ochs, S., Schultz, T., 2012. Automatic error recovery for pronunciation dictionaries. In: Interspeech.
Wells, 1997. SAMPA computer readable phonetic alphabet.
Wikimedia, 2012. List of Wiktionary editions, ranked by article count. http://meta.wikimedia.org/wiki/ListofWiktionaries.
Llitjos, A.F., Black, A.W., 2002. Evaluation and collection of proper name pronunciations online. In: LREC.
Kanthak, S., Ney, H., 2002. Context-dependent acoustic modeling using graphemes for large vocabulary speech recognition. In: ICASSP.
Killer, M., Stueker, S., Schultz, T., 2003. Grapheme based speech recognition. In: Eurospeech.
Stueker, S., Schultz, T., 2004. A grapheme based speech recognition system for Russian. In: SPECOM.
Schlippe, T., Djomgang, E.G.K., Vu, N.T., Ochs, S., Schultz, T., 2012. Hausa large vocabulary continuous speech recognition. In: SLT-U.