Multilingual rule-based approach to number expansion: Framework, extensions and application
Abstract
The language development of a multilingual text-to-speech system requires contributions from linguists and native speakers of each target language. Text normalization, including number expansion, is one of the language-specific processing steps. Most available solutions do not support inflections and are not simple enough to be practical for non-technical developers. This paper presents a novel solution for expressing number expansion rules. The rule framework is fast, easy to use without a technical background, and truly multilingual, supporting gender-specific inflections of numerals. The rules require only a small amount of memory and are conveniently stored as software-independent language data. The same rule framework can be extended to carry out other text-normalization tasks, including the processing of context-dependent abbreviations and the interpretation of formatted text such as date and time expressions. The framework has been successfully used to create number, unit and time conversion rules for 42 languages. The created rules supported cardinal numbers from 0 to 999999 and 13 units such as m, km, h and min. Professional translators without a technical background generated the rules for most of the languages. The average numbers of rule lines for the number, unit and time rules were 87, 49 and 13, respectively. The average development time for a full rule set was seven hours per language. The most complex rule sets were for Slavonic languages, whereas the simplest were for Sino-Tibetan languages.