Multilingual rule-based approach to number expansion: Framework, extensions and application

International Journal of Speech Technology - Volume 9 - Pages 29-40 - 2006

Marko Moberg¹, Kimmo Pärssinen¹

¹Nokia Technology Platforms, Tampere, Finland

Abstract

Developing language support for a multilingual text-to-speech system requires contributions from linguists and native speakers of each language. Text normalization, including number expansion, is one of the language-specific processing steps. Most available solutions do not support inflections and are not simple enough to be practical for non-technical developers. This paper presents a novel solution for expressing number expansion rules. The rule framework is fast, easy to use without a technical background, and truly multilingual, supporting gender-specific inflections of numerals. The rules require only a small amount of memory and are conveniently stored as software-independent language data. The same rule framework can be extended to carry out other text-normalization tasks, including processing of context-dependent abbreviations and interpretation of formatted text such as date and time expressions. The framework has been successfully used to create number, unit and time conversion rules for 42 languages. The created rules supported cardinal numbers from 0 to 999999 and 13 units such as m, km, h and min. Professional translators without a technical background generated the rules for most of the languages. The average numbers of rule lines for the number, unit and time rules were 87, 49 and 13, respectively. The average development time for a full rule set was seven hours per language. The most complex rule sets were for Slavonic languages, whereas the simplest were for Sino-Tibetan languages.
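To make the idea of number expansion rules stored as language data more concrete, the following is a minimal sketch for English cardinals from 0 to 999999. It is not the rule format described in the paper, which is not given in the abstract. The table layout `(base, divisor, template, joiner)`, the `{q}` placeholder and the function name `expand` are all assumptions made for illustration, and the sketch does not handle inflections.

```python
# Illustrative sketch only: the rule layout below is hypothetical and is
# NOT the rule syntax used by the paper's framework.

SMALL = ("zero one two three four five six seven eight nine ten eleven "
         "twelve thirteen fourteen fifteen sixteen seventeen eighteen "
         "nineteen").split()
TENS = "twenty thirty forty fifty sixty seventy eighty ninety".split()

# Each rule is (base, divisor, template, joiner):
#   base     - smallest number the rule applies to
#   divisor  - splits the number into a quotient and a remainder
#   template - text for the rule; "{q}" is replaced by the expanded quotient
#   joiner   - string placed before the expanded remainder, if non-zero
RULES = (
    [(i, 1, word, "") for i, word in enumerate(SMALL)]
    + [(20 + 10 * i, 10, word, "-") for i, word in enumerate(TENS)]
    + [(100, 100, "{q} hundred", " "), (1000, 1000, "{q} thousand", " ")]
)


def expand(n: int) -> str:
    """Expand a cardinal number in the range 0..999999 into English words."""
    if not 0 <= n <= 999_999:
        raise ValueError("only cardinals from 0 to 999999 are supported")
    # Pick the applicable rule with the largest base not exceeding n.
    _, divisor, template, joiner = max(
        (rule for rule in RULES if rule[0] <= n), key=lambda rule: rule[0]
    )
    quotient, remainder = divmod(n, divisor)
    # Only recurse on the quotient when the template actually asks for it.
    text = template.replace("{q}", expand(quotient)) if "{q}" in template else template
    if remainder:
        text += joiner + expand(remainder)
    return text


print(expand(42))
print(expand(999999))
```

Because the language-specific part lives entirely in `RULES`, supporting another language in this sketch would mean supplying a different table rather than changing `expand`. That mirrors, in a much simplified form, the separation of rules from software that the abstract describes.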

