Using different acoustic, lexical and language modeling units for ASR of an under-resourced language – Amharic

Speech Communication, Volume 56, Pages 181-194, 2014

Martha Yifiru Tachbelie¹, Solomon Teferra Abate¹, Laurent Besacier²

¹School of Information Sciences, Addis Ababa University, Addis Ababa, Ethiopia

²Laboratoire d’informatique de Grenoble (LIG), Université Joseph Fourier, Grenoble 1, France

References

Abate, Solomon Teferra, 2006. Automatic Speech Recognition for Amharic, Ph.D. thesis, University of Hamburg, Germany.
Abate, Solomon Teferra, Menzel, Wolfgang, 2007a. Syllable-based speech recognition for Amharic. In: Proceedings of the 2007 Workshop on Computational Approaches to Semitic Languages: Common Issues and Resources, Prague, Czech Republic, pp. 33–40.
Abate, Solomon Teferra, Menzel, Wolfgang, 2007. Automatic Speech Recognition for an Under-Resourced Language – Amharic. In: Proceedings of INTERSPEECH 2007, pp. 1541–1544.
Abate, Solomon Teferra, Menzel, Wolfgang, Tafila, Bairu, 2005. An Amharic Speech Corpus for Large Vocabulary Continuous Speech Recognition. In: Proceedings of INTERSPEECH-2005, Lisbon, Portugal, pp. 1601–1604.
Appleyard, 1995
Azim, 2008, Syllable-based automatic Arabic speech recognition in noisy-telephone Channel, WSEAS Transactions on Signal Processing, 4, 211
Bazzi, Issam, 2002. Modelling Out-of-Vocabulary Words for Robust Speech Recognition. Ph.D. Thesis, Massachusetts Institute of Technology, 2002.
Bender, 1976
Berhanu, Solomon, 2001. Isolated Amharic Consonant-Vowel Syllable Recognition: An Experiment Using the Hidden Markov Model. M.Sc. Thesis, School of Information Studies for Africa, Addis Ababa University, Ethiopia.
Berment, V., 2004. Méthodes pour informatiser les langues et les groupes de langues ‘peu dotées’. Ph.D. thesis, Université Joseph Fourier, Grenoble, France.
Besacier, L., Le, V.-B., Boitet, C., Berment, V., 2006. ASR and translation for under-resourced languages. In: Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2006, vol. 5, 2006, pp. 1221–1224.
Carki, Kenan, Geutner, Petra, Schultz, Tanja, 2000. Turkish LVCSR: towards better speech recognition for agglutinative languages. In: IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 3, pp. 1563–1566.
Creutz, Mathias, Lagus, Krista, 2005. Unsupervised Morpheme Segmentation and Morphology Induction from Text Corpora Using Morfessor 1.1. Tech. Rep. A81, Neural Networks Research Center, Helsinki University of Technology.
El-Desoky, Amr, Gollan, Christian, Rybach, David, Schlüter, Ralf, Ney, Hermann, 2009. Investigating the use of morphological decomposition and diacritization for improving Arabic LVCSR. In: Proceedings of Interspeech-2009, pp. 2679–2682.
Ethnologue, 2004. <http://www.ethnologue.com/show_language.asp?code=AMH>.
Gales, Mark, Woodland, Phil, 2006. Recent Progress in Large Vocabulary Continuous Speech Recognition: An HTK Perspective.
Ganapathiraju, 2001, Syllable-based large vocabulary continuous speech recognition, IEEE Transactions on Speech and Audio Processing, 9, 358, 10.1109/89.917681
Gelas, Hadrien, Abate, Solomon Teferra, Besacier, Laurent, Pellegrino, F., 2011. Quality assessment of crowdsourcing transcriptions for African languages. In: Proceedings of INTERSPEECH, Florence, Italy.
Geutner, Petra, 1995. Using morphology towards better large vocabulary speech recognition systems. In: Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing, vol. I, pp. 445–448.
Girmaw, Molalgne, 2004. An Automatic Speech Recognition System for Amharic, M.Sc. Thesis, Dept. of Signals, Sensors and Systems, Royal Institute of Technology, Stockholm, Sweden.
Gruenstein, Alexander, McGraw, Ian, Sutherland, Andrew, 2009. A self-transcribing speech corpus: collecting continuous speech with an online educational game. In: Proceeding of SLaTE, Brighton, UK.
Mariam, Sebsibe H., Kishore, S.P., Black, Alan W., Kumar, Rohit, Sangal, Rajeev, 2004. Unit selection voice for Amharic using Festvox. In: Proceeding of the 5th ISCA Speech Synthesis Workshop, Pittsburgh, PA, pp. 103–107.
Haile, 1995, Is syllable weight distinction relevant for Amharic stress assignment?, Journal of Ethiopian Studies, 28, 15
Hämäläinen, Annika, Boves, Lou, de Veth, Johan, 2005. Syllable length acoustic units in large-vocabulary continuous speech recognition. In: Proceedings of SPECOM 2005, pp. 499–502.
Hirsimäki, Teemu, Creutz, Mathias, Siivola, Vesa, Kurimo, Mikko, 2005. Morphologically motivated language models in speech recognition. In: Proceedings of the International and Interdisciplinary Conference on Adaptive Knowledge Representation and Reasoning, pp. 121–126.
Ircing, Pavel, Krbec, Pavel, Hajic, Jan, Psutka, Josef, Khudanpur, Sanjeev, Jelinek, Frederick, Byrne, William, 2001. On large vocabulary continuous speech recognition of highly inflectional language – Czech. In: Proceeding of INTERSPEECH’01, pp. 487–489.
Kirchhoff, Katrin, Bilmes, Jeff, Henderson, John, Schwartz, Richard, Noamany, Mohamed, Schone, Pat, Ji, Gang, Das, Sourin, Egan, Melissa, He, Feng, Vergyri, Dimitra, Liu, Daben, Duta, Nicolae, 2002. Novel Speech Recognition Models for Arabic. Tech. Rep., Johns-Hopkins University Summer Research Workshop.
Leslau, 2000
Liu, Xunying, Gales, Mark John Francis, Hieronymus, Jim L., Woodland, Philip C., 2011. Investigation of acoustic units for LVCSR systems. In: ICASSP’11, pp. 4872–4875.
Marge, Matthew, Banerjee, Satanjeev, Rudnicky, Alexander I., 2010. Using the Amazon mechanical Turk to transcribe and annotate meeting speech for extractive summarization. In: Proceedings of NAACL HLT.
McGraw, Ian, Gruenstein, Alexander, Sutherland, Andrew, 2009. A self-labeling speech corpus: collecting spoken words with an online educational game. In: Proceedings of INTERSPEECH.
Mohri, Mehryar, Pereira, Fernando, Riley, Michael, 1998. A rational design for a weighted finite-state transducer library. In: Derick Wood and Sheng Yu (Eds.), Automata Implementation, vol. 1436, Lecture Notes in Computer Science, Springer Berlin/Heidelberg, pp. 144–158.
Novotney, Scott, Callison-Burch, Chris, 2010. Cheap, fast and good enough: automatic speech recognition with non-expert transcription. In: Proceedings of NAACL HLT, pp. 207–215.
Pellegrini, Thomas, Lamel, Lori, 2006. Investigating automatic decomposition for ASR in less represented languages. In: Proceedings of INTERSPEECH 2006.
Pellegrini, Thomas, Lamel, Lori, 2006. Experimental detection of vowel pronunciation variants in Amharic. In: Proceedings of LREC.
Pellegrini, Thomas, Lamel, Lori, 2007. Using phonetic features in unsupervised word decompounding for ASR with application to a less-represented language. In: Proceedings of INTERSPEECH 2007, pp. 1797–1800.
Pellegrini, 2009, Automatic word decompounding for ASR in a morphologically rich language: application to Amharic, IEEE Transactions on Audio, Speech, and Language Processing, 17, 863, 10.1109/TASL.2009.2022295
Seid, Hussien, Gambäck, Björn, 2005. A speaker independent continuous speech recognizer for Amharic. In: Proceedings of INTERSPEECH 2005, 9th European Conference on Speech Communication and Technology, Lisbon, Portugal, pp. 3349–3352.
Seifu, Zegaye, 2003. HMM Based Large Vocabulary, Speaker Independent, Continuous Amharic Speech Recognizer. M.Sc. Thesis, School of Information Studies for Africa, Addis Ababa University, Ethiopia.
Sethy, Abhinav, Narayanan, Shrikanth, Parthasarthy, S., 2002. A syllable based approach for improved recognition of spoken names. In: Proceeding of the 5th ISCA Pronunciation Modeling Workshop, pp. 30–35.
Seyoum, Mulugeta, 2001. The Syllable Structure and Syllabification in Amharic. Masters thesis, Department of Linguistics, Trondheim, Norway.
Siivola, Vesa, Hirsimäki, Teemu, Creutz, Mathias, Kurimo, Mikko. Unlimited vocabulary speech recognition based on morphs discovered in an unsupervised manner. In: Proceedings of Eurospeech, pp. 2293–2296.
Snow, Rion, O’Connor, Brendan, Jurafsky, Daniel, Ng, Andrew Y., 2008. Cheap and fast – but is it good? Evaluating non-expert annotations for natural language tasks. In: Proceedings of EMNLP’08, pp. 254–263.
Stolcke, Andreas, 2002. SRILM – an extensible language modeling toolkit. In: Proceedings of ICSLP-2002, Denver, Colorado, USA, pp. 901–904.
Tachbelie, Martha Yifiru, 2003. Automatic Amharic Speech Recognition System to Command and Control Computers, M.Sc. Thesis, School of Information Studies for Africa, Addis Ababa University, Ethiopia.
Tachbelie, Martha Yifiru, 2010. Morphology-Based Language Modeling for Amharic. Ph.D. thesis, University of Hamburg, Germany.
Tachbelie, Martha Yifiru, Abate, Solomon Teferra, Menzel, Wolfgang, 2009. Morpheme-based language modeling for Amharic speech recognition. In: Proceedings of the 4th Language and Technology Conference – LTC-09, pp. 114–118.
Tachbelie, Martha Yifiru, Abate, Solomon Teferra, Menzel, Wolfgang. Morpheme-based automatic speech recognition for a morphologically rich language – Amharic. In: Proceeding of SLTU’10, Penang, Malaysia, pp. 68–73.
Tachbelie, Martha Yifiru, Abate, Solomon Teferra, Menzel, Wolfgang, 2011. Morpheme-based and factored language modeling for Amharic speech recognition. Lecture Notes in Computer Science, Human Language Technology: Challenges for Computer Science and Linguists, vol. 6562, pp. 82–93.
Tachbelie, Martha Yifiru, Abate, Solomon Teferra, Besacier, Laurent, 2011. Part-of-speech tagging for under-resourced and morphologically rich languages – the case of Amharic. In: Proceedings of the HLTD 2011, pp. 50–55.
Tadesse, Kinfe, 2002. Sub-Word Based Amharic Speech Recognizer: An Experiment Using Hidden Markov Model (HMM). M.Sc. Thesis, School of Information Studies for Africa, Addis Ababa University, Ethiopia.
Thangarajan, 2008, Syllable based continuous speech recognition for Tamil, South Asian Language Review, 17, 71
Voigt, 1987, The classification of central semitic, Journal of Semitic Studies, 32, 1, 10.1093/jss/XXXII.1.1
Whittaker, E.W.D., Woodland, P.C., 2000. Particle-based language modeling. In: Proceeding of International Conference on Spoken Language Processing, pp. 170–173.
Whittaker, E.W.D., Van Thong, J.M., Moreno, P.J., 2001. Vocabulary independent speech recognition using particles. In: IEEE Workshop on Automatic Speech Recognition and Understanding, pp. 315–318.
Woodland, P.C., Leggetter, C.J., Odell, J.J., Valtchev, V., Young, S.J., 1995. The 1994 HTK large vocabulary speech recognition system. In: Proceedings of the 1995 International Conference on Acoustics, Speech and Signal Processing, vol. 1, pp. 73–76.
Yimam, Baye, 2007. yəamarɨNa səwasəw, second ed., EMPDE, Addis Ababa.
