Fast vocabulary acquisition in an NMF-based self-learning vocal user interface

Computer Speech & Language - Volume 28 - Pages 997-1017 - 2014
Bart Ons, Jort F. Gemmeke, Hugo Van hamme
Department ESAT-PSI, KU Leuven, Leuven, Belgium

References

Akata, 2011, Non-negative matrix factorization in multimodality data for segmentation and label prediction
Altosaar, 2010, A speech corpus for modeling language acquisition: CAREGIVER
ten Bosch, 2009, On a computational model for language acquisition: modeling cross-speaker generalisation, 315
Boves, 2007, ACORNS - towards computational modeling of communication and recognition skills, 349
Caicedo, 2012, Multimodal representation, indexing, automated annotation and retrieval of image collections via non-negative matrix factorization, Neurocomputing, 76, 50, 10.1016/j.neucom.2011.04.037
Clark, 1989, Contributing to discourse, Cogn. Sci., 13, 259, 10.1207/s15516709cog1302_7
Clemente, 2012, Incremental word learning: efficient HMM initialization and large margin discriminative adaptation, Speech Commun., 54, 1029, 10.1016/j.specom.2012.04.005
Davis, 1980, Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences, IEEE Trans. Acoust. Speech Signal Process., 28, 357, 10.1109/TASSP.1980.1163420
Demuynck, 2001
Driesen, 2012
Driesen, 2009, Adaptive non-negative matrix factorization in a computational model of language acquisition, 1711
Driesen, 2012, Weakly supervised keyword learning using sparse representations of speech, 5145
Driesen, 2012, Data-driven speech representations for NMF-based word learning
Driesen, 2011, Modelling vocabulary acquisition, adaptation and generalization in infants using adaptive Bayesian PLSA, Neurocomputing, 74, 1874, 10.1016/j.neucom.2010.07.036
Driesen, 2012, Fast word acquisition in an NMF-based learning framework, 5137
Driesen, 2012, Supervised input space scaling for non-negative matrix factorization, Signal Process., 92, 1864, 10.1016/j.sigpro.2011.07.016
Gemmeke, 2013, Self-taught assistive vocal interfaces: an overview of the ALADIN project
Heinroth, 2012, Adaptive speech understanding for intuitive model-based spoken dialogues, 1281
Demuynck, 2008, SPRAAK: an open source speech recognition and automatic annotation kit
Kuhn, 2000, Rapid speaker adaptation in eigenvoice space, IEEE Trans. Speech Audio Process., 8, 695, 10.1109/89.876308
Lee, 1999, Learning the parts of objects by nonnegative matrix factorization, Nature, 401, 788, 10.1038/44565
van de Loo, 2012, Towards a self-learning assistive vocal interface: vocabulary and grammar learning
Miyawaki, 1975, An effect of linguistic experience: the discrimination of [r] and [l] by native speakers of Japanese and English, Atten. Percept. Psychophys., 18, 331, 10.3758/BF03211209
Ons, 2012, Label noise robustness and learning speed in a self-learning vocal user interface
Ons, 2013, NMF-based keyword learning from scarce data
Ons, 2013, A self learning vocal interface for speech-impaired users, 1
Oostdijk, 2000, The spoken Dutch corpus. Overview and first evaluation
Paek, 2007, Improving command and control speech recognition on mobile devices: using predictive user models for language modeling, User Model. User-Adap. Inter., 17, 93, 10.1007/s11257-006-9021-6
Parker, 2006, Automatic speech recognition and training for severely dysarthric users of assistive technology: the STARDUST project, Clin. Linguist. Phon., 20, 149, 10.1080/02699200400026884
Potamianos, 1998, Spoken dialog systems for children, 197
Quine, 1964, vol. 4
Rabiner, 1989, A tutorial on hidden Markov models and selected applications in speech recognition, Proc. IEEE, 77, 257, 10.1109/5.18626
Robinson, 1995, WSJCAM0: A British English speech corpus for large vocabulary continuous speech recognition
Stouten, 2008, Discovering phone patterns in spoken utterances by non-negative matrix factorization, IEEE Signal Processing Letters, 15, 131, 10.1109/LSP.2007.911723
Sun, 2012
Sun, 2011, Image pattern discovery by using the spatial closeness of visual code words, 205
Sun, 2011, A two-layer non-negative matrix factorization model for vocabulary discovery
Sun, 2012, Tri-factorization learning of sub-word units with application to vocabulary acquisition, 5177
Sun, 2013, Joint training of non-negative Tucker decomposition and discrete density hidden Markov models, Comput. Speech Lang., 27, 969, 10.1016/j.csl.2012.09.006
Van hamme, 2008, HAC-models: a novel approach to continuous speech recognition, 255
Van Segbroeck, 2009, Unsupervised learning of time-frequency patches as a noise-robust representation of speech, Speech Commun., 51, 1124, 10.1016/j.specom.2009.05.003
Werker, 1988, Cross-language speech perception: Initial capabilities and developmental change, Dev. Psychol., 24, 672, 10.1037/0012-1649.24.5.672
Wessel, 2001, Confidence measures for large vocabulary continuous speech recognition, IEEE Trans. Speech Audio Process., 9, 288, 10.1109/89.906002