A segmental framework for fully-unsupervised large-vocabulary speech recognition
References
Abdel-Hamid, 2013, Deep segmental neural networks for speech recognition
Badino, 2014, An auto-encoder based approach to unsupervised learning of subword units
Badino, 2015, Discovering discrete subword units with binarized autoencoders and hidden-Markov-model encoders
Bisani, 2004, Bootstrap estimates for confidence intervals in ASR performance evaluation, 409
Bortfeld, 2005, Mommy and me: familiar names help launch babies into speech-stream segmentation, Psychol. Sci., 16, 298, 10.1111/j.0956-7976.2005.01531.x
Chen, 2015, Parallel inference of Dirichlet process Gaussian mixture models for unsupervised acoustic modeling: a feasibility study
Chung, 2013, Unsupervised discovery of linguistic structure including two-level acoustic patterns using three cascaded stages of iterative optimization
De Vries, 2014, A smartphone-based ASR data collection tool for under-resourced languages, Speech Commun., 56, 119, 10.1016/j.specom.2013.07.001
Dredze, 2010, NLP on spoken documents without ASR
Eimas, 1999, Segmental and syllabic representations in the perception of speech by young infants, J. Acoust. Soc. Am., 105, 1901, 10.1121/1.426726
Feldman, 2009, Learning phonetic categories by learning a lexicon
Gillick, 2011, Don’t multiply lightly: quantifying problems with the acoustic model assumptions in speech recognition
Gish, 2009, Unsupervised training of an HMM-based speech recognizer for topic classification
Goldwater, 2007, A fully Bayesian approach to unsupervised part-of-speech tagging
Goldwater, 2009, A Bayesian framework for word segmentation: exploring the effects of context, Cognition, 112, 21, 10.1016/j.cognition.2009.03.008
Heymann, 2013, Unsupervised word segmentation from noisy input
Jansen, 2011, Towards unsupervised training of speaker independent acoustic models
Jansen, 2013, A summary of the 2012 JHU CLSP workshop on zero resource speech technologies and models of early language acquisition
Jansen, 2013, Weak top-down constraints for unsupervised acoustic model training
Jansen, 2011, Efficient spoken term discovery using randomized algorithms
Kamper, 2015, Unsupervised neural network based feature extraction using weak top-down constraints
Kamper, 2015, Fully unsupervised small-vocabulary speech recognition using a segmental Bayesian model
Kamper, 2016, Unsupervised word segmentation and lexicon discovery using acoustic word embeddings, IEEE/ACM Trans. Audio Speech Lang. Process., 24, 669, 10.1109/TASLP.2016.2517567
Kamper, 2014, Unsupervised lexical clustering of speech segments using fixed-dimensional acoustic embeddings
Kamper, 2016, Deep convolutional acoustic word embeddings using word-pair side information
Lee, 2012, A nonparametric Bayesian approach to acoustic model discovery
Lee, 2015, Unsupervised lexicon discovery from acoustic input, Trans. ACL, 3, 389
Lee, 2013, Enhanced spoken term detection using support vector machines and weighted pseudo examples, IEEE Trans. Audio Speech Lang. Process., 21, 1272, 10.1109/TASL.2013.2248721
Levin, 2013, Fixed-dimensional acoustic embeddings of variable-length segments in low-resource settings
Levin, 2015, Segmental acoustic indexing for zero resource keyword search
Ludusan, 2014, Bridging the gap between speech technology and natural language processing: an evaluation toolbox for term discovery systems
Lyzinski, 2015, An evaluation of graph clustering methods for unsupervised term discovery
Martin, 2015, Utterance classification in speech-to-speech translation for zero-resource languages in the hospital administration domain
McQueen, 1998, Segmentation of continuous speech using phonotactics, J. Mem. Lang., 39, 21, 10.1006/jmla.1998.2568
Mochihashi, 2009, Bayesian unsupervised word segmentation with nested Pitman-Yor language modeling
Murphy, 2007, Conjugate Bayesian analysis of the Gaussian distribution, http://www.cs.ubc.ca/~murphyk/mypapers.html
Murphy, 2012
Neubig, 2010, Learning a language model from continuous speech
Park, 2008, Unsupervised pattern discovery in speech, IEEE Trans. Audio Speech Lang. Process., 16, 186, 10.1109/TASL.2007.909282
Pitt, 2005, The Buckeye corpus of conversational speech: labeling conventions and a test of transcriber reliability, Speech Commun., 45, 89, 10.1016/j.specom.2004.09.001
Räsänen, 2012, Computational modeling of phonetic and lexical learning in early language acquisition: existing models and future directions, Speech Commun., 54, 975, 10.1016/j.specom.2012.05.001
Räsänen, 2015, Unsupervised word discovery from speech using automatic segmentation into syllable-like units
Räsänen, 2017, Pre-linguistic rhythmic segmentation of speech into syllabic units, Completed for submission
Renshaw, 2015, A comparison of neural network methods for unsupervised representation learning on the Zero Resource Speech Challenge
Resnik, 2010, Gibbs sampling for the uninitiated
Scott, 2002, Bayesian methods for hidden Markov models, J. Am. Stat. Assoc., 97, 337, 10.1198/016214502753479464
Shum, 2016, On the use of acoustic unit discovery for language recognition, IEEE/ACM Trans. Audio Speech Lang. Process., 24, 1665
Siu, 2014, Unsupervised training of an HMM-based self-organizing unit recognizer with applications to topic classification and keyword discovery, Comput. Speech Lang., 28, 210, 10.1016/j.csl.2013.05.002
Sun, 2013, Joint training of non-negative Tucker decomposition and discrete density hidden Markov models, Comput. Speech Lang., 27, 969, 10.1016/j.csl.2012.09.006
Synnaeve, 2014, Phonetics embedding learning with side information
Taniguchi, 2016, Symbol emergence in robotics: a survey, Adv. Robotics, 30, 706, 10.1080/01691864.2016.1164622
Thiollière, 2015, A hybrid dynamic time warping-deep neural network architecture for unsupervised acoustic modeling
Varadarajan, 2008, Unsupervised learning of acoustic sub-word units
Versteegh, 2016, The zero resource speech challenge 2015: proposed approaches and results
Versteegh, 2015, The Zero Resource Speech Challenge 2015
Walter, 2013, A hierarchical system for word discovery exploiting DTW-based initialization
Wilkinson, 2016, Deriving phonetic transcriptions and discovering word segmentations for speech-to-speech translation in low-resource settings, 10.21437/Interspeech.2016-1319
Zeghidour, 2016, Joint learning of speaker and phonetic similarities with Siamese networks, 10.21437/Interspeech.2016-811
Zeghidour, 2016, A deep scattering spectrum-deep Siamese network pipeline for unsupervised acoustic modeling
Zeiler, 2013, On rectified linear units for speech processing
Zhang, 2010, Towards multi-speaker unsupervised speech pattern discovery
Zhang, 2012, Resource configurable spoken query detection using deep Boltzmann machines
Zweig, 2010, SCARF: a segmental conditional random field toolkit for speech recognition
