DEVELOPMENT OF HIGH-PERFORMANCE AND LARGE-SCALE VIETNAMESE AUTOMATIC SPEECH RECOGNITION SYSTEMS
Abstract
Keywords
#Speech recognition #Vietnamese #speech corpus
References
S. F. Chen and J. Goodman, “An empirical study of smoothing techniques for language modeling,”
in Proceedings of ACL, 1996, pp. 310–318.
M. Chu, Co so ngon ngu hoc va tieng Viet.
NXB Giao Duc, 1997.
S. Davis and P. Mermelstein, “Comparison of parametric representations for monosyllabic word
recognition in continuously spoken sentences,” IEEE, pp. 357–366, 1980.
N. Dehak, P. Kenny, R. Dehak, P. Dumouchel, and P. Ouellet, “Front end factor analysis for
speaker verification,” IEEE, 2010.
M. Gibson and T. Hain, “Hypothesis spaces for minimum Bayes risk training in large vocabulary
speech recognition,” in Proceedings of INTERSPEECH, 2006.
G. Hinton, “A practical guide to training restricted Boltzmann machines,” Momentum, vol. 9,
no. 1, p. 926, 2010.
P. Kenny, G. Boulianne, and P. Dumouchel, “Eigenvoice modeling with sparse training data,”
IEEE, vol. 13, no. 3, pp. 345–354, May 2005.
P. Kenny, G. Boulianne, P. Ouellet, and P. Dumouchel, “Joint factor analysis versus eigenchannels
in speaker recognition,” IEEE, vol. 15, no. 4, pp. 1435–1447, May 2007.
Q. Nguyen, T. Vu, and C. Luong, “Improving acoustic model for Vietnamese large vocabulary
continuous speech recognition system using tonal feature as input of deep neural network,”
Journal of Computer Science and Cybernetics, vol. 30, pp. 28–38, 2014.
V. Nguyen, C. Luong, T. Vu, and Q. Do, “Vietnamese recognition using tonal phoneme based
on multi space distribution,” Journal of Computer Science and Cybernetics, vol. 30, pp. 28–38,
D. Povey, A. Ghoshal, G. Boulianne, N. Goel, M. Hannemann, Y. Qian, P. Schwarz, and G. Stemmer,
“The Kaldi speech recognition toolkit,” in Proceedings of IEEE workshop, 2011.
D. Povey and B. Kingsbury, “Evaluation of proposed modifications to MPE for large scale
discriminative training,” in Proceedings of ICASSP, vol. 4, April 2007, pp. IV–321–IV–324.
A. Stolcke, “SRILM – an extensible language modeling toolkit,” in Proceedings of ICSLP, vol. 2,
Denver, USA, 2002, pp. 901–904.
V. Thang, L. C. Mai, and S. Nakamura, “An HMM-based Vietnamese speech synthesis system,”
in Proceedings of O-COCOSDA, 2009.
T. Vu, T. Nguyen, C. Luong, and J. Hosom, “Vietnamese large vocabulary continuous speech
recognition,” in Interspeech, 2005.
X. Z., J. Trmal, D. Povey, and S. Khudanpur, “Improving deep neural network acoustic mo