The symbiosis of DSP and speech recognition or an outsider's view of the inside

J.F. Kaiser¹
¹Department of Electrical and Computer Engineering, Duke University, Durham, NC, USA

Abstract

From an historical review of how we got to where we are now, we discuss the interrelationship between our system design objectives and goals, our modeling of the speech signal and its generation and parameterization, and the broadly developing DSP methodology. We take a critical look at some of the underlying assumptions in our modeling to see if they may be limiting the performance that can be obtained with ASR (automatic speech recognition) systems. We close with some open questions and challenges for new work.

Keywords

#Symbiosis #Digital signal processing #Speech recognition #Automatic speech recognition #Telephony #Speech synthesis #Physics #Laboratories #Speech coding #Mathematical model
