Speech quality assessment using 2D neurogram orthogonal moments