Cognitive factors in the evaluation of synthetic speech

Speech Communication - Tập 24 - Trang 153-168 - 1998
Cristina Delogu1, Stella Conte2, Ciro Sementina3
1Voice Communication Group, Multimedia Communication Division, Fondazione Ugo Bordoni, Via B. Castiglione, 59 00142 Rome, Italy
2Dipartimento di Psicologia, Universita' di Palermo, Palermo, Italy
3Dipartimento di Psicologia, Universita' di Roma, Rome, Italy

Tài liệu tham khảo

Abrams, 1969, Syntactic structure modifies attention during speech perception and recognition, Quart. J. Exp. Psychol., 21, 280, 10.1080/14640746908400223 Allen, 1992, Overview of Text-to-Speech Systems, 741 Bendat, 1972 Benoit, 1990, An intelligibility test using semantically unpredictable sentences: Towards the quantification of linguistic complexity, Speech Communication, 9, 293, 10.1016/0167-6393(90)90005-T Boogaart, 1992, Evaluating the overall comprehensibility of speech synthesizers, Vol. 2, 1207 Carlson, 1989, Evaluation and development of the KTH Text-To-Speech system on the segmental level, 1.3.1 Carlson, 1992, The listening speed paradigm for synthesis evaluation, 63 Damos, 1985, The effect of asymmetric transfer and speech technology on dual task performance, Human Factors, 27, 409, 10.1177/001872088502700405 Delogu, 1991, New directions in the evaluation of voice input/output systems, IEEE J. Select. Areas Commun., 9, 10.1109/49.81950 Delogu, 1991, Quality evaluation of text-to-speech synthesizers using magnitude estimation, categorical estimation, pair comparison, and reaction time methods, 353 Delogu, 1992, Comprehension of natural and synthetic speech: A preliminary investigation, 109 Delogu, 1992, Two different methodologies for evaluating the comprehension of synthetic passages, 1231 Delogu, 1993, A methodology for evaluating human-machine spoken language interaction, Vol. 2, 1427 Delogu, 1995, Intelligibility of speech produced by text-to-speech systems in good and telephonic conditions, Acta Acustica, 3, 89 ETSI (European Telecommunications Standards Institute). Technical Report (1993), Human Factors (HF), Guide for usability evaluations, ETR 095. Foss, 1970, Some effects on ambiguity upon sentence comprehension, J. Verbal Learning Verbal Behav., 9, 699, 10.1016/S0022-5371(70)80035-4 Gleiss, 1992, Usability — Concepts and Evaluation, Tele, 2 Goldstein, 1995, Classification of methods used for assessment of text-to-speech systems according to the demands placed on the listener, Speech Communication, 16, 225, 10.1016/0167-6393(94)00047-E Grice, 1991, Assessment of intonation in text-to-speech synthesis systems — a pilot test in English and Italian, Vol. 2, 879 Hazan, 1992, Quantification of listener variability, Esprit Project 2589 (SAM), Interim Report House, 1965, Articulation-testing methods: Consonantal differentiation with a closed-response set, J. Acoust. Soc. Amer., 37, 158, 10.1121/1.1909295 Howard-Jones, 1992, Specification of listener dimensions, Esprit Project 2589 (SAM), Interim Report Jerison, 1970, Vigilance, Discrimination, and Attention, 127 Jekosch, 1992, The cluster-identification test, Vol. I, 205 Jekosch, 1993, Speech quality assessment and evaluation, Vol. 2, 1387 Jekosch, 1994, Speech intelligibility testing: On the interpretation of results, J. Amer. Voice Input/Output Soc., 15, 63 Levelt, 1978, A survey of studies in sentence perception Lovett Doust, 1978, An ultradian rhythm of reaction time measurements in man, Neuropsychobiology, 4, 93, 10.1159/000117623 Luce, 1983, Capacity demands in short-term memory for synthetic and natural speech, Human Factors, 25, 17, 10.1177/001872088302500102 Malsheen, 1990, The interrelationship of intelligibility and naturalness in text-to-speech, 333 Marics, 1988, The intelligibility of synthesized speech in data inquiry systems, Human Factors, 30, 719, 10.1177/001872088803000608 Marslen-Wilson, 1980, The temporal structure of spoken language understanding, Cognition, 8, 1, 10.1016/0010-0277(80)90015-3 Miller, 1963, Some perceptual consequences of linguistic rules, J. Verbal Learning Behav., 2, 217, 10.1016/S0022-5371(63)80087-0 Moray, 1967, Where is capacity limited? A survey and a model, Acta Psychol., 27, 84, 10.1016/0001-6918(67)90048-0 Neisser, 1967 Neovius, 1993, Comprehension of KTH text-tospeech with listening speed paradigm, 1687 Nooteboom, 1977, Contributions of prosody to speech perception, 75 Nusbaum, 1990, The role of learning and attention in speech perception, 409 Nusbaum, 1995, Measuring the Naturalness of Synthetic Speech, Internat. J. Speech Technol., 1, 7, 10.1007/BF02277176 Nye, 1975, Synthetic Speech Comprehension: A comparison of listener performances with and preferences among different speech forms, Haskins Laboratories: Status Report on Speech research, SR41 Parasuraman, 1979, Memory load and event rate control sensitivity decrements in sustained attention, Science, 205, 924, 10.1126/science.472714 Pavlovic, 1990, Use of the magnitude estimation technique for assessing the performance of text-to-speech synthesis systems, J. Acoust. Soc. Amer., 87, 373, 10.1121/1.399258 Pisoni, 1981, Some current theoretical issues in speech perception, Cognition, 10, 249, 10.1016/0010-0277(81)90054-8 Pols, 1992, Quality assessment of text-to-speech synthesis by rule 1992, Multilingual synthesis evaluation method, Vol. 1, 181 Ralston, 1989, Comprehension of synthetic speech produced by rule Ralston, 1990, Measuring the workload of comprehending spoken discourse: A first report Ralston, 1991, Comprehension of synthetic speech produced by rule: Word monitoring and sentence-by-sentence listening time, Human Factors, 33, 471, 10.1177/001872089103300408 Rosson, 1985, Listener training for speech output applications, 193 Salza, 1993, Development of a context dependent methodology for Text-to-Speech Synthesis evaluation in Interactive Dialogue Systems, Esprit Project 6819 (SAM-A), Speech Technology Assessment in Multilingual Applications, SAM-A Periodic Progress Report Year 1, 1 April 1993-30 September 1993 Samuels, 1987, Factors that influence listening and reading comprehension, 295 Schwab, 1985, Some effects of training on the perception of synthetic speech, Human Factors, 27, 395, 10.1177/001872088502700404 Silverman, 1990, Evaluating synthesizers performance: Is intelligibility enough?, 981 Slowiaczek, 1985, Effects of speech rate and pitch contour on the perception of synthetic speech, Human Factors, 27, 701, 10.1177/001872088502700609 Spiegel, 1990, Comprehensive assessment of the telephone intelligibility of synthetic and natural speech, Speech Communication, 9, 279, 10.1016/0167-6393(90)90004-S Sydeserff, 1991, Evaluation of synthetic speech techniques in a comprehension task, Vol. 1, 277 Treisman, 1969, Strategies and models of selective attention, Psychological Rev., 76, 282, 10.1037/h0027242 Van Bezooijen, 1990, Sam Segmental Test. Esprit Project 2589 (Sam) Van Santen, 1993, Perceptual experiments for diagnostic testing of text-to-speech systems, Computer Speech and Language, 7, 49, 10.1006/csla.1993.1004 Voiers, 1983, Evaluating processed speech using the Diagnostic Rhyme Test, Speech Technol., 1, 30 Warm, 1984, An introduction to vigilance, 1 Waterworth, 1987 Wickens, 1981, Multiple resources, task-hemispheric integrity and individual differences in time sharing, Human Factors, 23, 211, 10.1177/001872088102300209