Evaluating embodied conversational agents in multimodal interfaces
Abstract
Evaluation methods for Embodied Conversational Agents as human-computer interfaces are structured and presented on the basis of cross-disciplinary approaches. An introductory systematisation of evaluation topics from a conversational perspective is followed by an explanation of the social-psychological phenomena studied in interaction with Embodied Conversational Agents, and of how these phenomena can be used for evaluation purposes. Major evaluation concepts and appropriate assessment instruments, both established and new, are presented, including questionnaires, annotations and log files. An example evaluation and a set of guidelines provide hands-on information on planning and preparing such evaluations.