iSocioBot: A Multimodal Interactive Social Robot

Springer Science and Business Media LLC - Volume 10 - Pages 5-19 - 2017
Zheng-Hua Tan1, Nicolai Bæk Thomsen1, Xiaodong Duan1, Evgenios Vlachos1, Sven Ewan Shepstone2, Morten Højfeldt Rasmussen1, Jesper Lisby Højvang3
1Department of Electronic Systems, Aalborg University, Aalborg, Denmark
2Bang and Olufsen A/S, Struer, Denmark
3MV-Nordic, Odense, Denmark

Abstract

We present one way of constructing a social robot that can interact with humans through multiple modalities. The robotic system directs its attention towards the dominant speaker using sound source localization and face detection, identifies persons using face recognition and speaker identification, and communicates and engages in dialog with humans through speech recognition, speech synthesis, and different facial expressions. The software is built upon the open-source Robot Operating System (ROS) framework and is made publicly available. Furthermore, the electrical parts (sensors, laptop, base platform, etc.) are standard components, allowing the system to be replicated. The design of the robot is unique, and we justify why it is suitable for our robot and its intended use. By making the software, hardware, and design accessible to everyone, we open research in social robotics to a broader audience. To evaluate the properties and the appearance of the robot, we invited users to interact with it in pairs (active interaction partner/observer) and collected their responses via an extended version of the Godspeed Questionnaire. The results suggest an overall positive impression of the robot and the interaction experience, as well as significant differences in responses depending on type of interaction and gender.
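The attention-direction behavior described above, fusing a sound source localization (SSL) bearing with the bearings of detected faces to pick the dominant speaker, could be sketched as below. This is a minimal illustrative sketch, not the authors' implementation: the function name, the angle representation, and the simple nearest-bearing fusion rule are all assumptions.

```python
# Illustrative sketch (assumed logic, not the paper's actual code):
# choose the detected face whose bearing best matches the SSL estimate.

def select_dominant_speaker(ssl_angle_deg, face_angles_deg, max_gap_deg=15.0):
    """Return the index of the detected face closest to the SSL bearing,
    or None if no face lies within max_gap_deg of the sound source."""
    best_idx = None
    best_gap = max_gap_deg
    for i, angle in enumerate(face_angles_deg):
        gap = abs(angle - ssl_angle_deg)
        if gap <= best_gap:
            best_idx, best_gap = i, gap
    return best_idx

# Example: sound arrives from ~10 degrees; faces are detected at
# -30, 12 and 45 degrees, so the face at 12 degrees (index 1) wins.
print(select_dominant_speaker(10.0, [-30.0, 12.0, 45.0]))  # -> 1
```

In a real system the fusion would run continuously on noisy estimates (e.g. with temporal smoothing), but the core decision, matching an audio bearing against visually detected faces, reduces to a nearest-neighbor choice like this one.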
