What is it like to be a robot? An embodied telepresence system with a variable perspective for collecting robot movements

Personal Technologies - Volume 27 - Pages 299-315 - 2022
Michael Suguitan1, Guy Hoffman1
1Cornell University, Ithaca, USA

Abstract

Movement and embodiment are central communicative capabilities for social robots, but designing embodied movements for robots typically requires deep knowledge of both the robot and movement theory. More accessible approaches such as learning from demonstration usually depend on physical access to the robot, which is generally limited to research settings. Machine learning (ML) algorithms can complement hand-designed or demonstrated movements by generating new behaviors, but this requires large and diverse training datasets, which are not readily available. In this work, we propose an embodied telepresence system for remotely collecting samples of affective robot movements that can serve as training data for ML. Remote users control the robot over the internet using the motion sensors in their smartphones and view its movement from either a first-person or a third-person perspective. We evaluated the system in an online study in which users created emotive movements for the robot and rated their experience. We then used the user-generated movements as input to a neural network to generate novel movements. We found that users strongly preferred the third-person perspective and that the ML-generated movements were largely comparable to the user-generated ones. This work supports the feasibility of telepresence robots as a movement-collection platform.
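The abstract describes feeding user-generated movements into a neural network that produces novel ones. Below is a minimal sketch of one plausible such generator, a variational autoencoder over fixed-length joint-angle trajectory windows; the window length, joint count, layer sizes, and flattened input encoding are illustrative assumptions, not the authors' architecture.

```python
# Sketch: a variational autoencoder over robot movement windows (PyTorch).
# SEQ_LEN, N_JOINTS, and LATENT are assumed values for illustration.
import torch
import torch.nn as nn

SEQ_LEN, N_JOINTS, LATENT = 64, 4, 8  # assumed window length / DoF / latent size

class MovementVAE(nn.Module):
    def __init__(self):
        super().__init__()
        d = SEQ_LEN * N_JOINTS
        self.encoder = nn.Sequential(nn.Linear(d, 256), nn.ReLU())
        self.mu = nn.Linear(256, LATENT)
        self.logvar = nn.Linear(256, LATENT)
        self.decoder = nn.Sequential(
            nn.Linear(LATENT, 256), nn.ReLU(), nn.Linear(256, d))

    def forward(self, x):  # x: (batch, SEQ_LEN, N_JOINTS) joint angles
        h = self.encoder(x.flatten(1))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization
        recon = self.decoder(z).view_as(x)
        return recon, mu, logvar

def vae_loss(recon, x, mu, logvar, beta=1.0):
    # Reconstruction term plus KL divergence to the standard normal prior.
    rec = nn.functional.mse_loss(recon, x, reduction="sum")
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + beta * kld

# Novel movements come from decoding samples drawn from the latent prior:
# z = torch.randn(1, LATENT)
# new_motion = model.decoder(z).view(1, SEQ_LEN, N_JOINTS)
```

In this kind of setup, the crowdsourced movement windows form the training set, and sampling or interpolating in the latent space yields the "novel movements" that the study compared against user-generated ones.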

Keywords

#social robots #machine learning #telepresence #movement collection #affective movement
