Improved and scalable online learning of spatial concepts and language models with mapping

Autonomous Robots - Volume 44 - Pages 927-946 - 2020
Akira Taniguchi1, Yoshinobu Hagiwara1, Tadahiro Taniguchi1, Tetsunari Inamura2
1Ritsumeikan University, Kusatsu, Japan
2The National Institute of Informatics / SOKENDAI (The Graduate University for Advanced Studies), Tokyo, Japan

Abstract

We propose a novel online learning algorithm, called SpCoSLAM 2.0, for acquiring spatial concepts and language with high accuracy and scalability. Previously, we proposed SpCoSLAM, an online learning algorithm based on an unsupervised Bayesian probabilistic model that integrates multimodal place categorization, lexical acquisition, and SLAM. However, the original algorithm had limited estimation accuracy because of the influence of the early stages of learning, and its computational complexity grew as training data were added. We therefore introduce techniques such as fixed-lag rejuvenation to reduce the computation time while maintaining an accuracy higher than that of the original algorithm. The results show that, in terms of estimation accuracy, the proposed algorithm outperforms the original algorithm and is comparable to batch learning. In addition, the computation time of the proposed algorithm does not depend on the amount of training data and stays constant at each step, which makes the algorithm scalable. Our approach will contribute to the realization of long-term spatial language interactions between humans and robots.
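The idea behind fixed-lag rejuvenation can be illustrated with a small sketch. The code below is not SpCoSLAM 2.0; it is a hypothetical single-sample toy that clusters 1-D points online with a finite Gaussian mixture. The class name `FixedLagClusterer`, the helper `run_demo`, and all parameter values are assumptions made for illustration. It shows the one property the abstract relies on: when only the most recent `lag` latent assignments are resampled, the work per step is bounded by `lag + 1` draws regardless of how much data has been seen.

```python
import math
import random
from collections import deque


class FixedLagClusterer:
    """Toy online clustering of 1-D points with fixed-lag rejuvenation.

    Assumed model (illustrative only): k Gaussian clusters with known
    variance, a zero-mean Gaussian prior on each cluster mean, and a
    symmetric Dirichlet prior (pseudo-count alpha) on the mixing weights.
    """

    def __init__(self, k, lag, sigma=1.0, alpha=1.0, prior_var=100.0, seed=0):
        self.k = k
        self.sigma2 = sigma ** 2
        self.alpha = alpha
        self.prior_var = prior_var
        self.rng = random.Random(seed)
        self.counts = [0] * k            # points currently assigned to each cluster
        self.sums = [0.0] * k            # sum of the points in each cluster
        self.window = deque(maxlen=lag)  # the only assignments still open to resampling
        self.draws = 0                   # number of assignment draws, a proxy for cost

    def _log_predictive(self, x, c):
        # Posterior predictive density of x under cluster c (Normal-Normal conjugacy).
        post_var = 1.0 / (1.0 / self.prior_var + self.counts[c] / self.sigma2)
        post_mean = post_var * self.sums[c] / self.sigma2
        var = post_var + self.sigma2
        return -0.5 * math.log(2.0 * math.pi * var) - 0.5 * (x - post_mean) ** 2 / var

    def _draw(self, x):
        # Sample a cluster for x given the current sufficient statistics.
        logw = [math.log(self.counts[c] + self.alpha) + self._log_predictive(x, c)
                for c in range(self.k)]
        top = max(logw)
        weights = [math.exp(v - top) for v in logw]
        self.draws += 1
        return self.rng.choices(range(self.k), weights=weights)[0]

    def _add(self, x, z):
        self.counts[z] += 1
        self.sums[z] += x

    def _remove(self, x, z):
        self.counts[z] -= 1
        self.sums[z] -= x

    def step(self, x):
        # 1) Assign the new point; the oldest window entry drops out and is frozen.
        z = self._draw(x)
        self._add(x, z)
        self.window.append((x, z))
        # 2) Rejuvenate: resample only the assignments inside the fixed-lag window.
        for i in range(len(self.window)):
            xi, zi = self.window[i]
            self._remove(xi, zi)
            zi = self._draw(xi)
            self._add(xi, zi)
            self.window[i] = (xi, zi)


def run_demo(n_steps, lag):
    """Feed synthetic points from two well-separated sources (made-up data)."""
    model = FixedLagClusterer(k=2, lag=lag)
    data_rng = random.Random(1)
    for t in range(n_steps):
        center = -5.0 if t % 2 == 0 else 5.0
        model.step(data_rng.gauss(center, 1.0))
    return model
```

Calling `run_demo(200, 5)` and inspecting `model.draws` before and after a further `step` shows that each step costs `lag + 1` draws once the window is full. Resampling every past assignment instead would make the per-step cost grow with the number of observations, which is the scaling problem the abstract attributes to the original algorithm.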

Keywords

#online learning #spatial concepts #language models #SLAM #lexical acquisition #scalability
