Reinforcement learning for robot soccer

Martin Riedmiller, Thomas Gabel, Roland Hafner, Sascha Lange
Department of Computer Science, Albert-Ludwigs-Universität Freiburg, Freiburg, Germany

Abstract

Keywords

