Learning Intelligent Behavior in a Non-stationary and Partially Observable Environment
Abstract
Keywords
References
Abul, O., Polat, F. & Alhajj, R. (2000). Multi-Agent Reinforcement Learning Using Function Approximation. IEEE Transactions on Systems, Man, and Cybernetics 30(4): 485–497.
Bellman, R. E. (1957). Dynamic Programming. Princeton, NJ: Princeton University Press.
Ellis, H. C. (1972). Fundamentals of Human Learning and Cognition. Dubuque, IA: Wm. C. Brown Company Publishers.
Estes, W. K. (1970). Learning Theory and Mental Development. New York, NY: Academic Press.
Howard, R. A. (1960). Dynamic Programming and Markov Processes. Cambridge, MA: The MIT Press.
Hu, J. & Wellman, M. P. (1998). Multi-Agent Reinforcement Learning: Theoretical Framework and an Algorithm. Proc. of Int. Conf. on Machine Learning, 242–250.
Hu, J. & Wellman, M. P. (1998). Multiagent Reinforcement Learning and Stochastic Games. Games and Economic Behavior.
Hulse, S. H., Egeth, H. & Deese, J. (1984). The Psychology of Learning. McGraw-Hill.
Kaelbling, L. P., Littman, M. L. & Moore, A. W. (1996). Reinforcement Learning: A Survey. Journal of Artificial Intelligence Research 4: 237–285.
Kaelbling, L. P., Littman, M. L. & Cassandra, A. R. (1998). Planning and Acting in Partially Observable Stochastic Domains. Artificial Intelligence 101: 99–134.
Keller, F. S. (1969). Reinforcement Theory. New York, NY: Random House.
Kodratoff, Y. (1998). Introduction to Machine Learning. Morgan Kaufmann.
Kuter, U. & Polat, F. (2000). Learning Better in Dynamic, Partially Observable Environment. In Lindemann, G. (ed.) Proc. of European Conf. on Artificial Intelligence (ECAI) Workshop on Modeling Artificial Societies and Hybrid Organization, 50–68. Berlin, Aug. 20–25.
Langley, P. (1995). Elements of Machine Learning. Morgan Kaufmann.
Littman, M. L., Cassandra, A. R. & Kaelbling, L. P. (1995). Learning Policies for Partially Observable Environments: Scaling up. In Huhns, M. N. & Singh, M. P. (eds.) Readings in Agents, 495–503. Morgan Kaufmann.
Minsky, M. (1961). Steps towards Artificial Intelligence. Proceedings of the IRE, 8–30. Reprinted in Feigenbaum, E. A. & Feldman, J. (eds.) Computers and Thought, 406–450. New York, NY: McGraw-Hill.
Mitchell, T. M. (1997). Machine Learning. New York, NY: McGraw-Hill.
Polat, F., Guvenir, S. & Shekhar, S. (1993). A Negotiation Platform for Cooperating Multi-Agent Systems. International Journal of Concurrent Engineering: Research & Applications 3: 179–187.
Polat, F. & Guvenir, A. (1994). A Conflict Resolution Based Decentralized Multi-Agent Problem Solving Model. Artificial Social Systems, LNAI 130, 279–294. Springer-Verlag.
Russell, S. J. & Norvig, P. (1997). Artificial Intelligence: A Modern Approach. Englewood Cliffs, NJ: Prentice-Hall International, Inc.
Sen, S., Sekaran, M. & Hale, J. (1994). Learning to Coordinate without Sharing Information. In Huhns, M. N. & Singh, M. P. (eds.) Readings in Agents, 509–514. Morgan Kaufmann.
Sutton, R. S. & Barto, A. G. (1998). Reinforcement Learning: An Introduction. MIT Press.
Tan, M. (1993). Multi-Agent Reinforcement Learning: Independent vs. Cooperative Agents. In Huhns, M. N. & Singh, M. P. (eds.) Readings in Agents, 487–494. Morgan Kaufmann.
Turing, A. M. (1950). Computing Machinery and Intelligence. Mind 59: 433–460. Reprinted in Mind Design II, 29–6. Cambridge, MA: MIT Press.
Watkins, C. J. C. H. (1989). Learning from Delayed Rewards. PhD Thesis, University of Cambridge, England.
Watkins, C. J. C. H. & Dayan, P. (1992). Technical Note: Q-Learning. Machine Learning 8: 279–292.
Weiss, G. (1996). Adaptation and Learning in Multi-Agent Systems: Some Remarks and a Bibliography. In Weiss, G. & Sen, S. (eds.) Adaption and Learning in Multi-Agent Systems. Berlin: Springer.
Weiss, G. (1999). Multi-Agent Systems: A Modern Approach to Distributed Artificial Intelligence, 28–77. MIT Press.