Learning Intelligent Behavior in a Non-stationary and Partially Observable Environment

Artificial Intelligence Review - Volume 18 - Pages 97-115 - 2002
Selçuk Şenkul1, Faruk Polat1
1Computer Engineering Department, Middle East Technical University, Ankara, Turkey

Abstract

Individual learning in an environment where more than one agent exists is a challenging task. In this paper, a single learning agent situated in an environment where multiple agents exist is modeled based on reinforcement learning. The environment is non-stationary and partially accessible from an agent's point of view. Therefore, the learning activities of an agent are influenced by the actions of other cooperative or competitive agents in the environment. A prey-hunter capture game that has the above characteristics is defined and used in experiments to simulate the learning process of individual agents. Experimental results show that there are no strict rules for reinforcement learning. We suggest two new methods to improve the performance of agents. These methods decrease the number of states while retaining as much state information as necessary.
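
The abstract does not describe the two proposed methods, so the following is only a generic sketch of the setting it names: a hunter that sees the prey only within a limited range and learns with a standard one-step tabular Q-learning update. The action set, the `sight` radius, the grid coordinates, and the names `observe`, `choose_action`, and `q_update` are all assumptions made for illustration and are not taken from the paper.

```python
import random
from collections import defaultdict

# Hypothetical action set for a grid-world hunter (not from the paper).
ACTIONS = ["N", "S", "E", "W", "stay"]


def observe(hunter, prey, sight=2):
    """Reduce the full grid state to a local observation.

    Returns the prey's offset relative to the hunter when the prey lies
    within `sight` cells on both axes, otherwise None (prey not visible).
    With sight=2 there are at most 25 offsets plus None, regardless of
    how large the grid is.
    """
    dx, dy = prey[0] - hunter[0], prey[1] - hunter[1]
    if abs(dx) <= sight and abs(dy) <= sight:
        return (dx, dy)
    return None


def choose_action(q, obs, epsilon=0.1, rng=random):
    """Epsilon-greedy selection over the Q-values of one observation."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(obs, a)])


def q_update(q, obs, action, reward, next_obs, alpha=0.1, gamma=0.9):
    """Standard one-step Q-learning update on the reduced observation."""
    best_next = max(q[(next_obs, a)] for a in ACTIONS)
    q[(obs, action)] += alpha * (reward + gamma * best_next - q[(obs, action)])


# Q-table keyed by (observation, action); unseen entries default to 0.0.
q = defaultdict(float)
obs = observe(hunter=(3, 3), prey=(4, 2))
next_obs = observe(hunter=(4, 3), prey=(4, 2))
q_update(q, obs, "E", reward=1.0, next_obs=next_obs)
```

Keying the table on the relative offset rather than on absolute positions is one common way to shrink the state space under partial observability. Because other agents also move and learn, the same observation can lead to different outcomes over time, which is the non-stationarity the abstract refers to.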
