Analyzing and visualizing multiagent rewards in dynamic and stochastic domains

Adrian Agogino¹, Kagan Tumer²
¹ University of California, Santa Cruz, Santa Cruz, USA
² Oregon State University, 204 Rogers Hall, Corvallis, OR, 97330, USA

