A survey of inverse reinforcement learning techniques
References
Abbeel, P. and Ng, A. (2004), “Apprenticeship learning via inverse reinforcement learning”, Proceedings of the 21st International Conference on Machine Learning, p. 1.
Abbeel, P., Coates, A., Quigley, M. and Ng, A. (2007), “An application of reinforcement learning to aerobatic helicopter flight”, Advances in Neural Information Processing Systems 19: Proceedings of the 2006 Conference, p. 1.
Babes, M., Marivate, V., Littman, M. and Subramanian, K. (2010), “Apprenticeship learning about multiple intentions”, Proceedings of International Conference on Machine Learning (ICML 2011).
Boularias, A. and Chaib‐Draa, B. (2011), “Bootstrapping apprenticeship learning”, Proceedings of Neural Information Processing Systems, 2010.
Boularias, A., Kober, J. and Peters, J. (2011), “Relative entropy inverse reinforcement learning”, Proceedings of Fourteenth International Conference on Artificial Intelligence and Statistics (AISTATS 2011), JMLR WC&P, Vol. 15, pp. 182‐9.
Chandramohan, S., Geist, M., Lefevre, F. and Pietquin, O. (2011), “User simulation in dialogue systems using inverse reinforcement learning”, Proceedings of the 12th Annual Conference of the International Speech Communication Association (Interspeech 2011), Florence (Italy), August.
Choi, J. and Kim, K. (2009), “Inverse reinforcement learning in partially observable environments”, Proceedings of the 21st International Joint Conference on Artificial Intelligence (IJCAI), pp. 1028‐33.
Dimitrakakis, C. and Rothkopf, C. (2011), “Bayesian multitask inverse reinforcement learning”, paper presented at the 9th European Workshop on Reinforcement Learning (EWRL 2011), Athens, Greece, 9‐11 September.
Grollman, D. and Billard, A. (2011), “Donut as I do: learning from failed demonstrations”, IEEE International Conference on Robotics and Automation, Shanghai, 9‐13 May, pp. 9‐13.
Heskes, T. (1998), “Solving a huge number of similar tasks: a combination of multi‐task learning and a hierarchical Bayesian approach”, Proceedings of the 15th International Conference on Machine Learning (ICML'98), pp. 233‐41.
Kaelbling, L., Littman, M. and Moore, A. (1996), “Reinforcement learning: a survey”, Journal of Artificial Intelligence Research, Vol. 4, pp. 237‐85.
Mason, M. and Lopes, M. (2011), “Robot self‐initiative and personalization by learning through repeated interactions”, Proceedings of the 6th International Conference on Human‐robot Interaction, pp. 433‐40.
Murphy, K. (2000), “A survey of POMDP solution techniques”, Environment, Vol. 2, p. X3.
Neu, G. and Szepesvári, C. (2007), “Apprenticeship learning using inverse reinforcement learning and gradient methods”, Proceedings of the Twenty‐third Annual Conference on Uncertainty in Artificial Intelligence (UAI‐07), pp. 295‐302.
Ng, A. and Russell, S. (2000), “Algorithms for inverse reinforcement learning”, Proceedings of the Seventeenth International Conference on Machine Learning, pp. 663‐70.
Qiao, Q. and Beling, P. (2011), “Inverse reinforcement learning with Gaussian process”, American Control Conference (ACC), pp. 113‐18.
Ramachandran, D. and Amir, E. (2007), “Bayesian inverse reinforcement learning”, Proceedings of the 20th International Joint Conference on Artificial Intelligence.
Ratliff, N., Bagnell, J. and Zinkevich, M. (2006), “Maximum margin planning”, Proceedings of the 23rd International Conference on Machine Learning, pp. 729‐36.
Rothkopf, C. and Dimitrakakis, C. (2011), “Preference elicitation and inverse reinforcement learning”, Proceedings of 22nd European Conference on Machine Learning ECML, Part III, LNAI 6913, pp. 34‐48.
Silva, V., Costa, A. and Lima, P. (2006), “Inverse reinforcement learning with evaluation”, IEEE International Conference on Robotics and Automation (ICRA06), Orlando, FL, USA, pp. 4246‐51.
Zhang, H. and Parkes, D. (2008), “Enabling environment design via active indirect elicitation”, Proceedings Workshop on Preference Handling, Chicago, IL.
Ziebart, B., Maas, A., Bagnell, J. and Dey, A. (2008), “Maximum entropy inverse reinforcement learning”, Proceedings 23rd AAAI Conference Artificial Intelligence, pp. 1433‐8.
