A survey of inverse reinforcement learning techniques

Emerald - 2012
Shao Zhifei (1), Er Meng Joo (1)
(1) School of Electrical and Electronics Engineering, Nanyang Technological University, Singapore

References

Abbeel, P. and Ng, A. (2004), “Apprenticeship learning via inverse reinforcement learning”, Proceedings of the 21st International Conference on Machine Learning, p. 1.

10.1177/0278364910371999

10.1007/978-3-642-00196-3_45

Abbeel, P., Coates, A., Quigley, M. and Ng, A. (2007), “An application of reinforcement learning to aerobatic helicopter flight”, Advances in Neural Information Processing Systems 19: Proceedings of the 2006 Conference, p. 1.

10.1109/IROS.2008.4651222

10.1162/089976698300017746

10.1016/j.robot.2008.10.024

Babes, M., Marivate, V., Littman, M. and Subramanian, K. (2010), “Apprenticeship learning about multiple intentions”, Proceedings of International Conference on Machine Learning (ICML 2011).

Boularias, A. and Chaib‐Draa, B. (2011), “Bootstrapping apprenticeship learning”, Proceedings of Neural Information Processing Systems, 2010.

Boularias, A., Kober, J. and Peters, J. (2011), “Relative entropy inverse reinforcement learning”, Proceedings of Fourteenth International Conference on Artificial Intelligence and Statistics (AISTATS 2011), JMLR W&CP, Vol. 15, pp. 182‐9.

Chandramohan, S., Geist, M., Lefevre, F. and Pietquin, O. (2011), “User simulation in dialogue systems using inverse reinforcement learning”, Proceedings of the 12th Annual Conference of the International Speech Communication Association (Interspeech 2011), Florence, Italy, August.

Choi, J. and Kim, K. (2009), “Inverse reinforcement learning in partially observable environments”, Proceedings of the 21st International Joint Conference on Artificial Intelligence (IJCAI), pp. 1028‐33.

10.1109/IROS.2010.5649718

10.1145/1390156.1390175

10.1109/WI-IAT.2010.142

Dimitrakakis, C. and Rothkopf, C. (2011), “Bayesian multitask inverse reinforcement learning”, paper presented at the 9th European Workshop on Reinforcement Learning (EWRL 2011), Athens, Greece, 9‐11 September.

10.1086/257308

Grollman, D. and Billard, A. (2011), “Donut as I do: learning from failed demonstrations”, IEEE International Conference on Robotics and Automation, Shanghai, 9‐13 May, pp. 9‐13.

10.1109/ROBOT.1996.509162

10.1109/ROBOT.2010.5509772

Heskes, T. (1998), “Solving a huge number of similar tasks: a combination of multi‐task learning and a hierarchical Bayesian approach”, Proceedings of the 15th International Conference on Machine Learning (ICML'98), pp. 233‐41.

10.1103/PhysRev.108.171

Kaelbling, L., Littman, M. and Moore, A. (1996), “Reinforcement learning: a survey”, Journal of Artificial Intelligence Research, Vol. 4, pp. 237‐85.

10.1109/MRA.2010.936952

10.1007/978-3-642-04174-7_3

Mason, M. and Lopes, M. (2011), “Robot self‐initiative and personalization by learning through repeated interactions”, Proceedings of the 6th International Conference on Human‐robot Interaction, pp. 433‐40.

Murphy, K. (2000), “A survey of POMDP solution techniques”, Environment, Vol. 2, p. X3.

Neu, G. and Szepesvári, C. (2007), “Apprenticeship learning using inverse reinforcement learning and gradient methods”, Proceedings of the Twenty‐third Annual Conference on Uncertainty in Artificial Intelligence (UAI‐07), pp. 295‐302.

Ng, A. and Russell, S. (2000), “Algorithms for inverse reinforcement learning”, Proceedings of the Seventeenth International Conference on Machine Learning, pp. 663‐70.

10.1162/neco.1991.3.1.88

Qiao, Q. and Beling, P. (2011), “Inverse reinforcement learning with Gaussian process”, American Control Conference (ACC), pp. 113‐18.

Ramachandran, D. and Amir, E. (2007), “Bayesian inverse reinforcement learning”, Proceedings of the 20th International Joint Conference on Artificial Intelligence.

Ratliff, N., Bagnell, J. and Zinkevich, M. (2006), “Maximum margin planning”, Proceedings of the 23rd International Conference on Machine Learning, pp. 729‐36.

10.1007/s10514-009-9121-3

10.1007/978-1-4612-0919-5_26

Rothkopf, C. and Dimitrakakis, C. (2011), “Preference elicitation and inverse reinforcement learning”, Proceedings of the 22nd European Conference on Machine Learning (ECML), Part III, LNAI 6913, pp. 34‐48.

10.1145/279943.279964

10.1016/S1364-6613(99)01327-3

10.1631/jzus.C0910486

Silva, V., Costa, A. and Lima, P. (2006), “Inverse reinforcement learning with evaluation”, IEEE International Conference on Robotics and Automation (ICRA06), Orlando, FL, USA, pp. 4246‐51.

10.1177/0278364910369715

10.1007/978-3-642-19457-3_26

10.1109/ROBOT.2010.5509832

10.1109/ROBOT.2010.5509336

Zhang, H. and Parkes, D. (2008), “Enabling environment design via active indirect elicitation”, Proceedings of the Workshop on Preference Handling, Chicago, IL.

Ziebart, B., Maas, A., Bagnell, J. and Dey, A. (2008), “Maximum entropy inverse reinforcement learning”, Proceedings of the 23rd AAAI Conference on Artificial Intelligence, pp. 1433‐8.