Scalable lifelong reinforcement learning

Pattern Recognition - Volume 72 - Pages 407-418 - 2017

Yusen Zhan¹, Haitham Bou Ammar², Matthew E. Taylor¹

¹The School of Electrical Engineering and Computer Science, Washington State University, Pullman, WA 99163, USA

²Prowler.io, Cambridge, United Kingdom

