Deep Reinforcement Learning for Autonomous Driving: A Survey
Abstract
Keywords
References
Sutton, 2018, Reinforcement Learning: An Introduction
Team, 2020, Dimensions Publication Trends
Russell, 2009, Artificial Intelligence: A Modern Approach
Hong, 2018, Diversity-driven exploration strategy for deep reinforcement learning, Advances in Neural Information Processing Systems, 31, 10489
van Otterlo, 2012, Reinforcement Learning: State-of-the-Art
Watkins, 1989, Learning from delayed rewards
Silver, Deterministic policy gradient algorithms, Proc. ICML, 387
Schulman, Trust region policy optimization, Proc. Int. Conf. Mach. Learn., 1889
Schulman, 2017, Proximal policy optimization algorithms, arXiv:1707.06347
Lillicrap, Continuous control with deep reinforcement learning, Proc. 4th Int. Conf. Learn. Represent. (ICLR), 1
Mnih, Asynchronous methods for deep reinforcement learning, Proc. Int. Conf. Mach. Learn., 1928
Haarnoja, Reinforcement learning with deep energy-based policies, Proc. 34th Int. Conf. Mach. Learn. (JMLR), 70, 1352
Haarnoja, 2018, Soft actor-critic algorithms and applications, arXiv:1812.05905
Rummery, 1994, On-line Q-learning using connectionist systems
Sutton, 2018, Reinforcement Learning: An Introduction
Bellman, 1957, Dynamic Programming
Schaul, 2015, Prioritized experience replay, arXiv:1511.05952
Wang, 2015, Dueling network architectures for deep reinforcement learning, arXiv:1511.06581
Hausknecht, 2015, Deep recurrent Q-learning for partially observable MDPs, arXiv:1507.06527
Skinner, 1938, The Behavior of Organisms: An Experimental Analysis
Randløv, Learning to drive a bicycle using reinforcement learning and shaping, Proc. 15th Int. Conf. Mach. Learn., 463
Ng, Policy invariance under reward transformations: Theory and application to reward shaping, Proc. 16th Int. Conf. Mach. Learn., 278
Devlin, Theoretical considerations of potential-based reward shaping for multi-agent systems, Proc. 10th Int. Conf. Auto. Agents Multiagent Syst. (AAMAS), 225
Mannion, A theoretical and empirical analysis of reward transformations in multi-objective stochastic games, Proc. 16th Int. Conf. Auto. Agents Multiagent Syst. (AAMAS), 1
Mannion, Multi-objective dynamic dispatch optimisation using multi-agent reinforcement learning, Proc. 15th Int. Conf. Auto. Agents Multiagent Syst. (AAMAS), 1345
Mason, Applying multi-agent reinforcement learning to watershed management, Proc. Adapt. Learn. Agents Workshop (AAMAS), 1
Pareto, 1906, Manual of Political Economy
Raffin, 2019, Decoupling feature extraction from policy learning: Assessing benefits of state representation learning in goal based robotics, arXiv:1901.08651
Kang, Policy optimization with demonstrations, Proc. Int. Conf. Mach. Learn., 2474
Ibrahim, End-to-end framework for fast learning asynchronous agents, Proc. 32nd Conf. Neural Inf. Process. Syst., Imitation Learn. Challenges Robot. Workshop (NeurIPS)
Ng, Algorithms for inverse reinforcement learning, Proc. ICML, 2
Ho, Generative adversarial imitation learning, Proc. Adv. Neural Inf. Process. Syst., 4565
Leurent, 2018, A survey of state-action representations for autonomous driving
Dosovitskiy, CARLA: An open urban driving simulator, Proc. 1st Annu. Conf. Robot Learn., 1
Li, Urban driving with multi-objective deep reinforcement learning, Proc. 18th Int. Conf. Auto. Agents MultiAgent Syst., 359
Kardell, 2017, Autonomous vehicle control via deep reinforcement learning
Sallab, End-to-end deep reinforcement learning for lane keeping assist, Proc. MLITS, NIPS Workshop, 2, 1
Keselman, 2018, Reinforcement learning with A* and a deep heuristic, arXiv:1811.07745
Zhan, 2019, INTERACTION dataset: An INTERnational, adversarial and cooperative moTION dataset in interactive driving scenarios with semantic maps, arXiv:1910.03088
Watter, Embed to control: A locally linear latent dynamics model for control from raw images, Proc. Adv. Neural Inf. Process. Syst., 2746
Chiappa, Recurrent environment simulators, Proc. 5th Int. Conf. Learn. Represent., ICLR, 1
Mania, Simple random search of static linear policies is competitive for reinforcement learning, Proc. 31st Annu. Conf. Adv. Neural Inf. Process. Syst. (NeurIPS), 1800
Wymann, 2000, TORCS, the Open Racing Car Simulator
Quiter, 2018, Deepdrive/Deepdrive: 2.0
2019, Drive Constellation Now Available
Santara, 2019, Multi-Agent Autonomous Driving Simulator Built on Top of TORCS
Wu, 2017, Flow: Architecture and benchmarking for reinforcement learning in traffic control, arXiv:1710.05465
Leurent, 2019, A Collection of Environments for Autonomous Driving and Tactical Decision-Making Tasks
Ros, 2019, CARLA Autonomous Driving Challenge
Najm, 2007, Pre-crash scenario typology for crash avoidance research
Pomerleau, ALVINN: An autonomous land vehicle in a neural network, Proc. Adv. Neural Inf. Process. Syst., 1
Bojarski, End to end learning for self-driving cars, Proc. NIPS Deep Learn. Symp., 1
Bojarski, 2017, Explaining how a deep neural network trained with End-to-End learning steers a car, arXiv:1704.07911
Sharifzadeh, Learning to drive using inverse reinforcement learning and deep Q-networks, Proc. NIPS Workshops, 1
Wang, Sample efficient actor-critic with experience replay, Proc. 5th Int. Conf. Learn. Represent., ICLR, 1
Liaw, 2017, Composing meta-policies for autonomous driving using hierarchical deep reinforcement learning, arXiv:1711.01503
Taylor, 2009, Transfer learning for reinforcement learning domains: A survey, J. Mach. Learn. Res., 10, 1633
Isele, Transferring autonomous driving knowledge on simulated and real intersections, Proc. Lifelong Learn., Reinforcement Learn. Approach, ICML Workshop (NeurIPS)
Wang, Learning to reinforcement learn, Proc. Complete CogSci
Duan, 2016, RL²: Fast reinforcement learning via slow reinforcement learning, arXiv:1611.02779
Finn, Model-agnostic meta-learning for fast adaptation of deep networks, Proc. 34th Int. Conf. Mach. Learn., 70, 1126
Nichol, 2018, On first-order meta-learning algorithms, arXiv:1803.02999
Al-Shedivat, Continuous adaptation via meta-learning in nonstationary and competitive environments, Proc. 6th Int. Conf. Learn. Represent., ICLR, 1
Ha, Recurrent world models facilitate policy evolution, Proc. Adv. Neural Inf. Process. Syst., 1
Ross, Efficient reductions for imitation learning, Proc. 13th Int. Conf. Artif. Intell. Statist., 661
Chentanez, Intrinsically motivated reinforcement learning, Proc. Adv. Neural Inf. Process. Syst., 1281
Burda, 2018, Large-scale study of curiosity-driven learning, arXiv:1808.04355
Shalev-Shwartz, 2016, Safe, multi-agent, reinforcement learning for autonomous driving, arXiv:1610.03295
Xiong, 2016, Combining deep reinforcement learning and safety based control for autonomous driving, arXiv:1612.00147
García, 2015, A comprehensive survey on safe reinforcement learning, J. Mach. Learn. Res., 16, 1437
Dhariwal, 2017, OpenAI Baselines
Juliani, 2018, Unity: A general platform for intelligent agents, arXiv:1809.02627
Guadarrama, 2018, TF-Agents: A Library for Reinforcement Learning in Tensorflow
Stooke, 2019, Rlpyt: A research code base for deep reinforcement learning in PyTorch, arXiv:1909.01500
Osband, 2019, Behaviour suite for reinforcement learning, arXiv:1908.03568