Deep Reinforcement Learning Controller for 3D Path Following and Collision Avoidance by Autonomous Underwater Vehicles

Simen Theie Havenstrøm1, Adil Rasheed1,2, Omer San3
1Department of Engineering Cybernetics, Norwegian University of Science and Technology, Trondheim, Norway
2Mathematics and Cybernetics, SINTEF Digital, Trondheim, Norway
3School of Mechanical and Aerospace Engineering, Oklahoma State University, Stillwater, OK, United States

Abstract

Control theory provides engineers with a multitude of tools to design controllers that manipulate the closed-loop behavior and stability of dynamical systems. These methods rely heavily on insights into the mathematical model governing the physical system. However, in complex systems, such as autonomous underwater vehicles pursuing the dual objective of path following and collision avoidance, decision making becomes nontrivial. We propose a solution using state-of-the-art Deep Reinforcement Learning (DRL) techniques to develop autonomous agents capable of achieving this hybrid objective without a priori knowledge of the goal or the environment. Our results demonstrate the viability of DRL for path following and collision avoidance, a step toward human-level decision making in autonomous vehicle systems operating in extreme obstacle configurations.
