Adaptive low-level control of autonomous underwater vehicles using deep reinforcement learning

Robotics and Autonomous Systems - Volume 107 - Pages 71-86 - 2018
Ignacio Carlucho1,2, Mariano De Paula1, Sen Wang2, Yvan Pétillot2, Gerardo G. Acosta1
1INTELYMEC group, Centro de Investigaciones en Física e Ingeniería del Centro CIFICEN – UNICEN – CICpBA – CONICET, Argentina
2School of Engineering & Physical Sciences, Heriot-Watt University, EH14 4AS Edinburgh, UK
