Autonomous maneuver strategy of swarm air combat based on DDPG
Abstract
Unmanned aerial vehicles (UAVs) have proven increasingly important in air combat, where intelligent UAV swarms can tackle tasks of high complexity and dynamics. The key to empowering UAVs with such capability is autonomous maneuver decision making. In this paper, an autonomous maneuver strategy for UAV swarms in beyond-visual-range air combat based on reinforcement learning is proposed. First, based on the air combat process and the constraints of the swarm, the UAV motion model and the many-to-one air combat model are established. Second, a two-stage maneuver strategy based on air combat principles is designed, comprising inter-vehicle collaboration and target-vehicle confrontation. Then, a swarm air combat algorithm based on the deep deterministic policy gradient (DDPG) is proposed for online strategy training. Finally, the effectiveness of the proposed algorithm is validated by multi-scenario simulations. The results show that the algorithm is suitable for UAV swarms of different scales.
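As a rough illustration of the DDPG mechanics underlying the proposed training scheme (a minimal sketch, not the authors' implementation; the linear policy, state dimension, and hyperparameters below are illustrative assumptions), the two signature components of DDPG can be sketched as follows: deterministic action selection with added exploration noise, and the Polyak soft update of target-network parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def soft_update(target_params, online_params, tau=0.005):
    """Polyak-average online weights into the target network (DDPG-style)."""
    return [tau * w + (1.0 - tau) * t for w, t in zip(online_params, target_params)]

def select_action(policy, state, noise_scale=0.1, action_low=-1.0, action_high=1.0):
    """Deterministic policy output plus Gaussian exploration noise, clipped to bounds."""
    a = policy(state)
    a = a + noise_scale * rng.standard_normal(a.shape)
    return np.clip(a, action_low, action_high)

# Illustrative linear "policy" mapping a 4-D combat state to a 2-D maneuver command.
W = rng.standard_normal((2, 4)) * 0.1
policy = lambda s: np.tanh(W @ s)

state = np.array([1.0, 0.5, -0.3, 0.2])   # hypothetical relative-situation state
action = select_action(policy, state)

# One soft update pulls an (initially zero) target network toward the online weights.
target = soft_update([np.zeros_like(W)], [W], tau=0.5)
```

In a full DDPG loop these pieces sit inside an actor-critic update with a replay buffer; the soft update keeps the bootstrapped critic targets slowly moving, which is what stabilizes training for continuous maneuver commands.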