Learning control of failure avoidance problems with known analytical form of cost function

Derong Liu1
1Department of Electrical and Computer Engineering, University of Illinois, Chicago, IL, USA

Abstract

We study adaptive critic designs for a class of failure avoidance control problems. We characterize such problems by a local cost function that is zero throughout a trial except at the final time step, when a failure occurs. For this class we derive an analytical form of the overall cost function, defined as the infinite sum of the local cost function over time. We demonstrate that, after learning, the outputs of the critic network closely match the analytically derived cost function.
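As a rough illustration of why such a cost structure admits a closed form, the sketch below sums a local cost that is zero everywhere except at the failure step. The abstract gives no formulas, so everything specific here is an assumption made for illustration: the discount factor `gamma`, a failure cost of 1, the convention that the sum starts at the current step, and the function names. The paper's own definitions may differ.

```python
def cost_to_go(trial_length, gamma=0.95, failure_cost=1.0):
    """Overall cost J(t) for t = 0..trial_length-1.

    Assumed setup (not taken from the paper): the local cost U(t) is zero
    except at the final step of the trial, where it equals failure_cost,
    and no further cost accrues once the trial has ended.
    """
    local_cost = [0.0] * trial_length
    local_cost[-1] = failure_cost  # failure occurs at the last step

    # Backward recursion J(t) = U(t) + gamma * J(t + 1), with J = 0 after the trial.
    J = [0.0] * trial_length
    running = 0.0
    for t in reversed(range(trial_length)):
        running = local_cost[t] + gamma * running
        J[t] = running
    return J


def closed_form(trial_length, gamma=0.95, failure_cost=1.0):
    """Same quantity written analytically: only the failure term survives the sum."""
    return [failure_cost * gamma ** (trial_length - 1 - t) for t in range(trial_length)]


if __name__ == "__main__":
    numeric = cost_to_go(10)
    analytic = closed_form(10)
    for t, (a, b) in enumerate(zip(numeric, analytic)):
        print(f"t={t}  summed={a:.6f}  closed form={b:.6f}")
```

Under these assumptions the summed cost and the closed-form expression agree at every step, and the cost grows as the failure approaches. A critic network trained on such trials would be expected to reproduce a curve of this shape.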

Keywords

#Failure analysis #Cost function #Neural networks #Programmable control #Adaptive control #Dynamic programming #Optimal control #Signal design #Nonlinear control systems #Control systems
