Data-based Optimal Control for Discrete-time Zero-sum Games of 2-D Systems Using Adaptive Critic Designs

Acta Automatica Sinica - Volume 35 - Pages 682-692 - 2009
Qing-Lai WEI (1), Hua-Guang ZHANG (2), Li-Li CUI (2)
(1) The Key Laboratory of Complex Systems and Intelligence Science, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, P. R. China
(2) The School of Information Science and Engineering, Northeastern University, Shenyang 110004, P. R. China
