Adaptive dynamic programming for online solution of a zero-sum differential game
Abstract
Keywords
References
T. Basar, P. Bernhard. H∞ Optimal Control and Related Minimax Design Problems. Boston: Birkhäuser, 1995.
T. Basar, G. J. Olsder. Dynamic Noncooperative Game Theory (Classics in Applied Mathematics 23). 2nd ed. Philadelphia: SIAM, 1999.
J. Doyle, K. Glover, P. Khargonekar, et al. State-space solutions to standard H2 and H∞ control problems. IEEE Transactions on Automatic Control, 1989, 34(8): 831–847.
A. A. Stoorvogel. The H∞ Control Problem: A State Space Approach. New York: Prentice Hall, 1992.
K. Zhou, P. P. Khargonekar. An algebraic Riccati equation approach to H∞ optimization. Systems & Control Letters, 1988, 11(2): 85–91.
L. Cherfi, H. Abou-Kandil, H. Bourles. Iterative method for general algebraic Riccati equation. Proceedings of International Conference on Automatic Control and System Engineering, Cairo, Egypt, 2005: 85–88.
T. Damm. Rational Matrix Equations in Stochastic Control. Berlin: Springer-Verlag, 2004.
A. Lanzon, Y. Feng, B. D. O. Anderson, et al. Computing the positive stabilizing solution to algebraic Riccati equations with an indefinite quadratic term via a recursive method. IEEE Transactions on Automatic Control, 2008, 53(10): 2280–2291.
M. Abu-Khalaf, F. L. Lewis, J. Huang. Policy iterations and the Hamilton-Jacobi-Isaacs equation for H∞ state feedback control with input saturation. IEEE Transactions on Automatic Control, 2006, 51(12): 1989–1995.
Y. Feng, B. D. O. Anderson, M. Rotkowitz. A game theoretic algorithm to compute local stabilizing solutions to HJBI equations in nonlinear H∞ control. Automatica, 2009, 45(4): 881–888.
A. J. van der Schaft. L2-gain analysis of nonlinear systems and nonlinear state feedback H∞ control. IEEE Transactions on Automatic Control, 1992, 37(6): 770–784.
R. Sutton. Learning to predict by the method of temporal differences. Machine Learning, 1988, 3(1): 9–44.
P. J. Werbos. Approximate dynamic programming for real-time control and neural modeling. D. White, D. Sofge, eds. Handbook of Intelligent Control: Neural, Fuzzy, and Adaptive Approaches. New York: Van Nostrand, 1992: 493–525.
C. Watkins. Learning from Delayed Rewards. Ph.D. thesis. Cambridge, U.K.: Cambridge University, 1989.
Q. Wei, H. Zhang. A new approach to solve a class of continuous-time nonlinear quadratic zero-sum game using ADP. Proceedings of IEEE International Conference on Networking, Sensing and Control, New York: IEEE, 2008: 507–512.
D. Vrabie, M. Abu-Khalaf, F. L. Lewis, et al. Continuous-time ADP for linear systems with partially unknown dynamics. Proceedings of the IEEE Symposium on Approximate Dynamic Programming and Reinforcement Learning (ADPRL), New York: IEEE, 2007: 247–253.
D. Vrabie, O. Pastravanu, F. L. Lewis, et al. Adaptive optimal control for continuous-time linear systems based on policy iteration. Automatica, 2009, 45(2): 477–484.
F. L. Lewis, V. L. Syrmos. Optimal Control. New York: John Wiley & Sons, 1995.
J. W. Brewer. Kronecker products and matrix calculus in system theory. IEEE Transactions on Circuits and Systems, 1978, 25(9): 772–781.