A Counterexample on Sample-Path Optimality in Stable Markov Decision Chains with the Average Reward Criterion

Journal of Optimization Theory and Applications - Volume 163 - Pages 674-684 - 2013
Rolando Cavazos-Cadena1, Raúl Montes-de-Oca2, Karel Sladký3
1Departamento de Estadística y Cálculo, Universidad Autónoma Agraria Antonio Narro, Buenavista, Mexico
2Departamento de Matemáticas, Universidad Autónoma Metropolitana, México, Mexico
3Institute of Information Theory and Automation, Praha 8, Czech Republic

Abstract

This note deals with Markov decision chains evolving on a denumerable state space. Under standard continuity-compactness requirements, an explicit example is provided to show that, with respect to a strong sample-path average reward criterion, the Lyapunov function condition does not ensure the existence of an optimal stationary policy.
