Convergence of probability measures and Markov decision models with incomplete information

Proceedings of the Steklov Institute of Mathematics - Volume 287, Issue 1 - Pages 96-117 - 2014
Eugene A. Feinberg (1), Pavlo O. Kasyanov (2), Michael Z. Zgurovsky (2)
(1) Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, USA
(2) Institute for Applied System Analysis, National Technical University of Ukraine “Kyiv Polytechnic Institute”, Kyiv, Ukraine

References

M. Aoki, “Optimal control of partially observable Markovian systems,” J. Franklin Inst. 280(5), 367–386 (1965).

N. Bäuerle and U. Rieder, Markov Decision Processes with Applications to Finance (Springer, Berlin, 2011).

A. Bensoussan, Stochastic Control of Partially Observable Systems (Cambridge Univ. Press, Cambridge, 1992).

D. P. Bertsekas and S. E. Shreve, Stochastic Optimal Control: The Discrete Time Case (Acad. Press, New York, 1978).

P. Billingsley, Convergence of Probability Measures (J. Wiley & Sons, New York, 1968).

V. I. Bogachev, Measure Theory (Springer, Berlin, 2007), Vol. 2.

D. L. Cohn, Measure Theory (Springer, New York, 2013).

E. B. Dynkin, “Controlled random sequences,” Teor. Veroyatn. Primen. 10(1), 3–18 (1965) [Theory Probab. Appl. 10, 1–14 (1965)].

E. B. Dynkin and A. A. Yushkevich, Controlled Markov Processes (Springer, New York, 1979).

E. A. Feinberg, P. O. Kasyanov, and M. Voorneveld, “Berge’s maximum theorem for noncompact image sets,” J. Math. Anal. Appl. 413(2), 1040–1046 (2014).

E. A. Feinberg, P. O. Kasyanov, and N. V. Zadoianchuk, “Average cost Markov decision processes with weakly continuous transition probabilities,” Math. Oper. Res. 37(4), 591–607 (2012).

E. A. Feinberg, P. O. Kasyanov, and N. V. Zadoianchuk, “Berge’s theorem for noncompact image sets,” J. Math. Anal. Appl. 397(1), 255–259 (2013).

E. A. Feinberg, P. O. Kasyanov, and M. Z. Zgurovsky, “Optimality conditions for total-cost partially observable Markov decision processes,” in Proc. 52nd IEEE Conf. on Decision and Control and Eur. Control Conf., Florence, Italy, 2013 (IEEE, 2013), pp. 5716–5721.

E. A. Feinberg, P. O. Kasyanov, and M. Z. Zgurovsky, “Partially observable total-cost Markov decision processes with weakly continuous transition probabilities,” arXiv: 1401.2168 [math.OC].

E. A. Feinberg, P. O. Kasyanov, and M. Z. Zgurovsky, “Optimality conditions for partially observable Markov decision processes,” in Continuous and Distributed Systems: Theory and Applications, Ed. by M. Z. Zgurovsky and V. A. Sadovnichiy (Springer, Cham, 2014), pp. 251–264.

O. Hernández-Lerma, Adaptive Markov Control Processes (Springer, New York, 1989).

O. Hernández-Lerma and J. B. Lasserre, Discrete-Time Markov Control Processes: Basic Optimality Criteria (Springer, New York, 1996).

K. Hinderer, Foundations of Non-stationary Dynamic Programming with Discrete Time Parameter (Springer, Berlin, 1970).

J. Jacod and A. N. Shiryaev, Limit Theorems for Stochastic Processes, 2nd ed. (Springer, Berlin, 2003).

Yu. M. Kabanov, R. Sh. Liptser, and A. N. Shiryaev, “Some limit theorems for simple point processes (a martingale approach),” Stochastics 3, 203–216 (1980).

R. Sh. Liptser and A. N. Shiryaev, Statistics of Random Processes: Nonlinear Filtering and Related Problems (Nauka, Moscow, 1974). Engl. transl.: Statistics of Random Processes, Vol. 1: General Theory, Vol. 2: Applications (Springer, New York, 1977, 1978).

D. Rhenius, “Incomplete information in Markovian decision models,” Ann. Stat. 2(6), 1327–1334 (1974).

U. Rieder, “Bayesian dynamic programming,” Adv. Appl. Probab. 7(2), 330–348 (1975).

Y. Sawaragi and T. Yoshikawa, “Discrete-time Markovian decision processes with incomplete state observation,” Ann. Math. Stat. 41(1), 78–86 (1970).

A. N. Shiryaev, “On the theory of decision functions and control by observation from incomplete data,” in Trans. Third Prague Conf. Inf. Theory, Stat. Decis. Funct., Random Processes, Liblice, 1962 (Publ. House Czech. Acad. Sci., Prague, 1964), pp. 657–681. Engl. transl.: Sel. Transl. Math. Stat. Probab. 6, 162–188 (1966).

A. N. Shiryaev, “Some new results in the theory of controlled random processes,” in Trans. Fourth Prague Conf. Inf. Theory, Stat. Decis. Funct., Random Processes, Prague, 1965 (Academia, Prague, 1967), pp. 131–203. Engl. transl.: Sel. Transl. Math. Stat. Probab. 8, 49–130 (1970).

A. N. Shiryaev, Probability, 2nd ed. (Springer, New York, 1996).

R. D. Smallwood and E. J. Sondik, “The optimal control of partially observable Markov processes over a finite horizon,” Oper. Res. 21(5), 1071–1088 (1973).

E. J. Sondik, “The optimal control of partially observable Markov processes over the infinite horizon: Discounted costs,” Oper. Res. 26(2), 282–304 (1978).

C. Striebel, Optimal Control of Discrete Time Stochastic Systems (Springer, Berlin, 1975).

A. A. Yushkevich, “Reduction of a controlled Markov model with incomplete data to a problem with complete information in the case of Borel state and control space,” Teor. Veroyatn. Primen. 21(1), 152–157 (1976) [Theory Probab. Appl. 21, 153–158 (1976)].