Convergence of probability measures and Markov decision models with incomplete information
References
M. Aoki, “Optimal control of partially observable Markovian systems,” J. Franklin Inst. 280(5), 367–386 (1965).
N. Bäuerle and U. Rieder, Markov Decision Processes with Applications to Finance (Springer, Berlin, 2011).
A. Bensoussan, Stochastic Control of Partially Observable Systems (Cambridge Univ. Press, Cambridge, 1992).
D. P. Bertsekas and S. E. Shreve, Stochastic Optimal Control: The Discrete Time Case (Acad. Press, New York, 1978).
P. Billingsley, Convergence of Probability Measures (J. Wiley & Sons, New York, 1968).
E. B. Dynkin, “Controlled random sequences,” Teor. Veroyatn. Primen. 10(1), 3–18 (1965) [Theory Probab. Appl. 10, 1–14 (1965)].
E. A. Feinberg, P. O. Kasyanov, and M. Voorneveld, “Berge’s maximum theorem for noncompact image sets,” J. Math. Anal. Appl. 413(2), 1040–1046 (2014).
E. A. Feinberg, P. O. Kasyanov, and N. V. Zadoianchuk, “Average cost Markov decision processes with weakly continuous transition probabilities,” Math. Oper. Res. 37(4), 591–607 (2012).
E. A. Feinberg, P. O. Kasyanov, and N. V. Zadoianchuk, “Berge’s theorem for noncompact image sets,” J. Math. Anal. Appl. 397(1), 255–259 (2013).
E. A. Feinberg, P. O. Kasyanov, and M. Z. Zgurovsky, “Optimality conditions for total-cost partially observable Markov decision processes,” in Proc. 52nd IEEE Conf. on Decision and Control and Eur. Control Conf., Florence, Italy, 2013 (IEEE, 2013), pp. 5716–5721.
E. A. Feinberg, P. O. Kasyanov, and M. Z. Zgurovsky, “Partially observable total-cost Markov decision processes with weakly continuous transition probabilities,” arXiv: 1401.2168 [math.OC].
E. A. Feinberg, P. O. Kasyanov, and M. Z. Zgurovsky, “Optimality conditions for partially observable Markov decision processes,” in Continuous and Distributed Systems: Theory and Applications, Ed. by M. Z. Zgurovsky and V. A. Sadovnichiy (Springer, Cham, 2014), pp. 251–264.
O. Hernández-Lerma and J. B. Lasserre, Discrete-Time Markov Control Processes: Basic Optimality Criteria (Springer, New York, 1996).
K. Hinderer, Foundations of Non-stationary Dynamic Programming with Discrete Time Parameter (Springer, Berlin, 1970).
J. Jacod and A. N. Shiryaev, Limit Theorems for Stochastic Processes, 2nd ed. (Springer, Berlin, 2003).
Yu. M. Kabanov, R. Sh. Liptser, and A. N. Shiryaev, “Some limit theorems for simple point processes (a martingale approach),” Stochastics 3, 203–216 (1980).
R. Sh. Liptser and A. N. Shiryaev, Statistics of Random Processes: Nonlinear Filtering and Related Problems (Nauka, Moscow, 1974). Engl. transl.: Statistics of Random Processes, Vol. 1: General Theory, Vol. 2: Applications (Springer, New York, 1977, 1978).
D. Rhenius, “Incomplete information in Markovian decision models,” Ann. Stat. 2(6), 1327–1334 (1974).
Y. Sawaragi and T. Yoshikawa, “Discrete-time Markovian decision processes with incomplete state observation,” Ann. Math. Stat. 41(1), 78–86 (1970).
A. N. Shiryaev, “On the theory of decision functions and control by observation from incomplete data,” in Trans. Third Prague Conf. Inf. Theory, Stat. Decis. Funct., Random Processes, Liblice, 1962 (Publ. House Czech. Acad. Sci., Prague, 1964), pp. 657–681. Engl. transl.: Sel. Transl. Math. Stat. Probab. 6, 162–188 (1966).
A. N. Shiryaev, “Some new results in the theory of controlled random processes,” in Trans. Fourth Prague Conf. Inf. Theory, Stat. Decis. Funct., Random Processes, Prague, 1965 (Academia, Prague, 1967), pp. 131–203. Engl. transl.: Sel. Transl. Math. Stat. Probab. 8, 49–130 (1970).
R. D. Smallwood and E. J. Sondik, “The optimal control of partially observable Markov processes over a finite horizon,” Oper. Res. 21(5), 1071–1088 (1973).
E. J. Sondik, “The optimal control of partially observable Markov processes over the infinite horizon: Discounted costs,” Oper. Res. 26(2), 282–304 (1978).
A. A. Yushkevich, “Reduction of a controlled Markov model with incomplete data to a problem with complete information in the case of Borel state and control space,” Teor. Veroyatn. Primen. 21(1), 152–157 (1976) [Theory Probab. Appl. 21, 153–158 (1976)].