On maximizing probabilities for over-performing a target for Markov decision processes

Springer Science and Business Media LLC - Pages 1-29 - 2023

Tanhao Huang¹, Yanan Dai¹, Jinwen Chen¹

¹Department of Mathematics, Tsinghua University, Beijing, China

Abstract

This paper studies the dual relation between risk-sensitive control and the large deviation control problem of maximizing the probability of out-performing a target, for Markov Decision Processes. To derive the desired duality, we apply a non-linear extension of the Krein-Rutman Theorem to characterize the optimal risk-sensitive value and prove that a stationary, deterministic optimal policy exists. The right-hand derivative of this value function is used to characterize the specific targets for which the duality holds. It is proved that the optimal policy for the “out-performing” probability can be approximated by the optimal policy for the risk-sensitive control problem. The range of the right-hand and left-hand derivatives of the optimal risk-sensitive value function plays an important role. Some essential differences between these two types of optimal control problems are presented.
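The two objects named in the abstract can be illustrated on a toy finite MDP. The sketch below is not the paper's construction. It assumes the standard form used in this literature: the optimal risk-sensitive value Λ(θ) solves the non-linear eigenvalue problem exp(Λ(θ)) h(x) = max_a exp(θ r(x,a)) Σ_y P(y|x,a) h(y), and the decay rate of the out-performance probability at a target c is compared with the Legendre-type transform sup_θ [θc − Λ(θ)]. The function names, the transition array `P`, the reward array `r`, and the grid of θ values are all hypothetical, and strictly positive transition probabilities are assumed so that the normalized iteration converges.

```python
import math

def risk_sensitive_value(P, r, theta, iters=5000, tol=1e-12):
    """Approximate Lambda(theta) and a maximizing stationary deterministic policy.

    P[a][x][y] is the probability of moving x -> y under action a,
    r[a][x] is the one-step reward (both hypothetical inputs).
    Uses a normalized power iteration on the map
    h -> max_a exp(theta * r(x, a)) * sum_y P(y | x, a) * h(y).
    """
    n_actions, n_states = len(P), len(P[0])
    h = [1.0] * n_states
    lam, policy = 0.0, [0] * n_states
    for _ in range(iters):
        new_h, policy = [], []
        for x in range(n_states):
            vals = [
                math.exp(theta * r[a][x])
                * sum(P[a][x][y] * h[y] for y in range(n_states))
                for a in range(n_actions)
            ]
            best = max(range(n_actions), key=vals.__getitem__)
            policy.append(best)
            new_h.append(vals[best])
        scale = max(new_h)                  # current eigenvalue estimate
        new_h = [v / scale for v in new_h]  # keep max(h) == 1
        new_lam = math.log(scale)
        done = (abs(new_lam - lam) < tol
                and max(abs(u - v) for u, v in zip(new_h, h)) < tol)
        h, lam = new_h, new_lam
        if done:
            break
    return lam, policy

def rate(P, r, c, thetas):
    """Grid approximation of sup over theta of [theta * c - Lambda(theta)]."""
    return max(t * c - risk_sensitive_value(P, r, t)[0] for t in thetas)

# Hypothetical two-state, two-action example with positive transitions.
P = [
    [[0.9, 0.1], [0.5, 0.5]],  # action 0
    [[0.2, 0.8], [0.6, 0.4]],  # action 1
]
r = [
    [1.0, 0.0],  # rewards under action 0
    [0.5, 2.0],  # rewards under action 1
]

if __name__ == "__main__":
    lam, policy = risk_sensitive_value(P, r, theta=1.0)
    print("Lambda(1.0) =", lam, "policy =", policy)
    print("rate at c = 1.5:", rate(P, r, 1.5, [0.1 * k for k in range(31)]))
```

Restricting `thetas` to non-negative values matches the out-performance (upper-tail) setting. Which targets c actually admit the duality is the subject of the paper and is not checked by this sketch.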

