Robust Dynamic Programming

Mathematics of Operations Research - Tập 30 Số 2 - Trang 257-280 - 2005

Garud Iyengar¹

¹IEOR Department, Columbia University, New York, New York 10027

Tóm tắt

In this paper we propose a robust formulation for discrete time dynamic programming (DP). The objective of the robust formulation is to systematically mitigate the sensitivity of the DP optimal policy to ambiguity in the underlying transition probabilities. The ambiguity is modeled by associating a set of conditional measures with each state-action pair. Consequently, in the robust formulation each policy has a set of measures associated with it. We prove that when this set of measures has a certain “rectangularity” property, all of the main results for finite and infinite horizon DP extend to natural robust counterparts. We discuss techniques from Nilim and El Ghaoui [17] for constructing suitable sets of conditional measures that allow one to efficiently solve for the optimal robust policy. We also show that robust DP is equivalent to stochastic zero-sum games with perfect information.

Từ khóa

Tài liệu tham khảo

10.1007/978-1-4612-1336-9_17

10.1137/S1052623495291951

10.1287/moor.23.4.769

Bertsekas D. P., 1996, Neuro-Dynamic Programming

10.1111/j.1467-9965.1991.tb00002.x

10.1002/0471200611

10.1287/opre.51.6.850.24925

10.2307/1884324

10.1016/S0022-0531(03)00097-8

10.1016/0304-4068(89)90018-9

Gillette D., 1957, Contributions to the Theory of Games, 3, 179

10.1287/moor.28.1.1.14260

10.1257/aer.91.2.60

Littman M., 1994, Animals to Animats: SAB ’94, 238, 10.7551/mitpress/3117.003.0041

Nilim A., 2002, Oper. Res.

Nilim A., 2003, Proc. NIPS

Nowak A. S., 1984, Probab. Math. Statist., 4, 13

10.1002/9780470316887

Satia J. K. Markovian decision process with uncertain transition matrices or/and probabilistic observation of states (1968) Ph.D. thesis, Stanford University, Stanford, CA

10.1287/opre.21.3.728

10.1287/opre.42.4.739

Scholar Hub - Công cụ hỗ trợ trích dẫn và phân tích khoa học Việt Nam

Về chúng tôi

Scholar Hub là công cụ hỗ trợ trích dẫn và phân tích các bài báo, công bố khoa học Việt Nam. Công cụ trợ giúp người nghiên cứu, tạp chí, đơn vị nghiên cứu tra cứu, phân tích và thống kê dữ liệu nghiên cứu khoa học tại Việt Nam và quốc tế.
ScholarHub KHÔNG đăng thông tin tổng hợp, KHÔNG đăng lại nội dung từ các trang báo chí Việt Nam hoặc trang thông tin điện tử khác tại Việt Nam.

Thông tin, cập nhật

Đăng ký Tạp chí tham gia vào Scholar Hub

Phản hồi ý kiến về Scholar Hub

Bài viết, nội dung cập nhật

Chủ đề khoa học

Website liên kết

Phần mềm kiểm tra trùng lặp Kiểm Tra Tài Liệu

Phần mềm xuất bản tạp chí điện tử VOJS

Công cụ kiểm tra chính tả và thể thức Viver

Nền tảng trắc nghiệm và đề thi đa lĩnh vực LetQA