Solving H-horizon, stationary Markov decision problems in time proportional to log(H)
Tài liệu tham khảo
Bertsekas, 1987
Bertsekas, 1989
Cook, 1981, Towards a complexity theory of synchronous parallel computation, Enseign. Math., 2, 99
Federgruen, 1978, Discounted and undiscounted value-iteration in Markov decision problems: A survey, 23
Fox, 1989
Garey, 1979
Householder, 1964
Howard, 1960
Karmarkar, 1984, A new polynomial-time algorithm for linear programming, Combinatorica, 4, 373, 10.1007/BF02579150
Lewis, 1981
1989
Orlin, 1981, The complexity of dynamic languages and dynamic optimization problems, 218
Papadimitriou, 1985, Games against nature, J. Comput. System Sci., 31, 288, 10.1016/0022-0000(85)90045-5
Papadimitriou, 1982
Papadimitriou, 1987, The complexity of Markov decision processes, Math. Oper. Res., 12, 441, 10.1287/moor.12.3.441
Parberry, 1987
Veinott, 1969, Discrete dynamic programming with sensitive discount optimality criterion, Ann. Math. Stat., 40, 1635, 10.1214/aoms/1177697379
White, 1978, Elimination of nonoptimal actions in Markov decision processes, 131