Optimally solving Markov decision processes with total expected discounted reward function: Linear programming revisited
References
Agrawal, P., Singh, J. P., Alpcan, T., & Sharma, V. (2007). In 2007 IEEE international symposium world of wireless, mobile and multimedia networks.
Akselrod, D., & Kirubarajan, T. (2008). Modified value iteration algorithm and dynamic element matching based mdp for distributed data fusion and sensor management. In 2008 International conference on information fusion.
Alagoz, 2004, The optimal timing of living-donor liver transplantation, Management Science, 50, 1420, 10.1287/mnsc.1040.0287
Alagoz, 2007, Choosing among cadaveric and living-donor livers, Management Science, 53, 1702, 10.1287/mnsc.1070.0726
Alagoz, 2007, Determining the acceptance of cadaveric livers using an implicit model of the waiting list, Operations Research, 55, 24, 10.1287/opre.1060.0329
Al-Zubaidy, H., Talim, J., & Lambadaris, I. (2007). Dynamic scheduling in high speed downlink packet access networks: Heuristic approach. In 2007 Military communications conference.
Al-Zubaidy, 2010, Optimal scheduling in high-speed downlink packet access networks, ACM Transactions on Modeling and Computer Simulation, 21, 3:1, 10.1145/1870085.1870088
Arruda, 2011, Approximate dynamic programming via direct search space of value function approximations, European Journal of Operational Research, 211, 343, 10.1016/j.ejor.2010.11.019
Asadian, A., Kermani, M. R., & Patel, R. V. (2010). Accelerated needle steering using partitioned value iteration. In 2010 American control conference.
Bello, D., & Riano, G. (2006). Linear programming solvers for markov decision processes. In 2006 IEEE systems and information engineering design symposium.
Bixby, 2002, Solving real-world linear programs: A decade and more of progress, Operations Research, 50, 3, 10.1287/opre.50.1.3.17780
Buongiorno, 2011, Further generalization of faustmann’s formula for stochastic interest rates, Journal of Forest Economics, 17, 248, 10.1016/j.jfe.2011.03.002
Chamberland, J. F., Ko, Y. M., & Gautam, N. (2007). Optimal policies for control of peers in online multimedia services. In 2007 IEEE conference on decision and control.
Chang, H. S., & Chong, E. K. P. (2005). On solving controlled markov set-chains via multi-policy improvement. In 2005 IEEE conference on decision and control, European control conference.
Chen, M., & Cheng, C. (2007). Sensitivity analysis for the optimal minimal repair/replacement policies under the framework of Markov decision process. In 2007 IEEM international conference on industrial engineering and engineering management.
Chen, 2011, Indirect reciprocity game modelling for cooperation stimulation in cognitive networks, IEEE Transactions on Communications, 59, 159, 10.1109/TCOMM.2010.110310.100143
Demmel, 1999, A supernodal approach to sparse partial pivoting, SIAM Journal on Matrix Analysis and Applications, 20, 720, 10.1137/S0895479895291765
D’Epenoux, 1963, A probabilistic production and inventory problem, Management Science, 10, 98, 10.1287/mnsc.10.1.98
Erenay, 2014, Optimizing colonoscopy screening for colorectal cancer prevention and surveillance, Manufacturing and Service Operations Management, 16, 381, 10.1287/msom.2014.0484
Farran, 2009, Comparative analysis of life-cycle costing for rehabilitating infrastructure systems, Journal of Performance of Constructed Facilities, 23, 320, 10.1061/(ASCE)CF.1943-5509.0000038
Farrokh, 2009, Optimal adaptive modulation and coding with switching costs, IEEE Transactions on Communications, 57, 697, 10.1109/TCOMM.2009.03.070115
Flapper, 2012, Control of a production-inventory system with returns under imperfect advance return information, European Journal of Operational Research, 218, 392, 10.1016/j.ejor.2011.10.051
Glazebrook, 2005, Index policies for the maintenance of a collection of machines by a set of repairmen, European Journal of Operational Research, 165, 267, 10.1016/j.ejor.2004.01.036
Grizzle, 2008, Shortest path stochastic control for hybrid electric vehicles, International Journal of Robust and Nonlinear Control, 18, 1409, 10.1002/rnc.1288
Idoumghar, L., & Schott, R. (2006). A new hybrid ga-mdp algorithm for the frequency assignment problem. In 2006 IEEE international conference on tools with artificial intelligence.
Kallenberg, 1983
Kuppuswamy, 2005, On subscription admission control for network service provision, IEEE Communications Letters, 9, 66, 10.1109/LCOMM.2005.1375244
Kurt, 2011, The structure of optimal statin initiation policies for patients with Type 2 diabetes, IIE Transactions on Healthcare Systems Engineering, 1, 49, 10.1080/19488300.2010.550180
Kurt, 2010, Optimally maintaining a markovian deteriorating system with limited imperfect repairs, European Journal of Operational Research, 205, 368, 10.1016/j.ejor.2010.01.009
Le Ny, J., & Feron, E. (2006). Restless bandits with switching costs: Linear programming relaxations, performance bounds and limited lookahead policies. In 2006 American control conference.
Littman, M. L., Dean, T. L., & Kaelbling, L. P. (1995). On the complexity of solving Markov decision problems. In Proceedings of the eleventh conference on uncertainty in artificial intelligence (pp. 394–402). Citeseer.
Min, 2010, An elective surgery scheduling problem considering patient priority, Computers and Operations Research, 37, 1091, 10.1016/j.cor.2009.09.016
Morton, 1971, On the asymptotic convergence rate of cost differences for Markovian decision processes, Operations Research, 19, 244, 10.1287/opre.19.1.244
Mosharaf, 2005, Optimal resource allocation and fairness control in all-optical wdm networks, IEEE Journal on Selected Areas in Communications, 23, 1496, 10.1109/JSAC.2005.851791
Powell, 2007
Puterman, 1994
Puterman, 1978, Modified policy iteration algorithms for discounted Markov decision problems, Management Science, 24, 1127, 10.1287/mnsc.24.11.1127
Rezaei Yousefi, 2012, Optimal intervention strategies for therapeutic methods with fixed-length duration of drug effectiveness, IEEE Transactions on Signal Processing, PP
Sandıkçı, 2008, Estimating the patient’s price of privacy in liver transplantation, Operations Research, 56, 1393, 10.1287/opre.1080.0648
Schaefer, 2004, Modeling medical treatment using Markov decision processes, 597
Sharna, S. A., Amin, M. R., & Murshed, M. (2011). Call admission control policy for multiclass traffic in heterogeneous wireless networks. In 2011 International symposium on communications and information technologies.
Shechter, 2008, The optimal time to initiate HIV therapy under ordered health states, Operations Research, 56, 20, 10.1287/opre.1070.0480
Stevens-Navarro, 2008, An mdp-based vertical handoff decision algorithm for heterogeneous wireless networks, IEEE Transactions on Vehicular Technology, 57, 1243, 10.1109/TVT.2007.907072
Sun, 2011, A constrained mdp-based vertical handoff decision algorithm for 4g heterogenous wireless networks, Wireless Networks, 17, 1063, 10.1007/s11276-011-0335-x
Viet, 2012, Using markov decision processes to define an adaptive strategy to control the spread of an animal disease, Computers and Electronics in Agriculture, 80, 71, 10.1016/j.compag.2011.10.015
Wang, L., & Schonfeld, D. (2010). Game theoretic model for control of gene regulatory networks. In 2010 International conference on acoustics speech and signal processing.
White, 1993, Markov decision processes: Discounted expected reward or average expected reward?, Journal of Mathematical Analysis and Applications, 172, 375, 10.1006/jmaa.1993.1031
Ye, Y. (2015). The simplex method is strongly polynomial for the Markov decision problem with a fixed discount rate. Working paper, <http://www.stanford.edu/yyye/simplexmdp.pdf> Accessed 10.02.15.
Zobel, 2005, An empirical study of policy convergence in markov decision process value iteration, Computers and Operations Research, 32, 127, 10.1016/S0305-0548(03)00207-7