Optimal Cost and Policy for a Markovian Replacement Problem

Journal of Optimization Theory and Applications, Vol. 71, No. 1, pp. 105-126, 1991
Sernik, E. L.1, Marcus, S. I.1
1Department of Electrical and Computer Engineering, University of Texas, Austin

Abstract

We consider the computation of the optimal cost and policy for a two-dimensional Markov replacement problem with partial observations, for two special cases of observation quality. Using available structural results on the optimal policy for these two particular models, we show that in both cases the optimal infinite-horizon discounted cost function is piecewise linear, and we provide formulas for computing the optimal cost and for constructing the optimal policy. Several examples illustrate the usefulness of these results.
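The piecewise-linearity claim in the abstract reflects a well-known property of partially observed Markov decision processes: the finite-horizon optimal cost is the lower envelope of finitely many linear functions of the belief (Sondik's alpha-vector representation). As an illustration only, and not the authors' model, here is a minimal sketch of exact value iteration for a hypothetical two-state machine-replacement POMDP with a noisy sensor; every numerical parameter below is invented.

```python
import itertools

# All numbers below are invented for illustration; they are not from the paper.
beta = 0.9            # discount factor
theta = 0.2           # probability the machine fails while operating
C_OP = [0.0, 2.0]     # operating cost in the good / failed state
C_REP = 1.5           # replacement cost (same in either state)

# T[a][s][s']: transition probabilities; O[a][s'][y]: noisy sensor of next state
T = [[[1 - theta, theta], [0.0, 1.0]],   # action 0: operate
     [[1.0, 0.0], [1.0, 0.0]]]           # action 1: replace -> good state
O = [[[0.8, 0.2], [0.3, 0.7]],           # action 0: imperfect observation
     [[0.8, 0.2], [0.3, 0.7]]]           # action 1: same sensor, assumed
cost = [C_OP, [C_REP, C_REP]]            # immediate cost c(s, a)

def backup(alphas):
    """One exact dynamic-programming backup on the alpha-vector set."""
    new = []
    for a in (0, 1):
        # one candidate vector per assignment of an old vector to each observation
        for choice in itertools.product(alphas, repeat=2):
            new.append(tuple(
                cost[a][s] + beta * sum(
                    T[a][s][sp] * O[a][sp][y] * choice[y][sp]
                    for sp in (0, 1) for y in (0, 1))
                for s in (0, 1)))
    return new

def value(b, alphas):
    """Optimal cost at belief b = P(failed): lower envelope of the vectors."""
    return min((1 - b) * v[0] + b * v[1] for v in alphas)

grid = [i / 20 for i in range(21)]
alphas = [(0.0, 0.0)]
for _ in range(30):
    alphas = backup(alphas)
    # crude pruning: keep only vectors that attain the envelope on a belief grid
    alphas = list({min(alphas, key=lambda v: (1 - b) * v[0] + b * v[1])
                   for b in grid})

# The resulting cost function is piecewise linear and concave in the belief b
print([round(value(b, alphas), 3) for b in grid])
```

Because the value function is a minimum of linear functions of the belief, it is concave and, in this replacement setting, nondecreasing in the failure probability; the paper's contribution for its two special observation-quality cases is to give closed-form expressions for this cost and the associated policy rather than computing them iteratively as above.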

Keywords

#Markovian replacement #optimal cost #optimal policy #partial observations #discounted cost function
