Optimal Cost and Policy for a Markovian Replacement Problem
Abstract
We consider the computation of the optimal cost and policy associated with a two-dimensional, partially observed Markov replacement problem, for two special cases of observation quality. Based on available structural results for the optimal policy in these two particular models, we show that, in both cases, the infinite-horizon discounted optimal cost function is piecewise linear, and we provide formulas for computing the optimal cost and the optimal policy. Several examples illustrate the usefulness of these results.
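To make the piecewise-linear structure mentioned in the abstract concrete, the following is a minimal sketch, not the paper's model or formulas: a standard Sondik-style exact value-iteration backup with alpha-vectors for a hypothetical two-state keep/replace problem with noisy observations. All model data (the cost table, transition matrices P, observation matrices Q, and the discount factor beta) are assumptions chosen for illustration; the output lists the linear pieces whose lower envelope approximates the infinite-horizon discounted cost as a function of the belief x = P(bad).

```python
# A minimal sketch (hypothetical model, NOT the paper's): exact value iteration
# with alpha-vectors for a two-state replacement POMDP, illustrating that the
# discounted optimal cost is a lower envelope of finitely many linear functions
# of the belief. All numerical parameters below are assumptions.
import itertools
import numpy as np

beta = 0.9                        # discount factor (assumed)

# Per-stage cost cost[a][s]: action 0 = keep, action 1 = replace (assumed).
cost = np.array([[0.0, 2.0],      # keep: pay 2 per stage while the unit is bad
                 [1.0, 1.0]])     # replace: fixed replacement cost 1

# Transition matrices P[a][s, s']: deterioration under "keep", reset under "replace".
P = np.array([[[0.8, 0.2],
               [0.0, 1.0]],
              [[1.0, 0.0],
               [1.0, 0.0]]])

# Observation matrices Q[a][s', o]: a noisy signal of the post-transition state.
Q = np.array([[[0.9, 0.1],
               [0.3, 0.7]],
              [[0.9, 0.1],
               [0.3, 0.7]]])

def prune(vectors, grid=np.linspace(0.0, 1.0, 201)):
    """Keep only vectors that attain the lower envelope somewhere on the belief simplex."""
    beliefs = np.stack([1.0 - grid, grid], axis=1)   # rows: [P(good), P(bad)]
    values = beliefs @ np.stack(vectors).T           # grid points x vectors
    keep = sorted(set(values.argmin(axis=1)))
    return [vectors[i] for i in keep]

def backup(alphas):
    """One exact dynamic-programming backup on the set of alpha-vectors."""
    new = []
    for a in range(2):
        # For each observation o, the candidate vectors P[a] @ (Q[a][:, o] * alpha).
        per_obs = [[P[a] @ (Q[a][:, o] * alpha) for alpha in alphas] for o in range(2)]
        # Enumerate one alpha-vector choice per observation (Sondik-style enumeration).
        for choice in itertools.product(*per_obs):
            new.append(cost[a] + beta * sum(choice))
    return prune(new)

alphas = [np.zeros(2)]
for _ in range(200):               # enough iterations for the contraction to settle
    alphas = backup(alphas)

# The discounted cost is (approximately) the minimum of these linear pieces.
for alpha in alphas:
    print(f"V(x) piece: {alpha[0]:.3f}*(1-x) + {alpha[1]:.3f}*x")
```

Under parameters like these, the envelope typically consists of only a few pieces, which is what makes explicit formulas for the cost and for the replacement threshold tractable in simple two-state cases.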
Keywords
#Markov replacement #optimal cost #optimal policy #partial observations #discounted cost function
