Optimal learning with a local parametric belief model
Abstract
We are interested in maximizing smooth functions where observations are noisy and expensive to compute, as might arise in computer simulations or laboratory experiments. We derive a knowledge gradient policy, which chooses the measurement that maximizes the expected value of information, while using a locally parametric belief model built from linear approximations with radial basis functions. The method uses a compact representation of the function that avoids storing the entire measurement history, as is typically required by nonparametric methods. Our technique evaluates a measurement in terms of its ability to improve our estimate of the optimum, capturing correlations in our beliefs about neighboring regions of the function, without imposing any a priori assumptions on the global shape of the underlying function. Experimental work suggests that the method adapts to a range of arbitrary, continuous functions and reliably finds the optimal solution. Moreover, the policy is shown to be asymptotically optimal.
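The knowledge gradient idea described above can be sketched in code. The snippet below is a minimal Monte Carlo illustration of the correlated knowledge-gradient factor for a discrete set of alternatives under a multivariate normal belief, not the paper's locally parametric RBF estimator; the function names (`kg_factor_mc`, `kg_policy`) and the RBF covariance in the usage example are illustrative assumptions.

```python
import numpy as np

def kg_factor_mc(mu, Sigma, noise_var, x, n_samples=10000, rng=None):
    """Monte Carlo estimate of the knowledge-gradient factor for
    measuring alternative x under a correlated normal belief
    (mu, Sigma) with Gaussian observation noise of variance noise_var."""
    rng = np.random.default_rng(rng)
    # Measuring x shifts the entire posterior mean vector along
    # sigma_tilde = Sigma[:, x] / sqrt(Sigma[x, x] + noise_var);
    # correlations in Sigma propagate the update to neighboring points.
    sigma_tilde = Sigma[:, x] / np.sqrt(Sigma[x, x] + noise_var)
    z = rng.standard_normal(n_samples)
    # Simulated posterior mean vectors after one measurement of x.
    post_means = mu[None, :] + z[:, None] * sigma_tilde[None, :]
    # KG factor: expected improvement in the maximum of the posterior mean.
    return post_means.max(axis=1).mean() - mu.max()

def kg_policy(mu, Sigma, noise_var, **kw):
    """Choose the alternative with the largest knowledge-gradient factor."""
    scores = [kg_factor_mc(mu, Sigma, noise_var, x, **kw)
              for x in range(len(mu))]
    return int(np.argmax(scores))

# Usage sketch: five points on [0, 1] with a Gaussian RBF prior covariance.
pts = np.linspace(0.0, 1.0, 5)
Sigma = np.exp(-10.0 * (pts[:, None] - pts[None, :]) ** 2)
mu = np.zeros(5)
x_next = kg_policy(mu, Sigma, noise_var=0.5, n_samples=5000, rng=0)
```

The Monte Carlo form is chosen only for brevity; the exact computation of this expectation for correlated normal beliefs is given in Frazier et al. (2009), cited below.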
References
Agrawal, R.: The continuum-armed bandit problem. SIAM J. Control Optim. 33(6), 1926–1951 (1995)
Barton, R.R., Meckesheimer, M.: Chapter 18: Metamodel-based simulation optimization. In: Henderson, S.G., Nelson, B.L. (eds.) Simulation, Handbooks in Operations Research and Management Science, vol. 13, pp. 535–574. Elsevier, Amsterdam (2006)
Barut, E., Powell, W.B.: Optimal learning for sequential sampling with non-parametric beliefs. J. Glob. Optim. 58(3), 517–543 (2014). doi:10.1007/s10898-013-0050-5
Branin, F.H.: Widely convergent method for finding multiple solutions of simultaneous nonlinear equations. IBM J. Res. Dev. 16(5), 504–522 (1972)
Calvin, J., Žilinskas, A.: One-dimensional global optimization for observations with noise. Comput. Math. Appl. 50(1–2), 157–169 (2005). doi:10.1016/j.camwa.2004.12.014
Chang, F., Lai, T.L.: Optimal stopping and dynamic allocation. Adv. Appl. Probab. 19(4), 829–853 (1987)
Chick, S.E., Gans, N.: Economic analysis of simulation selection problems. Manag. Sci. 55(3), 421–437 (2009)
Cochran, W., Cox, G.: Experimental Designs. Wiley, New York (1957)
Cohn, D.A., Ghahramani, Z., Jordan, M.I.: Active learning with statistical models. J. Artif. Intell. Res. 4, 129–145 (1996)
Eubank, R.L.: Spline Smoothing and Nonparametric Regression, 2nd edn. Marcel Dekker, New York (1988)
Fan, J., Gijbels, I.: Local Polynomial Modelling and Its Applications. Monographs on Statistics and Applied Probability Series. Chapman and Hall/CRC, London (1996)
Frazier, P.I., Powell, W.B., Dayanik, S.: A knowledge-gradient policy for sequential information collection. SIAM J. Control Optim. 47(5), 2410–2439 (2008)
Frazier, P.I., Powell, W.B., Dayanik, S.: The knowledge-gradient policy for correlated normal beliefs. INFORMS J. Comput. 21(4), 599–613 (2009)
Fu, M.C.: Chapter 19: Gradient estimation. In: Henderson, S.G., Nelson, B.L. (eds.) Simulation, Handbooks in Operations Research and Management Science, vol. 13, pp. 575–616. Elsevier, Amsterdam (2006)
Gelman, A., Carlin, J.B., Stern, H.S., Rubin, D.B.: Bayesian Data Analysis, 2nd edn. Chapman and Hall/CRC, London (2004)
Gibbs, M.N.: Bayesian Gaussian processes for regression and classification. Ph.D. thesis, University of Cambridge (1997)
Ginebra, J., Clayton, M.K.: Response surface bandits. J. R. Stat. Soc. Ser. B (Methodol.) 57(4), 771–784 (1995)
Gittins, J.C.: Bandit processes and dynamic allocation indices. J. R. Stat. Soc. Ser. B (Methodol.) 41(2), 148–177 (1979)
Gupta, S.S., Miescke, K.J.: Bayesian look ahead one-stage sampling allocations for selection of the best population. J. Stat. Plan. Inference 54(2), 229–244 (1996)
Hannah, L.A., Blei, D.M., Powell, W.B.: Dirichlet process mixtures of generalized linear models. J. Mach. Learn. Res. 12, 1923–1953 (2011)
Haykin, S.: Neural Networks: A Comprehensive Foundation, 2nd edn. Prentice Hall PTR, Upper Saddle River (1998)
Huang, D., Allen, T., Notz, W., Zeng, N.: Global optimization of stochastic black-box systems via sequential kriging meta-models. J. Glob. Optim. 34(3), 441–466 (2006)
Jamshidi, A.A., Kirby, M.J.: Towards a black box algorithm for nonlinear function approximation over high-dimensional domains. SIAM J. Sci. Comput. 29(3), 941–964 (2007)
Jamshidi, A.A., Powell, W.B.: A recursive local polynomial approximation method using Dirichlet clouds and radial basis functions. Working Paper. Princeton University, New Jersey (2013)
Jones, D.R., Schonlau, M., Welch, W.J.: Efficient global optimization of expensive black-box functions. J. Glob. Optim. 13(4), 455–492 (1998). doi:10.1023/A:1008306431147
Jones, R., Lee, Y.C., Barnes, C.W., Flake, G., Lee, K., Lewis, P., Qian, S.: Function approximation and time series prediction with neural networks. In: IJCNN International Joint Conference on Neural Networks, vol. 1, pp. 649–665 (1990)
Knuth, D.E.: The Art of Computer Programming, vol. 2: Seminumerical Algorithms, 3rd edn. Addison-Wesley, Boston (1998)
Kushner, H.J.: A new method of locating the maximum point of an arbitrary multipeak curve in the presence of noise. J. Fluids Eng. 86(1), 97–106 (1964). doi:10.1115/1.3653121
Mes, M.R., Powell, W.B., Frazier, P.I.: Hierarchical knowledge gradient for sequential sampling. J. Mach. Learn. Res. 12, 2931–2974 (2011)
Montgomery, D.C., Peck, E.A., Vining, G.G.: Introduction to Linear Regression Analysis, 4th edn. Wiley, Hoboken (2006)
Muller, H.G.: Weighted local regression and kernel methods for nonparametric curve fitting. J. Am. Stat. Assoc. 82(397), 231–238 (1987)
Negoescu, D.M., Frazier, P.I., Powell, W.B.: The knowledge-gradient algorithm for sequencing experiments in drug discovery. INFORMS J. Comput. 23(3), 346–363 (2010)
Nelson, B.L., Swann, J., Goldsman, D., Song, W.: Simple procedures for selecting the best simulated system when the number of alternatives is large. Oper. Res. 49(6), 950–963 (2001)
Powell, W.B., Ryzhov, I.O.: Optimal Learning. Wiley, London (2012)
Robbins, H., Monro, S.: A stochastic approximation method. Ann. Math. Stat. 22(3), 400–407 (1951)
Ryzhov, I.O., Powell, W.B., Frazier, P.: The knowledge gradient algorithm for a general class of online learning problems. Oper. Res. 60(1), 180–195 (2012)
Scott, W., Frazier, P., Powell, W.B.: The correlated knowledge gradient for simulation optimization of continuous parameters using Gaussian process regression. SIAM J. Optim. 21(3), 996–1026 (2011)
Spall, J.C.: Introduction to Stochastic Search and Optimization, 1st edn. Wiley, New York (2003)
Villemonteix, J., Vazquez, E., Walter, E.: An informational approach to the global optimization of expensive-to-evaluate functions. J. Glob. Optim. 44(4), 509–534 (2009)
Žilinskas, A.: Two algorithms for one-dimensional multimodal minimization. Math. Oper. Stat. Ser. Optim. 12(1), 53–63 (1981). doi:10.1080/02331938108842707