Switched diffusion processes for non-convex optimization and saddle points search

Statistics and Computing - Tập 33 - Trang 1-16 - 2023
Lucas Journel1, Pierre Monmarché1
1Laboratoire Jacques-Louis Lions, Sorbonne Universite, Paris, France

Tóm tắt

We introduce and investigate stochastic processes designed to find local minimizers and saddle points of non-convex functions, exploring the landscape more efficiently than the standard noisy gradient descent. The processes switch between two behaviours, a noisy gradient descent and a noisy saddle point search. It is proven to be well-defined and to converge to a stationary distribution in the long time. Numerical experiments are provided on low-dimensional toy models and for Lennard–Jones clusters.

Tài liệu tham khảo

Alfonsi, A., Coyaud, R., Ehrlacher, V.: Constrained overdamped Langevin dynamics for symmetric multimarginal optimal transportation. Math. Models Methods Appl. Sci. 32(03), 403–455 (2022). https://doi.org/10.1142/S0218202522500105 Arnold, L., Kliemann, W.: On unique ergodicity for degenerate diffusions. Stochastics 21(1), 41–61 (1987). https://doi.org/10.1080/17442508708833450 Bardet, J.B., Guérin, H., Malrieu, F.: Long time behavior of diffusions with Markov switching. ALEA Lat. Am. J. Probab. Math. Stat. 7, 151–170 (2009) Bofill, J.M., Quapp, W., Bernuz, E.: Some remarks on the model of the extended gentlest ascent dynamics. J. Math. Chem. 53(1), 41–57 (2015). https://doi.org/10.1007/s10910-014-0409-y Bogachev, V.I., Krylov, N.V., Röckner, M., et al.: Fokker–Planck–Kolmogorov equations, Mathematical Surveys and Monographs, vol. 207. American Mathematical Society, Providence, RI (2015). https://doi.org/10.1090/surv/207 Bonfanti, S., Kob, W.: Methods to locate saddle points in complex landscapes. J. Chem. Phys. 147(20), 204104 (2017). https://doi.org/10.1063/1.5012271 de Saporta, B., Yao, J.F.: Tail of a linear diffusion with Markov switching. Ann. Appl. Probab. 15(1B), 992–1018 (2005). https://doi.org/10.1214/105051604000000828 Dittner, M., Hartke, B.: Conquering the hard cases of Lennard–Jones clusters with simple recipes. Comput. Theor. Chem. 1107, 7–13 (2017). https://doi.org/10.1016/j.comptc.2016.09.032 . https://www.sciencedirect.com/science/article/pii/S2210271X1630384X. Structure prediction of nanoclusters from global optimization techniques: computational strategies Du, N.H., Hening, A., Nguyen, D.H., et al.: Dynamical systems under random perturbations with fast switching and slow diffusion: hyperbolic equilibria and stable limit cycles. J. Differ. Equ. 293, 313–358 (2021). https://doi.org/10.1016/j.jde.2021.05.032 Duncan, J., Wu, Q., Promislow, K., et al.: Biased gradient squared descent saddle point finding method. J. Chem. Phys. 140(19), 194102 (2014) E, W., Zhou, X.: The gentlest ascent dynamics. Nonlinearity 24(6), 1831–1842 (2011). https://doi.org/10.1088/0951-7715/24/6/008 E, W., Ren, W., Vanden-Eijnden, E.: String method for the study of rare events. Phys. Rev. B (2002). https://doi.org/10.1103/physrevb.66.052301 Freidlin, M., Wentzell, A.: Random Perturbations of Dynamical Systems. Springer (2012) Gao, W., Leng, J., Zhou, X.: An iterative minimization formulation for saddle point search. SIAM J. Numer. Anal. 53(4), 1786–1805 (2015). https://doi.org/10.1137/130930339 Gu, S., Zhou, X.: Simplified gentlest ascent dynamics for saddle points in non-gradient systems. Chaos: Interdiscip. J. Nonlinear Sci. 28(12), 123106 (2018). https://doi.org/10.1063/1.5046819 Guillin, A.: Averaging principle of SDE with small diffusion: moderate deviations. Ann. Probab. 31(1), 413–443 (2003). https://doi.org/10.1214/aop/1046294316 Hairer, M., Mattingly, J.C.: Yet Another Look at Harris’ Ergodic Theorem for Markov Chains. Birkhäuser, Basel (2011). https://doi.org/10.1007/978-3-0348-0021-1_7 Henkelman, G., Jóhannesson, G., Jónsson, H.: Methods for finding saddle points and minimum energy paths. In: Theoretical Methods in Condensed Phase Chemistry, pp. 269–302. Springer (2002) Holley, R.A., Kusuoka, S., Stroock, D.W.: Asymptotics of the spectral gap with applications to the theory of simulated annealing. J. Funct. Anal. 83(2), 333–347 (1989). https://doi.org/10.1016/0022-1236(89)90023-2 Huang, G., Mandjes, M., Spreij, P.: Large deviations for Markov-modulated diffusion processes with rapid switching. Stoch. Process. Appl. 126(6), 1785–1818 (2016). https://doi.org/10.1016/j.spa.2015.12.005 Journel, L., Monmarché, P.: Convergence of the kinetic annealing for general potentials. Electron. J. Probab. 27, 1–37 (2022). https://doi.org/10.1214/22-EJP891 Lelievre, T., Stoltz, G.: Partial differential equations and stochastic methods in molecular dynamics. Acta Numer. 25, 681–880 (2016) Lelièvre, T., Rousset, M., Stoltz, G.: Free Energy Computations: A Mathematical Perspective. Imperial College Press (2010) Levitt, A., Ortner, C.: Convergence and cycling in walker-type saddle search algorithms. SIAM J. Numer. Anal. 55(5), 2204–2227 (2017). https://doi.org/10.1137/16M1087199 Lindskog, F., Majumder, A.P.: Exact long time behavior of some regime switching stochastic processes. Bernoulli 26(4), 2572–2604 (2020). https://doi.org/10.3150/20-BEJ1196 Luo, Y., Zheng, X., Cheng, X., et al.: Convergence analysis of discrete high-index saddle dynamics. SIAM J. Numer. Anal. 60(5), 2731–2750 (2022). https://doi.org/10.1137/22M1487965 Northby, J.A.: Structure and binding of Lennard-Jones clusters: \(13\leqslant n \leqslant 147\). J. Chem. Phys. (1987). https://doi.org/10.1063/1.453492 Ratanov, N.: Option pricing model based on a Markov-modulated diffusion with jumps. Braz. J. Probab. Stat. 24(2), 413–431 (2010). https://doi.org/10.1214/09-BJPS037 Samanta, A., Weinan, E.: Atomistic simulations of rare events using gentlest ascent dynamics. J. Chem. Phys. (2012). https://doi.org/10.1063/1.3692803 Shirikyan, A.: Controllability implies mixing i. Convergence in the total variation metric. Russ. Math. Surv. (2017). https://doi.org/10.1070/RM9755 Stroock, D.W., Varadhan, S.R.S.: Multidimensional diffusion processes. In: Classics in Mathematics. Springer-Verlag, Berlin, reprint of the 1997 edition (2006)