A Brief Introduction to Manifold Optimization

Journal of the Operations Research Society of China, Vol. 8, No. 2, pp. 199-248, 2020
Jiang Hu1, Xin Liu2, Zaiwen Wen1, Ya-xiang Yuan2
1Beijing International Center for Mathematical Research, Peking University, Beijing 100871, China
2State Key Laboratory of Scientific and Engineering Computing, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China

Abstract

Manifold optimization is ubiquitous in computational and applied mathematics, statistics, engineering, machine learning, physics, chemistry, etc. One of the main challenges is usually the non-convexity of the manifold constraints. By utilizing the geometry of the manifold, a large class of constrained optimization problems can be viewed as unconstrained optimization problems on the manifold. From this perspective, intrinsic structures, optimality conditions, and numerical algorithms for manifold optimization are investigated. Some recent progress on theoretical results for manifold optimization is also presented.
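
To make the manifold viewpoint concrete, the following minimal sketch applies Riemannian gradient descent to a standard illustrative problem: minimizing the quadratic form x'Ax over the unit sphere, whose minimizer is an eigenvector for the smallest eigenvalue of A. The problem choice, step size, stopping rule, and projection retraction below are illustrative assumptions for this note, not the specific algorithms developed in the paper.

import numpy as np

def sphere_gradient_descent(A, x0, tol=1e-8, max_iter=5000):
    # Minimize f(x) = x'Ax over the unit sphere by Riemannian gradient descent.
    x = x0 / np.linalg.norm(x0)                 # start on the manifold
    step = 0.5 / np.linalg.norm(A, 2)           # conservative step for a quadratic objective
    for _ in range(max_iter):
        egrad = 2.0 * A @ x                     # Euclidean gradient of x'Ax
        rgrad = egrad - (x @ egrad) * x         # project onto the tangent space at x
        if np.linalg.norm(rgrad) < tol:         # Riemannian first-order optimality
            break
        x = x - step * rgrad                    # unconstrained step along the tangent direction
        x = x / np.linalg.norm(x)               # retraction: map back onto the sphere
    return x

rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))
A = (B + B.T) / 2                               # symmetric test matrix
x = sphere_gradient_descent(A, rng.standard_normal(5))
print(x @ A @ x, np.linalg.eigvalsh(A)[0])      # the two values should nearly agree

Only two manifold-specific ingredients appear here: projecting the Euclidean gradient onto the tangent space and retracting (by normalization) back onto the sphere. This is the sense in which the constrained problem is treated as an unconstrained problem on the manifold.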
