# Conic Descent and its Application to Memory-efficient Optimization over Positive Semidefinite Matrices

NIPS 2020, (2020): 8308-8317

Abstract

We present an extension of the conditional gradient method to problems whose feasible sets are convex cones. We provide a convergence analysis for the method and for variants with nonconvex objectives, and we extend the analysis to practical cases with effective line search strategies. For the specific case of the positive semidefinite co...

Introduction

- The authors want to solve problems of the form

  $$\text{minimize } f(x) \quad \text{subject to } x \in K, \tag{1}$$

  where $K \subset \mathbb{R}^n$ is a proper, convex cone and $f$ is a convex, differentiable function which has no nonzero direction of recession in $K$, i.e., it eventually curves upward along any nonzero ray in $K$.
- The authors' work shows that it is possible to directly solve such a problem using a modification of the conditional gradient algorithm that the authors call conic descent (CD).
- Because CD generates solutions to problems over the positive semidefinite cone as a sum of rank-1 matrices, the authors propose a memory-efficient version of CD in the spirit of [24] that allows them to track the iterate using randomized sketches.
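
To illustrate how an iterate built from rank-1 updates can be tracked without storing the full matrix, here is a minimal Python sketch. It assumes a fixed Gaussian test matrix and a Nyström-style reconstruction; the names `Omega`, `Y`, `nystrom_reconstruct` and the weights `a`, `b` are illustrative choices, not taken from the paper, whose sketching scheme may differ in detail.

```python
import numpy as np

def nystrom_reconstruct(Y, Omega):
    """Return X_hat = Y (Omega^T Y)^+ Y^T from the sketch Y = X @ Omega."""
    core = Omega.T @ Y
    core = (core + core.T) / 2          # symmetrize against round-off
    return Y @ np.linalg.pinv(core, rcond=1e-10) @ Y.T

rng = np.random.default_rng(0)
n, sketch_size = 50, 8
Omega = rng.standard_normal((n, sketch_size))   # fixed random test matrix (assumption)

Y = np.zeros((n, sketch_size))   # sketch of the iterate X = 0
X_dense = np.zeros((n, n))       # stored only to check the sketch in this demo

for _ in range(5):
    v = rng.standard_normal(n)
    a, b = 0.9, 0.5              # illustrative conic weights, not from the paper
    # The update X <- a*X + b*v v^T is linear in X, so the sketch obeys the same rule.
    Y = a * Y + b * np.outer(v, v @ Omega)
    X_dense = a * X_dense + b * np.outer(v, v)

X_hat = nystrom_reconstruct(Y, Omega)
rel_err = np.linalg.norm(X_hat - X_dense) / np.linalg.norm(X_dense)
print(rel_err)
```

The sketch `Y` needs only `n * sketch_size` numbers. In this demo the final matrix has rank 5, below the sketch size of 8, so the reconstruction is exact up to round-off; for higher-rank iterates it would only be an approximation.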

Highlights

- We want to solve problems of the form

  $$\text{minimize } f(x) \quad \text{subject to } x \in K, \tag{1}$$

  where $K \subset \mathbb{R}^n$ is a proper, convex cone and $f$ is a convex, differentiable function which has no nonzero direction of recession in $K$, i.e., it eventually curves upward along any nonzero ray in $K$
- Our work shows that it is possible to directly solve such a problem using a modification of the conditional gradient algorithm that we call conic descent (CD)
- CD picks a descent direction which is a conic combination of a step toward the origin and a direction in K
- We provide a memory-efficient modification to CD for the positive semidefinite cone based on randomized matrix sketches that is similar to [24] but does not require any prior bound on $\|X^\star\|$
- For positive semidefinite matrix completion, we examine the numerical performance of CD with the greedy heuristic as we vary the rank $r$ of its updates
- On the bottom right side of Figure 2, we show box plots of the difference between the final objective value obtained by CD with the greedy step and PBM (i.e., the vertical distance between the colored dot and the value obtained at 50,000 matrix multiplications)
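
The idea of combining a step toward the origin with a direction in the cone can be sketched on the positive semidefinite cone as follows. This is not the paper's algorithm: it assumes the toy objective $f(X) = \tfrac{1}{2}\|X - M\|_F^2$, takes the rank-1 direction from the eigenvector of the smallest eigenvalue of the gradient, and picks the two nonnegative weights by exact minimization. The function names `objective`, `best_conic_weights`, and `conic_step` are hypothetical.

```python
import numpy as np

def objective(X, M):
    """Toy objective f(X) = 0.5 * ||X - M||_F^2 (an assumption for this demo)."""
    return 0.5 * np.linalg.norm(X - M) ** 2

def best_conic_weights(A, B, M):
    """Minimize f(a*A + b*B) over a, b >= 0 by comparing the candidate optima."""
    aa, ab, bb = np.sum(A * A), np.sum(A * B), np.sum(B * B)
    am, bm = np.sum(A * M), np.sum(B * M)
    candidates = [(0.0, 0.0)]
    if aa > 0:
        candidates.append((max(am / aa, 0.0), 0.0))   # optimum on the edge b = 0
    if bb > 0:
        candidates.append((0.0, max(bm / bb, 0.0)))   # optimum on the edge a = 0
    det = aa * bb - ab * ab
    if det > 1e-12 * aa * bb:                         # A and B not collinear
        a = (am * bb - bm * ab) / det
        b = (bm * aa - am * ab) / det
        if a >= 0 and b >= 0:
            candidates.append((a, b))                 # interior stationary point
    return min(candidates, key=lambda w: objective(w[0] * A + w[1] * B, M))

def conic_step(X, M):
    """One update X <- a*X + b*v v^T with a, b >= 0, so X stays PSD."""
    grad = X - M                                      # gradient of the toy objective
    _, eigvecs = np.linalg.eigh(grad)
    v = eigvecs[:, 0]                                 # eigenvector of the smallest eigenvalue
    B = np.outer(v, v)                                # rank-1 direction in the PSD cone
    a, b = best_conic_weights(X, B, M)
    return a * X + b * B

rng = np.random.default_rng(0)
F = rng.standard_normal((20, 3))
M = F @ F.T                                           # rank-3 PSD target
X = np.zeros_like(M)
history = [objective(X, M)]
for _ in range(10):
    X = conic_step(X, M)
    history.append(objective(X, M))
print(history[:5])
```

Writing the update as `a * X + b * B` is the same as moving from `X` toward the origin by `(1 - a) * X` and then adding `b * B`. Because both weights are nonnegative, every iterate is a conic combination of positive semidefinite matrices and therefore remains feasible.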

Results

- Backtracking line search variant: the proof of Theorem 1 shows that three methods of choosing the step size at each iteration obtain the same O(1/k) rate of convergence.
- If $f$ has a Lipschitz continuous gradient with respect to $\|\cdot\|$ with parameter $L$, conic descent with $\theta_k$ determined from a backtracking line search (Algorithm 3) generates feasible points $x_k$ such that $f$ ...
- If $f$ has a Lipschitz continuous gradient with respect to $\|\cdot\|$ with parameter $L$, conic descent with an inexact rescaling method that satisfies (3) and with an inexact subproblem solution method that satisfies (4) generates feasible points $x_k$ such that $f$ ... $\|x^\star\| + L\|x^\star\|^2)$.
- The authors' proof can be found in Appendix D and is similar in spirit to that presented in [15] for the conditional gradient method with an objective which is nonconvex but has a Lipschitz continuous gradient, and the authors obtain the same $O(1/\sqrt{k})$ rate of convergence.
- In line 13, the authors find a low-rank update using a point found from running a descent method on the problem $\text{minimize}_{t \in \mathbb{R},\, U \in \mathbb{R}^{n \times r}} \; f(G(t^2 X_{k+1} + U U^T))$
- The bottom right image shows the reconstruction obtained from conic descent with a greedy step every 100 iterations.
- For r = 2 or r = 3, the authors see that CD without the greedy step obtains a smaller objective value than PBM for the same number of matrix multiplications.
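
For the low-rank greedy subproblem above, a descent method needs the derivatives with respect to $t$ and $U$. The following is a worked derivation under two assumptions that the summary does not state: $G$ is a linear map with adjoint $G^*$, and $f$ is differentiable. The symbols $Z$ and $W$ are introduced here only for illustration.

```latex
% Assumptions: G is linear with adjoint G^*, and f is differentiable.
\begin{aligned}
Z(t, U) &= t^2 X_{k+1} + U U^T, \qquad
W = G^*\!\big(\nabla f(G(Z(t, U)))\big), \\
\frac{\partial}{\partial t}\, f(G(Z)) &= 2t\, \langle W, X_{k+1} \rangle, \\
\nabla_U\, f(G(Z)) &= (W + W^T)\, U = 2 W U \quad \text{when } W \text{ is symmetric}.
\end{aligned}
```

Since $t^2 \ge 0$ and $U U^T \succeq 0$, the matrix $Z$ stays positive semidefinite whenever $X_{k+1}$ is, which is why the subproblem can be handled by an unconstrained descent method over $(t, U)$.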

Conclusion

- In all cases, the use of the greedy step in CD significantly improves on the performance of PBM because CD is guaranteed to converge to the optimal objective value.
- It appears that CD with greedy steps provides a way to fuse the two popular methods of dealing with memory-limited optimization over the positive semidefinite cone.
- On the bottom right side of Figure 2, the authors show box plots of the difference between the final objective value obtained by CD with the greedy step and PBM (i.e., the vertical distance between the colored dot and the value obtained at 50,000 matrix multiplications).

Funding

- Andrew Naber gratefully acknowledges support from the Stanford Graduate Fellowship and the G.I
- Oliver Hinder acknowledges support from the Dantzig-Lieberman Operations Research Fellowship

Study subjects and analysis

Note that the rescaling and subproblem must be solved with increasing accuracy. Theorem 3 guarantees the same O(1/k) convergence, and its proof can be found in Appendix C.

Reference

- Akshay Agrawal, Robin Verschueren, Steven Diamond, and Stephen Boyd. A rewriting system for convex optimization problems. Journal of Control and Decision, 5(1):42–60, 2018.
- Srinadh Bhojanapalli, Anastasios Kyrillidis, and Sujay Sanghavi. Dropping convexity for faster semi-definite optimization. In Conference on Learning Theory, pages 530–582, 2016.
- Sébastien Bubeck et al. Convex optimization: Algorithms and complexity. Foundations and Trends® in Machine Learning, 8(3-4):231–357, 2015.
- Samuel Burer and Renato DC Monteiro. A nonlinear programming algorithm for solving semidefinite programs via low-rank factorization. Mathematical Programming, 95(2):329–357, 2003.
- Samuel Burer and Renato DC Monteiro. Local minima and convergence in low-rank semidefinite programming. Mathematical Programming, 103(3):427–444, 2005.
- Emmanuel J Candès, Yonina C Eldar, Thomas Strohmer, and Vladislav Voroninski. Phase retrieval via matrix completion. SIAM Review, 57(2):225–251, 2015.
- Emmanuel J Candès and Benjamin Recht. Exact matrix completion via convex optimization. Foundations of Computational Mathematics, 9(6):717, 2009.
- Steven Diamond and Stephen Boyd. CVXPY: A Python-embedded modeling language for convex optimization. Journal of Machine Learning Research, 17(83):1–5, 2016.
- Rong Ge, Jason D Lee, and Tengyu Ma. Matrix completion has no spurious local minimum. In Advances in Neural Information Processing Systems, pages 2973–2981, 2016.
- Nathan Halko, Per-Gunnar Martinsson, and Joel A Tropp. Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions. SIAM Review, 53(2):217–288, 2011.
- Zhifeng Hao, Ganzhao Yuan, and Bernard Ghanem. Bilgo: Bilateral greedy optimization for large scale semidefinite programming. Neurocomputing, 127:247–257, 2014.
- Zaid Harchaoui, Anatoli Juditsky, and Arkadi Nemirovski. Conditional gradient algorithms for norm-regularized smooth convex optimization. Mathematical Programming, 152(1-2):75–112, 2015.
- Michel Journée, Francis Bach, P-A Absil, and Rodolphe Sepulchre. Low-rank optimization on the cone of positive semidefinite matrices. SIAM Journal on Optimization, 20(5):2327–2351, 2010.
- Alex Krizhevsky, Geoffrey Hinton, et al. Learning multiple layers of features from tiny images. 2009.
- Simon Lacoste-Julien. Convergence rate of Frank-Wolfe for non-convex objectives. arXiv preprint arXiv:1607.00345, 2016.
- Sören Laue. A hybrid algorithm for convex semidefinite optimization. In Proceedings of the 29th International Conference on Machine Learning, pages 1083–1090, 2012.
- Francesco Locatello, Michael Tschannen, Gunnar Rätsch, and Martin Jaggi. Greedy algorithms for cone constrained optimization with convergence guarantees. In Advances in Neural Information Processing Systems, pages 773–784, 2017.
- Brendan O’Donoghue, Eric Chu, Neal Parikh, and Stephen Boyd. Conic optimization via operator splitting and homogeneous self-dual embedding. Journal of Optimization Theory and Applications, 169(3):1042–1068, 2016.
- Dohyung Park, Anastasios Kyrillidis, Constantine Caramanis, and Sujay Sanghavi. Finding low-rank solutions via nonconvex matrix factorization, efficiently and provably. SIAM Journal on Imaging Sciences, 11(4):2165–2204, 2018.
- Dohyung Park, Anastasios Kyrillidis, Constantine Caramanis, and Sujay Sanghavi. Non-square matrix sensing without spurious local minima via the Burer-Monteiro approach. In Artificial Intelligence and Statistics, pages 65–74, 2017.
- Joel A Tropp, Alp Yurtsever, Madeleine Udell, and Volkan Cevher. Randomized single-view algorithms for low-rank matrix approximation. 2017.
- Mehrdad Yaghoobi, Di Wu, and Mike E Davies. Fast non-negative orthogonal matching pursuit. IEEE Signal Processing Letters, 22(9):1229–1233, 2015.
- Alp Yurtsever, Joel A Tropp, Olivier Fercoq, Madeleine Udell, and Volkan Cevher. Scalable semidefinite programming. arXiv preprint arXiv:1912.02949, 2019.
- Alp Yurtsever, Madeleine Udell, Joel Tropp, and Volkan Cevher. Sketchy decisions: Convex low-rank matrix optimization with optimal storage. In Artificial Intelligence and Statistics, pages 1188–1196, 2017.
