Recursive decomposition for nonconvex optimization
IJCAI'15: Proceedings of the 24th International Conference on Artificial Intelligence (2015), pp. 253–259
Continuous optimization is an important problem in many areas of AI, including vision, robotics, probabilistic inference, and machine learning. Unfortunately, most real-world optimization problems are nonconvex, causing standard convex techniques to find only local optima, even with extensions like random restarts and simulated annealing. […]
- AI systems that interact with the real world often have to solve continuous optimization problems.
- Most continuous optimization problems in AI and related fields are nonconvex, and often have an exponential number of local optima.
- The authors propose a novel nonconvex optimization algorithm, Recursive Decomposition into locally Independent Subspaces (RDIS), which uses recursive decomposition to handle the hard combinatorial core of the problem, leaving a set of simpler subproblems that can be solved using standard continuous optimizers.
- We show that RDIS achieves an exponential speedup over traditional techniques for nonconvex optimization such as gradient descent with restarts and grid search.
- We evaluated RDIS on three difficult nonconvex optimization problems with hundreds to thousands of variables: structure from motion, a high-dimensional sinusoid, and protein folding.
- We ran RDIS with a fixed number of restarts at each level, so it is not guaranteed to find the global minimum.
- In structure from motion, cameras interact explicitly with points, creating a bipartite graph structure that RDIS can decompose; however, local structure does not exist because the bounds on each term are too wide and tend to include ∞.
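The recursive decomposition idea in the bullets above can be sketched in code. The following is a hypothetical toy version, not the authors' implementation: the objective is given as a sum of terms with known variable scopes, a block of variables is selected and assigned candidate values (standing in for the continuous optimizer's choices), and the remaining variables are split into independent connected components that are optimized separately by recursion. All names (`rdis`, `connected_components`) are illustrative.

```python
# Hypothetical sketch of recursive decomposition into locally independent
# subspaces. A function is a list of (callable, frozenset-of-variable-ids)
# terms; each callable takes a dict mapping variable id -> value.
import itertools

def connected_components(terms, free_vars):
    """Group free variables into components that share no term (union-find)."""
    parent = {v: v for v in free_vars}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v
    for _, scope in terms:
        sc = [v for v in scope if v in parent]
        for a, b in zip(sc, sc[1:]):
            parent[find(a)] = find(b)
    comps = {}
    for v in free_vars:
        comps.setdefault(find(v), set()).add(v)
    return list(comps.values())

def rdis(terms, free_vars, assignment, candidate_values, block_size=1):
    """Return (minimum value found, minimizing assignment)."""
    if not free_vars:
        return sum(f(assignment) for f, _ in terms), dict(assignment)
    # Variable selection: here simply the first `block_size` variables.
    block = sorted(free_vars)[:block_size]
    rest = set(free_vars) - set(block)
    best = (float("inf"), None)
    # Value selection: brute force over a candidate grid (a stand-in for
    # running a continuous optimizer on the chosen block).
    for vals in itertools.product(candidate_values, repeat=len(block)):
        assignment.update(zip(block, vals))
        total, merged = 0.0, dict(assignment)
        # With the block fixed, the remaining variables may decompose into
        # independent components; optimize each component separately.
        for comp in connected_components(
                [(f, s) for f, s in terms if s & rest], rest):
            sub = [(f, s) for f, s in terms if s & comp]
            val, sub_asgn = rdis(sub, comp, dict(assignment),
                                 candidate_values, block_size)
            total += val
            merged.update(sub_asgn)
        # Terms fully determined by the current assignment.
        total += sum(f(merged) for f, s in terms if not (s & rest))
        if total < best[0]:
            best = (total, merged)
    for v in block:
        del assignment[v]
    return best
```

For example, with f(x0, x1, x2) = (x0 - 1)² + (x0 - x1)² + (x2 + 2)², fixing x0 splits the remaining problem into independent subproblems over {x1} and {x2}, so each is searched separately instead of jointly.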
- This paper proposed a new approach to solving hard nonconvex optimization problems based on recursive decomposition
- Structure from motion is the problem of reconstructing the geometry of a 3-D scene from a set of 2-D images of that scene
- It consists of first determining an initial estimate of the parameters and then performing nonlinear optimization to minimize the squared error between a set of 2-D image points and the projection of the 3-D points onto camera models [Triggs et al., 2000].
- The dataset used is the 49-camera, 7776-point data file from the Ladybug dataset [Agarwal et al., 2010].
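The squared-error objective described above can be sketched as follows. This is a toy pinhole-camera model for illustration only; the Ladybug experiments use a richer camera parameterization (rotation, translation, focal length, and distortion per camera), and `project` and `reprojection_error` are hypothetical names.

```python
import numpy as np

def project(point3d, rotation, translation, focal):
    """Project a 3-D point through a simple pinhole camera (toy model)."""
    p = rotation @ point3d + translation   # world -> camera coordinates
    return focal * p[:2] / p[2]            # perspective division

def reprojection_error(points3d, cameras, observations):
    """Sum of squared distances between observed 2-D image points and the
    projections of the current structure-and-motion estimate."""
    err = 0.0
    for cam_idx, pt_idx, observed_2d in observations:
        R, t, f = cameras[cam_idx]
        residual = project(points3d[pt_idx], R, t, f) - observed_2d
        err += residual @ residual
    return err
```

Bundle adjustment minimizes this error jointly over all camera parameters and 3-D points; each observation term couples exactly one camera with one point, which is the bipartite structure mentioned above.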
- RDIS decomposes the function into approximately locally independent sub-functions and optimizes these separately by recursing on them.
- This results in an exponential reduction in the time required to find the global optimum.
- Directions for future research include applying RDIS to a wide variety of nonconvex optimization problems, further analyzing its theoretical properties, developing new variable and value selection methods, extending RDIS to handle hard constraints, incorporating discrete variables, and using similar ideas for high-dimensional integration
- This research was partly funded by ARO grant W911NF-08-1-0242, ONR grants N00014-13-1-0720 and N00014-12-1-0312, and AFRL contract FA8750-13-2-0019
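The exponential reduction claimed above is easiest to see on a fully separable function, where the decomposition is exact. The sketch below (illustrative, not from the paper) counts function evaluations for joint grid search versus coordinate-wise search: c candidate values over n variables cost c**n evaluations jointly, but only n*c after decomposition. RDIS exploits the same effect whenever fixing a small block of variables makes the rest decompose.

```python
# Joint grid search vs. decomposed search on f(x) = sum_i g(x_i).
import itertools

def g(x):
    return (x - 3) ** 2

def joint_grid_search(n, candidates):
    """Exhaustive search over the full joint grid: c**n evaluations."""
    evals, best = 0, float("inf")
    for xs in itertools.product(candidates, repeat=n):
        evals += 1
        best = min(best, sum(g(x) for x in xs))
    return best, evals

def decomposed_search(n, candidates):
    """One independent search per variable: only n*c evaluations."""
    evals, total = 0, 0.0
    for _ in range(n):
        best = float("inf")
        for x in candidates:
            evals += 1
            best = min(best, g(x))
        total += best
    return total, evals
```

With n = 6 variables and 8 candidate values, both searches find the minimum 0, but the joint search spends 8**6 = 262144 evaluations against 48 for the decomposed one.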
- [Agarwal et al., 2010] Sameer Agarwal, Noah Snavely, Steven M. Seitz, and Richard Szeliski. Bundle adjustment in the large. In Kostas Daniilidis, Petros Maragos, and Nikos Paragios, editors, Computer Vision ECCV 2010, volume 6312 of Lecture Notes in Computer Science, pages 29–42. Springer Berlin Heidelberg, 2010.
- [Anfinsen, 1973] Christian B. Anfinsen. Principles that govern the folding of protein chains. Science, 181(4096):223–230, 1973.
- [Bacchus et al., 2009] Fahiem Bacchus, Shannon Dalmao, and Toniann Pitassi. Solving #SAT and Bayesian Inference with Backtracking Search. Journal of Artificial Intelligence Research, 34:391–442, 2009.
- [Baker, 2000] David Baker. A surprising simplicity to protein folding. Nature, 405:39–42, 2000.
- [Bayardo Jr. and Pehoushek, 2000] Roberto J. Bayardo Jr. and Joseph Daniel Pehoushek. Counting models using connected components. In Proceedings of the Seventeenth National Conference on Artificial Intelligence, pages 157–162, 2000.
- [Berman et al., 2000] Helen M. Berman, John Westbrook, Zukang Feng, Gary Gilliland, T. N. Bhat, Helge Weissig, Ilya N. Shindyalov, and Philip E. Bourne. The protein data bank. Nucleic Acids Research, 28(1):235–242, 2000.
- [Bonettini, 2011] Silvia Bonettini. Inexact block coordinate descent methods with application to non-negative matrix factorization. IMA Journal of Numerical Analysis, 31(4):1431–1452, 2011.
- [Catalyurek and Aykanat, 2011] Umit Catalyurek and Cevdet Aykanat. PaToH (partitioning tool for hypergraphs). In David Padua, editor, Encyclopedia of Parallel Computing, pages 1479–1487. Springer US, 2011.
- [Darwiche and Hopkins, 2001] Adnan Darwiche and Mark Hopkins. Using recursive decomposition to construct elimination orders, jointrees, and dtrees. In Symbolic and Quantitative Approaches to Reasoning with Uncertainty, pages 180–191.
- [Darwiche, 2001] Adnan Darwiche. Recursive conditioning. Artificial Intelligence, 126(1-2):5–41, 2001.
- [Davis et al., 1962] Martin Davis, George Logemann, and Donald Loveland. A machine program for theorem-proving. Communications of the ACM, 5(7):394–397, 1962.
- [De Moura and Bjørner, 2011] Leonardo De Moura and Nikolaj Bjørner. Satisfiability modulo theories: Introduction and applications. Communications of the ACM, 54(9):69–77, September 2011.
- [Griewank and Toint, 1981] Andreas Griewank and Philippe L. Toint. On the unconstrained optimization of partially separable functions. Nonlinear Optimization, 1982:247–265, 1981.
- [Grippo and Sciandrone, 1999] Luigi Grippo and Marco Sciandrone. Globally convergent block-coordinate techniques for unconstrained optimization. Optimization Methods and Software, 10(4):587–637, 1999.
- [Hansen and Walster, 2003] Eldon Hansen and G. William Walster. Global optimization using interval analysis: revised and expanded, volume 264. CRC Press, 2003.
- [Holm et al., 2001] Jacob Holm, Kristian De Lichtenberg, and Mikkel Thorup. Poly-logarithmic deterministic fully-dynamic algorithms for connectivity, minimum spanning tree, 2-edge, and biconnectivity. Journal of the ACM (JACM), 48(4):723–760, 2001.
- [Lourakis, 2004] M.I.A. Lourakis. levmar: Levenberg-Marquardt nonlinear least squares algorithms in C/C++. http://www.ics.forth.gr/~lourakis/levmar/, 2004.
- [Moskewicz et al., 2001] Matthew W. Moskewicz, Conor F. Madigan, Ying Zhao, Lintao Zhang, and Sharad Malik. Chaff: Engineering an efficient SAT solver. In Proceedings of the 38th Annual Design Automation Conference, pages 530–535. ACM, 2001.
- [Neumaier et al., 2005] Arnold Neumaier, Oleg Shcherbina, Waltraud Huyer, and Tamás Vinkó. A comparison of complete global optimization solvers. Mathematical Programming, 103(2):335–356, 2005.
- [Nocedal and Wright, 2006] Jorge Nocedal and Stephen J. Wright. Numerical Optimization. Springer, 2006.
- [Sang et al., 2004] Tian Sang, Fahiem Bacchus, Paul Beame, Henry A. Kautz, and Toniann Pitassi. Combining component caching and clause learning for effective model counting. Seventh International Conference on Theory and Applications of Satisfiability Testing, 2004.
- [Sang et al., 2005] Tian Sang, Paul Beame, and Henry Kautz. Performing Bayesian inference by weighted model counting. In Proceedings of the Twentieth National Conference on Artificial Intelligence (AAAI-05), volume 1, pages 475–482, 2005.
- [Schoen, 1991] Fabio Schoen. Stochastic techniques for global optimization: A survey of recent advances. Journal of Global Optimization, 1(3):207–228, 1991.
- [Triggs et al., 2000] Bill Triggs, Philip F. McLauchlan, Richard I. Hartley, and Andrew W. Fitzgibbon. Bundle adjustment – a modern synthesis. In Vision Algorithms: Theory and Practice, pages 298–372.
- [Tseng and Yun, 2009] Paul Tseng and Sangwoon Yun. A coordinate gradient descent method for nonsmooth separable minimization. Mathematical Programming, 117(1-2):387–423, 2009.
- [Yanover et al., 2006] Chen Yanover, Talya Meltzer, and Yair Weiss. Linear programming relaxations and belief propagation – an empirical study. The Journal of Machine Learning Research, 7:1887–1907, 2006.