BiAdam: Fast Adaptive Bilevel Optimization Methods

ICLR 2023

Abstract
Bilevel optimization has recently attracted increasing interest in machine learning due to its many applications, such as hyper-parameter optimization and meta learning. Although many bilevel optimization methods have recently been proposed, they do not use adaptive learning rates. It is well known that adaptive learning rates can accelerate many optimization algorithms, including (stochastic) gradient-based algorithms. To fill this gap, in this paper we propose a novel fast adaptive bilevel framework for solving stochastic bilevel optimization problems in which the outer problem is possibly nonconvex and the inner problem is strongly convex. Our framework uses unified adaptive matrices that cover many types of adaptive learning rates, and it can flexibly incorporate momentum and variance-reduction techniques. In particular, we provide a useful convergence-analysis framework for bilevel optimization. Specifically, we propose a fast single-loop adaptive bilevel optimization (BiAdam) algorithm based on the basic momentum technique, which achieves a sample complexity of $\tilde{O}(\epsilon^{-4})$ for finding an $\epsilon$-stationary solution (i.e., $\mathbb{E}\|\nabla F(x)\| \leq \epsilon$ or its equivalent variants). Meanwhile, we propose an accelerated version of the BiAdam algorithm (VR-BiAdam) that uses a variance-reduction technique and reaches the best-known sample complexity of $\tilde{O}(\epsilon^{-3})$ without relying on large batch sizes. To the best of our knowledge, this is the first study of bilevel optimization methods with adaptive learning rates. Experimental results on data hyper-cleaning and hyper-representation learning tasks demonstrate the efficiency of our algorithms.
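To illustrate the single-loop structure described in the abstract, below is a minimal Python/NumPy sketch of an adaptive bilevel update with momentum on a hypothetical quadratic toy problem. The objectives, step sizes, noise model, and variable names are illustrative assumptions, not the paper's exact BiAdam algorithm or its experimental setup.

import numpy as np

# Hypothetical toy bilevel instance (not from the paper):
#   inner:  y*(x) = argmin_y g(x, y) = 0.5 * ||y - A x||^2   (strongly convex in y)
#   outer:  min_x  F(x) = f(x, y*(x)) = 0.5 * ||y*(x) - b||^2
rng = np.random.default_rng(0)
d_x, d_y = 5, 5
A = rng.normal(size=(d_y, d_x))
b = rng.normal(size=d_y)

def grad_y_g(x, y, noise=0.01):
    # Stochastic gradient of the inner objective w.r.t. y.
    return (y - A @ x) + noise * rng.normal(size=d_y)

def hypergrad_estimate(x, y, noise=0.01):
    # Hypergradient  grad_x f - grad_xy g [grad_yy g]^{-1} grad_y f;
    # for this toy problem it reduces to A^T (y - b). The added noise
    # stands in for the stochastic estimator used in practice.
    return A.T @ (y - b) + noise * rng.normal(size=d_x)

# Single-loop adaptive update with momentum (BiAdam-flavoured sketch;
# the diagonal adaptive matrix and all hyper-parameters are illustrative).
x, y = np.zeros(d_x), np.zeros(d_y)
m, v = np.zeros(d_x), np.zeros(d_x)
eta, lam, beta1, beta2, eps = 0.05, 0.5, 0.9, 0.999, 1e-8

for t in range(2000):
    y = y - lam * grad_y_g(x, y)            # one stochastic step on the inner problem
    w = hypergrad_estimate(x, y)            # stochastic hypergradient estimate
    m = beta1 * m + (1 - beta1) * w         # momentum (first moment)
    v = beta2 * v + (1 - beta2) * w**2      # second moment -> diagonal adaptive matrix
    x = x - eta * m / (np.sqrt(v) + eps)    # adaptive outer update

print("||grad F(x)|| =", np.linalg.norm(A.T @ (A @ x - b)))

The VR-BiAdam variant described in the abstract would replace the plain stochastic hypergradient estimate above with a variance-reduced estimator, leaving the adaptive outer update unchanged.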
Keywords
Bilevel Optimization, Momentum, Adaptive Learning Rate, Variance Reduction, Hyper-Parameter Learning