Stochastically Controlled Compositional Gradient for Composition Problems

Liu Liu, Ji Liu, Cho-Jui Hsieh, Dacheng Tao

IEEE Transactions on Neural Networks and Learning Systems (2023)

We consider composition problems of the form $(1/n)\sum _{i=1}^{n} F_{i}\big ((1/m)\sum _{j=1}^{m} G_{j}(x)\big)$, which are important for machine learning. Although gradient descent and stochastic gradient descent are straightforward solutions, the computation of the inner function $G(x) = (1/m)\sum _{j=1}^{m} G_{j}(x)$ required in every iteration is expensive, especially for large $m$. In this article, we devise a stochastically controlled compositional gradient algorithm. Specifically, we introduce two variants of the stochastically controlled technique to estimate the inner function $G(x)$ and the gradient of the objective function, respectively, which greatly reduces the computational cost. However, the need for two stochastic subsets ${\mathcal D}_{1}$ and ${\mathcal D}_{2}$ poses a direct barrier to guaranteeing convergence, and in particular to its theoretical proof. To this end, we present a general convergence analysis under the batch sizes $|{\mathcal D}_{1}|=\min \{1/\epsilon, m\}$ and $|{\mathcal D}_{2}|=\min \{1/\epsilon, n\}$, through which the proposed method significantly improves on existing composition algorithms at low target accuracy (i.e., $1/\epsilon \ll m$ or $n$) in both strongly convex and nonconvex settings. Comprehensive experiments demonstrate the superiority of the proposed method over existing methods.
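To illustrate the two-subset structure described above, here is a minimal NumPy sketch of a compositional gradient step: one mini-batch ${\mathcal D}_1$ estimates the inner map $G(x)$ and its Jacobian, and a second mini-batch ${\mathcal D}_2$ estimates the outer gradient at that estimate. The toy quadratic $F_i$ and near-identity linear $G_j$ are illustrative stand-ins chosen for readability, not the paper's actual algorithm or benchmark problems.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy instance of f(x) = (1/n) sum_i F_i( (1/m) sum_j G_j(x) ):
#   G_j(x) = A_j @ x  with A_j = I + noise  (inner functions)
#   F_i(y) = 0.5 * ||y - b_i||^2            (outer functions)
d, m, n = 5, 1000, 1000
A = np.eye(d) + 0.1 * rng.standard_normal((m, d, d))
b = 3.0 + rng.standard_normal((n, d))

def grad_estimate(x, size1, size2):
    """One stochastically controlled compositional gradient estimate:
    D1 estimates G(x) and its Jacobian, D2 estimates the outer gradient."""
    D1 = rng.choice(m, size=size1, replace=False)
    D2 = rng.choice(n, size=size2, replace=False)
    g_hat = np.mean(A[D1] @ x, axis=0)        # estimate of G(x)
    J_hat = np.mean(A[D1], axis=0)            # estimate of the Jacobian of G
    outer = np.mean(g_hat - b[D2], axis=0)    # grad F_i(y) = y - b_i
    return J_hat.T @ outer                    # chain rule: J(x)^T grad F

# Plain SGD-style iterations driven by the two-batch estimator.
x = np.zeros(d)
for _ in range(200):
    x -= 0.1 * grad_estimate(x, size1=64, size2=64)
```

Per the abstract's analysis, the batch sizes would be chosen as $|{\mathcal D}_1| = \min\{1/\epsilon, m\}$ and $|{\mathcal D}_2| = \min\{1/\epsilon, n\}$; the fixed `size1 = size2 = 64` here simply corresponds to a low target accuracy with $1/\epsilon \ll m, n$.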
Composition problem, stochastic optimization, stochastically controlled gradient, variance reduction