Stochastic Gradient Descent Escapes Saddle Points Efficiently.

arXiv: Learning (2019)

Citations: 47 | Views: 137
Abstract
This paper considers the perturbed stochastic gradient descent algorithm and shows that it finds $\epsilon$-second-order stationary points ($\left\|\nabla f(x)\right\| \leq \epsilon$ and $\nabla^2 f(x) \succeq -\sqrt{\epsilon}\,\mathbf{I}$) in $\tilde{O}(d/\epsilon^4)$ iterations, giving the first result with linear dependence on dimension for this setting. For the special case where stochastic gradients are Lipschitz, the dependence on dimension reduces to polylogarithmic. In addition to these new results, the paper presents a simplified proof strategy that gives a shorter and more elegant proof of previously known results (Jin et al. 2017) on the perturbed gradient descent algorithm.
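The idea behind the algorithm is easy to sketch: run ordinary SGD, and whenever the stochastic gradient looks small (a candidate saddle point), add a small isotropic perturbation so the iterate escapes strict saddles along a negative-curvature direction. The Python sketch below illustrates this under stated assumptions; the function names, step size `eta`, perturbation radius `r`, gradient threshold, and perturbation interval are illustrative choices, not the theoretically tuned values from the paper.

```python
import numpy as np

def perturbed_sgd(stoch_grad, x0, eta=0.01, r=0.1, g_thresh=0.05,
                  perturb_interval=100, n_iters=10_000, seed=0):
    """Minimal sketch of perturbed SGD (illustrative, not the paper's tuning).

    stoch_grad(x) should return an unbiased stochastic gradient of f at x.
    When the stochastic gradient is small, add noise sampled uniformly from
    a ball of radius r to escape strict saddle points.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    last_perturb = -perturb_interval  # allow a perturbation immediately
    for t in range(n_iters):
        g = stoch_grad(x)
        # Perturb only near apparent stationary points, and not too often.
        if np.linalg.norm(g) <= g_thresh and t - last_perturb >= perturb_interval:
            noise = rng.normal(size=x.shape)
            # Scale to a uniform sample from the ball of radius r.
            noise *= r * rng.random() ** (1.0 / x.size) / np.linalg.norm(noise)
            x = x + noise
            last_perturb = t
        x = x - eta * g  # standard SGD step
    return x

# Example: escape the strict saddle of f(x, y) = x^2 - y^2 at the origin.
grad_f = lambda x: np.array([2.0 * x[0], -2.0 * x[1]])
noisy_grad = lambda x: grad_f(x) + 0.01 * np.random.default_rng().normal(size=2)
x_final = perturbed_sgd(noisy_grad, x0=[1e-6, 1e-6])
```

In this toy example the origin is a strict saddle ($\nabla^2 f$ has eigenvalue $-2$ along $y$), so the perturbation plus gradient noise pushes the iterate away from it rather than letting SGD stall there.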