Locally Optimal Descent for Dynamic Stepsize Scheduling
CoRR (2023)
Abstract
We introduce a novel dynamic learning-rate scheduling scheme grounded in
theory with the goal of simplifying the manual and time-consuming tuning of
schedules in practice. Our approach is based on estimating the locally-optimal
stepsize, guaranteeing maximal descent in the direction of the stochastic
gradient of the current step. We first establish theoretical convergence bounds
for our method within the context of smooth non-convex stochastic optimization,
matching state-of-the-art bounds while only assuming knowledge of the
smoothness parameter. We then present a practical implementation of our
algorithm and conduct systematic experiments across diverse datasets and
optimization algorithms, comparing our scheme with existing state-of-the-art
learning-rate schedulers. Our findings indicate that, compared to existing
approaches, our method requires minimal tuning: it removes the need for
auxiliary manual schedules and warm-up phases while achieving comparable
performance with drastically reduced parameter tuning.
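
The abstract does not spell out the estimator, but for an L-smooth objective the locally-optimal stepsize along a direction g follows from the standard descent lemma, f(x - eta*g) <= f(x) - eta*<grad f(x), g> + (L*eta^2/2)*||g||^2, which is minimized at eta* = <grad f(x), g> / (L*||g||^2). Below is a minimal sketch of this idea under the assumption that the unknown inner product is estimated with a second, independent stochastic gradient; the toy least-squares problem, batch size, and the non-negativity clamp on the estimate are illustrative choices, not the paper's actual scheme.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy problem: stochastic least squares, f(x) = E[(a^T x - b)^2] / 2.
# The objective is L-smooth with L = lambda_max of the empirical covariance.
A = rng.normal(size=(1000, 20))
b = A @ rng.normal(size=20) + 0.1 * rng.normal(size=1000)
L = np.linalg.eigvalsh(A.T @ A / len(A)).max()  # smoothness constant

def stochastic_grad(x, batch=32):
    """Gradient of f on a random mini-batch (unbiased estimate of grad f(x))."""
    idx = rng.integers(0, len(A), size=batch)
    Ai, bi = A[idx], b[idx]
    return Ai.T @ (Ai @ x - bi) / batch

x = np.zeros(20)
for step in range(500):
    g = stochastic_grad(x)        # descent direction for the current step
    g_probe = stochastic_grad(x)  # independent estimate of grad f(x)
    # Descent lemma: f(x - eta*g) <= f(x) - eta*<grad f, g> + (L*eta^2/2)*||g||^2,
    # minimized at eta* = <grad f(x), g> / (L * ||g||^2). Since grad f(x) is
    # unknown, <g_probe, g> serves as an unbiased estimate of <grad f(x), g>.
    # Clamping at zero (an illustrative choice) avoids ascent steps when the
    # noisy estimate turns negative.
    eta = max(np.dot(g_probe, g), 0.0) / (L * np.dot(g, g) + 1e-12)
    x -= eta * g
```

Note that the only problem-specific quantity the sketch consumes is the smoothness constant L, consistent with the abstract's claim that the method assumes knowledge of only the smoothness parameter.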