The Step Decay Schedule: A Near Optimal, Geometrically Decaying Learning Rate Procedure

arXiv preprint, 2019.


Abstract:

There is a stark disparity between the step size schedules used in practical large-scale machine learning and those that are considered optimal by the theory of stochastic approximation. In theory, most results rely on polynomially decaying learning rate schedules, while, in practice, the Step Decay schedule is among the most popular schedules.
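The contrast the abstract draws can be made concrete with a minimal sketch of the two schedule families. The constants below (initial rate, decay factor, stage length) are illustrative assumptions, not the parameters analyzed in the paper.

```python
# Minimal sketch of the two learning rate schedule families.
# All constants are illustrative assumptions, not the paper's settings.

def poly_decay_lr(eta0: float, t: int, power: float = 1.0) -> float:
    """Polynomial decay: eta_t = eta0 / (1 + t)^power, shrinking every step."""
    return eta0 / (1 + t) ** power


def step_decay_lr(eta0: float, t: int, steps_per_stage: int = 1000,
                  factor: float = 0.5) -> float:
    """Step Decay: hold the rate constant within a stage, then cut it
    by a geometric factor (here, halve it) at each stage boundary."""
    stage = t // steps_per_stage
    return eta0 * factor ** stage


if __name__ == "__main__":
    # Compare the two schedules at a few iteration counts.
    for t in (0, 999, 1000, 2500, 5000):
        print(t, poly_decay_lr(0.1, t), step_decay_lr(0.1, t))
```

Polynomial decay shrinks the rate a little at every iteration, whereas Step Decay keeps it piecewise constant and drops it geometrically at fixed intervals, which is the practical schedule the paper studies.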
