Asynchrony begets Momentum, with an Application to Deep Learning

    Allerton, pp. 997-1004, 2016.

    Abstract:

    Asynchronous methods are widely used in deep learning, but have limited theoretical justification when applied to non-convex problems. We show that running stochastic gradient descent (SGD) in an asynchronous manner can be viewed as adding a momentum-like term to the SGD iteration. Our result does not assume convexity of the objective function…
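
    The abstract's central claim can be illustrated with a toy numerical sketch (not the authors' code; the one-dimensional quadratic objective, the worker count M, the memoryless staleness model, and the comparison momentum value 1 - 1/M are all assumptions made here for illustration). Asynchronous workers apply gradients computed on stale reads of the parameter, and the resulting trajectory qualitatively resembles synchronous SGD with a heavy-ball momentum term whose strength grows with the degree of asynchrony:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Toy objective f(w) = 0.5 * w^2 in one dimension, so grad f(w) = w.
    def grad(w):
        return w

    def momentum_sgd(w0, lr, mu, steps):
        """Heavy-ball SGD: w_{t+1} = w_t - lr * grad(w_t) + mu * (w_t - w_{t-1})."""
        w_prev, w = w0, w0
        for _ in range(steps):
            w_prev, w = w, w - lr * grad(w) + mu * (w - w_prev)
        return w

    def async_sgd(w0, lr, n_workers, steps):
        """Each worker holds a possibly stale read of the parameter; at every
        step a uniformly random worker's gradient arrives and is applied, and
        that worker then re-reads the current value (a simple memoryless
        staleness model, assumed here for illustration)."""
        reads = [w0] * n_workers
        w = w0
        for _ in range(steps):
            k = rng.integers(n_workers)   # whose (stale) gradient arrives now
            w = w - lr * grad(reads[k])   # apply the stale gradient
            reads[k] = w                  # worker k re-reads the fresh value
        return w

    M, lr, steps = 8, 0.02, 2000
    print("async SGD, M = 8 workers:  ", async_sgd(1.0, lr, M, steps))
    # Comparison point (an assumption of this sketch, not a statement of the
    # paper's exact constant): heavy-ball momentum mu = 1 - 1/M, capturing the
    # idea that the momentum-like effect grows with the number of workers.
    print("momentum SGD, mu = 1 - 1/M:", momentum_sgd(1.0, lr, 1 - 1 / M, steps))
    ```

    The memoryless staleness model is a deliberate simplification: it keeps the simulation self-contained while preserving the qualitative effect the paper analyzes, namely that stale gradients act like a momentum term rather than pure noise.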
