Distributed Asynchronous Optimization with Unbounded Delays: How Slow Can You Go?

    Cited by: 14

    ICML, pp. 5965-5974, 2018.

    Abstract:

    One of the most widely used training methods for large-scale machine learning problems is distributed asynchronous stochastic gradient descent (DASGD). However, a key issue in its implementation is that of delays: when a worker node asynchronously contributes a gradient update to the master, the global model parameter may have changed, re...
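
    The delay issue described in the abstract can be illustrated with a small simulation. The sketch below (not the paper's implementation; the objective, step-size schedule, and delay distribution are illustrative assumptions) shows a master applying stochastic gradients that were computed at stale copies of the parameters, which is the core difficulty DASGD must cope with.

```python
# Minimal sketch of delayed-gradient asynchronous SGD (DASGD-style updates).
# Each simulated worker computes a stochastic gradient at the *current* model,
# but the update only reaches the master after a random delay, by which time
# the global parameters have already moved. The quadratic objective, the
# step-size schedule, and the bounded delay used here are assumptions made
# for illustration; the paper itself studies unbounded delays.
import random
import numpy as np

rng = np.random.default_rng(0)
d = 5                                   # parameter dimension
x_star = rng.normal(size=d)             # minimizer of the toy objective

def stochastic_grad(x):
    """Noisy gradient of f(x) = 0.5 * ||x - x_star||^2."""
    return (x - x_star) + 0.1 * rng.normal(size=d)

x = np.zeros(d)                         # global model held by the master
pending = []                            # (arrival_step, stale_gradient) pairs
T, max_delay = 2000, 50                 # bounded delay only for the simulation

for t in range(T):
    # A worker reads the current parameters now, but its update arrives later.
    delay = random.randint(0, max_delay)
    pending.append((t + delay, stochastic_grad(x)))

    # The master applies whichever (possibly stale) gradients arrive at step t.
    step = 1.0 / (t + 1) ** 0.75        # decreasing step-size schedule
    arrived = [g for (s, g) in pending if s <= t]
    pending = [(s, g) for (s, g) in pending if s > t]
    for g in arrived:
        x = x - step * g

print("distance to optimum:", np.linalg.norm(x - x_star))
```

    Running the sketch shows the iterates still drifting toward the optimum despite stale updates, which is the kind of convergence behavior under delays that the paper analyzes.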