h-detach: Modifying the LSTM Gradient Towards Better Optimization

    Bhargav Kanuparthi
    Giancarlo Kerg

    arXiv preprint arXiv:1810.03023, 2018.


    Abstract:

    Recurrent neural networks are known for their notorious exploding and vanishing gradient problem (EVGP). This problem becomes more evident in tasks where the information needed to solve them correctly exists over long time scales, because EVGP prevents important gradient components from being back-propagated adequately over a large number of time steps.
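    The title names the proposed fix: h-detach stochastically blocks gradients through the hidden-state (h) recurrence of the LSTM so that gradient components flowing through the cell-state path are not suppressed. Below is a minimal, hedged sketch of that idea in PyTorch; the class name HDetachLSTMCell, the parameter p_detach, and the per-timestep coin flip are illustrative assumptions, not the authors' reference implementation.

        import torch
        import torch.nn as nn


        class HDetachLSTMCell(nn.Module):
            """Standard LSTM cell where, with probability p_detach, the gradient
            through the previous hidden state h_{t-1} is blocked at a time step.
            The cell-state path c_t is never detached, so its gradient
            contribution is preserved (sketch of the h-detach idea)."""

            def __init__(self, input_size, hidden_size, p_detach=0.25):
                super().__init__()
                self.cell = nn.LSTMCell(input_size, hidden_size)
                self.p_detach = p_detach

            def forward(self, x, state):
                h, c = state
                # Stochastic gradient blocking on the h-to-h recurrence
                # (applied only during training); forward values are unchanged.
                if self.training and torch.rand(1).item() < self.p_detach:
                    h = h.detach()
                return self.cell(x, (h, c))


        # Usage sketch: unroll over a sequence of shape (T, B, input_size).
        cell = HDetachLSTMCell(input_size=8, hidden_size=16, p_detach=0.25)
        x = torch.randn(50, 4, 8)
        h = torch.zeros(4, 16)
        c = torch.zeros(4, 16)
        for t in range(x.size(0)):
            h, c = cell(x[t], (h, c))
        loss = h.sum()
        loss.backward()

    Because detaching only alters the backward pass, the forward computation is identical to a plain LSTM; the coin flip here is made once per time step for the whole minibatch, which is one reasonable reading of the algorithm rather than a confirmed detail.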
