Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer

    ICLR 2017. CoRR abs/1701.06538.


    Abstract:

    The capacity of a neural network to absorb information is limited by its number of parameters. Conditional computation, where parts of the network are active on a per-example basis, has been proposed in theory as a way of dramatically increasing model capacity without a proportional increase in computation. In practice, however, there a...
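
    To make the idea of per-example conditional computation concrete, here is a minimal NumPy sketch of a sparsely gated mixture-of-experts layer. It is an illustration under stated assumptions, not the paper's exact gating function: the names `top_k_gate`, `sparse_moe`, `gate_w`, and `expert_ws` are hypothetical, each expert is assumed to be a single linear map, and the gate is a plain top-k softmax with no additional terms.

```python
import numpy as np

def top_k_gate(logits, k):
    """Softmax over the k largest logits in each row; all other entries are 0."""
    logits = np.asarray(logits, dtype=float)
    # Column indices of the k largest logits per row (order among them is irrelevant).
    top_idx = np.argpartition(logits, -k, axis=-1)[:, -k:]
    masked = np.full_like(logits, -np.inf)
    top_vals = np.take_along_axis(logits, top_idx, axis=-1)
    np.put_along_axis(masked, top_idx, top_vals, axis=-1)
    # exp(-inf) == 0, so unselected experts receive exactly zero weight.
    masked -= masked.max(axis=-1, keepdims=True)
    weights = np.exp(masked)
    return weights / weights.sum(axis=-1, keepdims=True)

def sparse_moe(x, gate_w, expert_ws, k=2):
    """Hypothetical MoE layer: each example is processed by only k experts.

    x:         (batch, d_in) inputs
    gate_w:    (d_in, n_experts) gating weights (assumed linear gate)
    expert_ws: list of (d_in, d_out) matrices, one linear expert each (assumption)
    """
    gates = top_k_gate(x @ gate_w, k)              # (batch, n_experts), k nonzeros per row
    out = np.zeros((x.shape[0], expert_ws[0].shape[1]))
    for e, w in enumerate(expert_ws):
        rows = np.nonzero(gates[:, e])[0]          # examples routed to expert e
        if rows.size:                              # experts with no examples do no work
            out[rows] += gates[rows, e:e + 1] * (x[rows] @ w)
    return out, gates
```

    With `k` fixed, adding more experts increases the parameter count while the number of expert evaluations per example stays at `k`, which is the capacity-versus-computation trade-off the abstract describes.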
