Making dropout invariant to transformations of activation functions and inputs

NIPS 2014 Workshop on Deep Learning (2014)

Abstract
The dropout learning algorithm randomly sets the activities of hidden units and inputs to zero during training. While highly successful, dropout is not invariant to transformations of activation functions and inputs, raising the question of whether it is suboptimal. To eliminate this arbitrary dependence, we introduce ‘invariant dropout’, which adds one extra parameter for each input or hidden unit that translates the activity, so that the dropout value of zero is not tied to a particular activation level. The invariant dropout learning algorithm is very simple to implement and is computationally efficient. We show that invariant dropout achieves consistently lower error rates than regular dropout on the MNIST, CIFAR-10, SVHN and MAS datasets. To explore whether invariant dropout can be successfully combined with other methods, we used it to train maxout networks and again observed reductions in error rates. Interestingly, invariant dropout can also be viewed as minimizing unnecessary variance in the training cost function, equipping each hidden unit with a different input bias for each network in a Bayesian ensemble, and using a population average as the dropout level so as to prevent extreme co-adaptation.
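The mechanism described in the abstract can be sketched in a few lines. The NumPy snippet below is an illustrative reading of that description only, not the authors' code: the function name, the parameter name drop_level, and the omission of any rescaling of kept units are assumptions.

```python
# Minimal sketch of "invariant dropout" as described in the abstract:
# a dropped unit is set to a learned per-unit level instead of to zero.
import numpy as np

def invariant_dropout(activity, drop_level, p_drop=0.5, rng=None):
    """Drop each unit to its learned level rather than to zero.

    activity:   (batch, units) unit activities
    drop_level: (units,) learned translation parameter, one per unit
    p_drop:     probability of dropping a unit
    """
    rng = np.random.default_rng() if rng is None else rng
    keep_mask = (rng.random(activity.shape) >= p_drop).astype(activity.dtype)
    # Shift by drop_level, apply ordinary dropout, shift back:
    # kept units are unchanged, dropped units land at drop_level.
    return keep_mask * (activity - drop_level) + drop_level

# Example: with drop_level fixed at zero this reduces to regular
# (unscaled) dropout; in invariant dropout the drop levels would be
# learned alongside the network weights.
x = np.random.randn(4, 8)
b = np.zeros(8)
y = invariant_dropout(x, b, p_drop=0.5)
```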