Surrogate dropout: Learning optimal drop rate through proxy

Knowledge-Based Systems (2020)

Abstract
Dropout is commonly used in deep neural networks to alleviate the problem of overfitting. Conventionally, the neurons in a layer indiscriminately share a fixed drop probability, which makes it difficult to determine an appropriate value for different tasks. Moreover, this static strategy also incurs serious performance degradation when conventional dropout is applied extensively to both shallow and deep layers. A natural question is whether selectively dropping neurons would yield a better regularization effect. This paper proposes a simple and effective surrogate dropout method whereby neurons are dropped according to their importance. The proposed method has two main stages. The first stage trains a surrogate module that is jointly optimized with the neural network to evaluate the importance of each neuron. In the second stage, the output of the surrogate module serves as a guidance signal for dropping certain neurons, approximating the optimal per-neuron drop rate as the network converges. Various convolutional neural network architectures and multiple datasets, including CIFAR-10, CIFAR-100, SVHN, Tiny ImageNet, and two medical image datasets, are used to evaluate the surrogate dropout method. The experimental results demonstrate that the proposed method achieves a better regularization effect than the baseline methods.
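The two-stage idea in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the learnable per-neuron logits stand in for the paper's surrogate module, and a concrete (relaxed Bernoulli) gate is assumed so the importance scores can be trained jointly by backpropagation; the class name, temperature value, and rescaling scheme are hypothetical.

```python
import torch
import torch.nn as nn

class SurrogateDropout(nn.Module):
    """Sketch: per-neuron drop rates guided by a learned surrogate score."""

    def __init__(self, num_features: int, temperature: float = 0.1):
        super().__init__()
        self.temperature = temperature
        # Surrogate parameters: one learnable importance logit per neuron,
        # optimized jointly with the host network (stage 1 in the abstract).
        self.logits = nn.Parameter(torch.zeros(num_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        keep_prob = torch.sigmoid(self.logits)  # per-neuron keep rate
        if not self.training:
            return x  # deterministic at inference time
        # Relaxed Bernoulli gate so gradients reach the logits; the paper's
        # exact relaxation may differ -- this choice is an assumption.
        u = torch.rand_like(x)
        noise = torch.log(u + 1e-8) - torch.log(1 - u + 1e-8)
        gate = torch.sigmoid((self.logits + noise) / self.temperature)
        # Stage 2: the surrogate output guides which neurons are (softly)
        # dropped; rescale so the expected activation is preserved.
        return x * gate / keep_prob.clamp_min(1e-6)
```

In use, such a module would replace a fixed-rate nn.Dropout after a given layer, so each neuron's effective drop rate is learned rather than shared across the layer.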
Keywords
Deep neural networks, Dropout, Regularization