On approximating dropout noise injection.

CoRR (2019)

Abstract
This paper examines the assumptions behind the derived equivalence between dropout noise injection and $L_2$ regularisation for logistic regression with negative log loss. We show that the approximation method rests on a divergent Taylor expansion, so subsequent work that uses this approximation to compare dropout-trained logistic regression with standard regularisers remains ill-founded. Moreover, the approximation approach is shown to be invalid under any robust set of constraints. We show how this finding extends to general neural network topologies that use a cross-entropy prediction layer.
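For context, a minimal sketch of the standard second-order Taylor argument behind such equivalences (the notation and derivation below are ours, not the paper's; the paper's claim is that the underlying expansion diverges, so the final approximation step is unjustified). For logistic regression with $y \in \{0,1\}$, write the negative log loss as

$$
\ell(w; x, y) = -y\, w^\top x + A(w^\top x), \qquad A(s) = \log(1 + e^{s}),
$$

and inject dropout noise via $\tilde{x}_j = \xi_j x_j / \delta$ with $\xi_j \sim \mathrm{Bernoulli}(\delta)$, so that $\mathbb{E}_\xi[\tilde{x}] = x$. Expanding $A$ to second order about $w^\top x$ and taking expectations (the first-order term vanishes since the noise is mean-preserving) gives

$$
\mathbb{E}_\xi\bigl[\ell(w; \tilde{x}, y)\bigr]
\approx \ell(w; x, y) + \tfrac{1}{2}\, A''(w^\top x)\, \mathrm{Var}_\xi\bigl[w^\top \tilde{x}\bigr]
= \ell(w; x, y) + \tfrac{1}{2}\, \sigma(w^\top x)\bigl(1 - \sigma(w^\top x)\bigr)\, \frac{1-\delta}{\delta} \sum_j w_j^2 x_j^2,
$$

where $\sigma$ is the logistic sigmoid. The quadratic penalty in $w$ on the right is what motivates reading dropout as a (data-dependent) $L_2$ regulariser; the paper argues the Taylor series justifying the middle step diverges.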