On the Role of Initialization on the Implicit Bias in Deep Linear Networks
CoRR (2024)
Abstract
Despite Deep Learning's (DL) empirical success, our theoretical understanding
of its efficacy remains limited. One notable paradox is that while conventional
wisdom discourages perfect data fitting, deep neural networks are designed to
do just that, yet they generalize effectively. This study explores this
phenomenon, which is attributed to implicit bias. Various sources of
implicit bias have been identified, such as step size, weight initialization,
optimization algorithm, and number of parameters. In this work, we focus on
investigating the implicit bias originating from weight initialization. To this
end, we examine the problem of solving underdetermined linear systems in
various contexts, scrutinizing the impact of initialization on the implicit
regularization when using deep networks to solve such systems. Our findings
elucidate the role of initialization in the optimization and generalization
paradoxes, contributing to a more comprehensive understanding of DL's
performance characteristics.
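
As a minimal illustration of how initialization can select among the infinitely many solutions of an underdetermined system, the sketch below runs plain gradient descent on a least-squares objective. It is a single-layer linear model, not the deep networks studied in the paper, and the matrix sizes, step size, iteration count, and the helper name `gradient_descent` are all assumptions made for this example. It relies on a standard fact: the iterates never leave the affine space through the starting point spanned by the rows of A, so gradient descent converges to the solution closest to the initialization, which is the minimum-norm solution when starting from zero.

```python
import numpy as np

def gradient_descent(A, b, x0, steps=20000):
    """Minimize 0.5 * ||A x - b||^2 by gradient descent (illustrative helper)."""
    # Step size 1/L, where L is the squared spectral norm of A.
    lr = 1.0 / np.linalg.norm(A, 2) ** 2
    x = x0.copy()
    for _ in range(steps):
        x -= lr * A.T @ (A @ x - b)  # gradient of the least-squares loss
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 8))  # 3 equations, 8 unknowns: underdetermined
b = rng.standard_normal(3)
A_pinv = np.linalg.pinv(A)

# Zero initialization: expected to reach the minimum-norm solution.
x_zero = gradient_descent(A, b, np.zeros(8))
min_norm = A_pinv @ b

# Random initialization: expected to reach the solution closest to x0.
x0 = rng.standard_normal(8)
x_rand = gradient_descent(A, b, x0)
closest_to_x0 = x0 + A_pinv @ (b - A @ x0)

print("zero init matches min-norm solution:", np.allclose(x_zero, min_norm, atol=1e-6))
print("random init matches projection of x0:", np.allclose(x_rand, closest_to_x0, atol=1e-6))
```

Both runs fit the data exactly, yet they end at different solutions, so in this toy setting the starting point alone determines which interpolating solution is found. How this picture changes with depth is the question the paper addresses.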