Quadratic models for understanding catapult dynamics of neural networks
arXiv (2022)
Abstract
While neural networks can be approximated by linear models as their width
increases, certain properties of wide neural networks cannot be captured by
linear models. In this work we show that recently proposed Neural Quadratic
Models can exhibit the "catapult phase" [Lewkowycz et al. 2020] that arises
when training such models with large learning rates. We then empirically show
that the behaviour of neural quadratic models parallels that of neural networks
in generalization, especially in the catapult phase regime. Our analysis
further demonstrates that quadratic models can be an effective tool for
analysis of neural networks.
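
As a rough illustration of the catapult behaviour described above, the sketch below runs gradient descent on a two-layer linear toy model, f = (u . v) / sqrt(m), trained on a single example with input 1 and label 0. This toy model is taken here as a stand-in because its output is exactly quadratic in its parameters; it is not the Neural Quadratic Model construction or the experimental setup of the paper. The function names, the deterministic initialization, the width m = 100, and the two learning rates are all assumptions chosen for illustration. With tangent-kernel value lam0 at initialization, learning rates below 2 / lam0 are expected to give a steadily decreasing loss, while rates between 2 / lam0 and 4 / lam0 are expected to give a loss spike followed by convergence with a smaller kernel.

```python
import math


def make_init(m=100, f0=1.0):
    """Hypothetical deterministic init with output f0 and kernel value 2.

    u is all ones; v alternates around c so that sum(v) = sqrt(m) * f0
    and sum(v_i ** 2) = m (m is assumed to be even).
    """
    c = f0 / math.sqrt(m)
    d = math.sqrt(1.0 - c * c)
    u = [1.0] * m
    v = [c + d if i % 2 == 0 else c - d for i in range(m)]
    return u, v


def train(u, v, lr, steps):
    """Gradient descent on L = f**2 / 2 with f = (u . v) / sqrt(m).

    Returns the loss and the tangent-kernel value
    lam = (|u|^2 + |v|^2) / m recorded before each update.
    """
    m = len(u)
    losses, kernels = [], []
    for _ in range(steps):
        f = sum(ui * vi for ui, vi in zip(u, v)) / math.sqrt(m)
        lam = (sum(ui * ui for ui in u) + sum(vi * vi for vi in v)) / m
        losses.append(0.5 * f * f)
        kernels.append(lam)
        # dL/du = f * v / sqrt(m) and dL/dv = f * u / sqrt(m).
        # Both new lists are built from the old u and v before assignment.
        scale = lr * f / math.sqrt(m)
        u, v = ([ui - scale * vi for ui, vi in zip(u, v)],
                [vi - scale * ui for ui, vi in zip(u, v)])
    return losses, kernels


u0, v0 = make_init()
lam0 = 2.0        # kernel value produced by make_init
small_lr = 0.5    # below 2 / lam0: loss should fall from the first step
large_lr = 1.5    # between 2 / lam0 and 4 / lam0: assumed catapult range

lazy_losses, lazy_kernels = train(u0, v0, small_lr, steps=50)
cat_losses, cat_kernels = train(u0, v0, large_lr, steps=200)

print("small lr: peak loss", max(lazy_losses), "final kernel", lazy_kernels[-1])
print("large lr: peak loss", max(cat_losses), "final kernel", cat_kernels[-1])
```

In the large-learning-rate run, the loss should first rise above its initial value and then decay, and the final kernel value should end up below 2 / large_lr, which is the qualitative signature of the catapult phase. A linear (fixed-kernel) model trained at the same rate would instead diverge, which is the gap between linear and quadratic models that the abstract points to.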