Cross Entropy versus Label Smoothing: A Neural Collapse Perspective
CoRR (2024)
Abstract
Label smoothing loss is a widely adopted technique to mitigate overfitting in
deep neural networks. This paper studies label smoothing from the perspective
of Neural Collapse (NC), a powerful empirical and theoretical framework which
characterizes model behavior during the terminal phase of training. We first
show empirically that models trained with label smoothing converge faster to
neural collapse solutions and attain a stronger level of neural collapse.
Additionally, we show that at the same level of NC1, models under label
smoothing loss exhibit intensified NC2. These findings provide valuable
insights into the performance benefits and enhanced model calibration under
label smoothing loss. We then leverage the unconstrained feature model to
derive closed-form solutions for the global minimizers for both loss functions
and further demonstrate that models under label smoothing have a lower
condition number and, therefore, theoretically converge faster. Our study,
combining empirical evidence and theoretical results, not only provides nuanced
insights into the differences between label smoothing and cross-entropy losses,
but also serves as an example of how the powerful neural collapse framework can
be used to improve our understanding of DNNs.
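
For readers unfamiliar with the two losses being compared, the following is a minimal sketch of how they differ for a single example. It assumes the common convention in which the smoothed target is (1 - delta) times the one-hot label plus delta / K spread uniformly over all K classes; the paper's own parametrization may differ, and the names `delta`, `smoothed_targets`, and `label_smoothing_loss` are illustrative choices, not taken from the paper.

```python
import math

def log_softmax(logits):
    # Numerically stable log-softmax over a list of logits.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_z for z in logits]

def smoothed_targets(label, num_classes, delta):
    # Assumed convention: (1 - delta) * one_hot + delta / K * uniform.
    base = delta / num_classes
    return [base + (1.0 - delta) * (1.0 if k == label else 0.0)
            for k in range(num_classes)]

def label_smoothing_loss(logits, label, delta=0.1):
    # Cross-entropy between the smoothed target and the softmax output.
    log_p = log_softmax(logits)
    q = smoothed_targets(label, len(logits), delta)
    return -sum(qk * lpk for qk, lpk in zip(q, log_p))

def cross_entropy_loss(logits, label):
    # Plain cross-entropy is the delta = 0 special case.
    return label_smoothing_loss(logits, label, delta=0.0)
```

Under this convention, a very confident correct prediction drives the cross-entropy loss toward zero, while the label smoothing loss stays bounded away from zero because some target mass always sits on the other classes.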