Average gradient outer product as a mechanism for deep neural collapse
CoRR (2024)
Abstract
Deep Neural Collapse (DNC) refers to the surprisingly rigid structure of the
data representations in the final layers of Deep Neural Networks (DNNs). Though
the phenomenon has been measured in a wide variety of settings, its emergence
is only partially understood. In this work, we provide substantial evidence
that DNC formation occurs primarily through deep feature learning with the
average gradient outer product (AGOP). This goes a step beyond efforts that
explain neural collapse via feature-agnostic approaches, such as
the unconstrained features model. We proceed by providing evidence that the
right singular vectors and values of the weights are responsible for the
majority of within-class variability collapse in DNNs. As shown in recent work,
this singular structure is highly correlated with that of the AGOP. We then
establish experimentally and theoretically that AGOP induces neural collapse in
a randomly initialized neural network. In particular, we demonstrate that Deep
Recursive Feature Machines, a method originally introduced as an abstraction
for AGOP feature learning in convolutional neural networks, exhibits DNC.
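For context, the AGOP of a function f over a dataset is the matrix (1/n) Σᵢ ∇f(xᵢ)ᵀ∇f(xᵢ), averaging the outer products of input gradients (Jacobians, for vector-valued f). The sketch below is an illustrative numpy implementation using numerical Jacobians, not the paper's code; the function names (`jacobian`, `agop`) and the toy one-layer model are hypothetical choices for demonstration.

```python
import numpy as np

def jacobian(f, x, eps=1e-5):
    """Numerical Jacobian of f: R^d -> R^k at x, via central differences.
    Returns an array of shape (k, d)."""
    d = x.shape[0]
    k = f(x).shape[0]
    J = np.zeros((k, d))
    for j in range(d):
        e = np.zeros(d)
        e[j] = eps
        J[:, j] = (f(x + e) - f(x - e)) / (2 * eps)
    return J

def agop(f, X):
    """Average gradient outer product: (1/n) sum_i J(x_i)^T J(x_i).
    Returns a symmetric PSD matrix of shape (d, d)."""
    n = len(X)
    Js = [jacobian(f, x) for x in X]
    return sum(J.T @ J for J in Js) / n

# Toy model: a randomly initialized one-layer network f(x) = tanh(W x).
rng = np.random.default_rng(0)
W = rng.standard_normal((3, 5))
f = lambda x: np.tanh(W @ x)

X = rng.standard_normal((20, 5))  # 20 inputs in R^5
G = agop(f, X)                    # (5, 5) AGOP matrix
```

The top eigenvectors of G pick out the input directions along which the model's output varies most on average; in the AGOP feature-learning view, these are the directions the network's learned features emphasize.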