Role of Locality and Weight Sharing in Image-Based Tasks: A Sample Complexity Separation between CNNs, LCNs, and FCNs

Aakash Lahoti, Stefani Karp, Ezra Winston, Aarti Singh, Yuanzhi Li

ICLR 2024

Abstract
Vision-based tasks are known to exhibit the properties of locality and translation invariance. The superior performance of convolutional neural networks (CNNs) on these tasks is attributed to the inductive biases of locality and weight sharing baked into their architecture. Existing attempts to quantify the statistical benefits of these biases in CNNs over local convolutional neural networks (LCNs) and fully connected neural networks (FCNs) fall short in one of three ways: they do not establish a gap between the performance of these architectures, they ignore optimization considerations, or they consider stylized settings that do not reflect image-like tasks, particularly translation invariance. We introduce the Dynamic Signal Distribution (DSD), a data model designed to capture properties of real-world images such as locality and translation invariance. In DSD, each image is modeled with $k$ patches, each of dimension $d$, and the label is determined by a $d$-sparse signal vector that can freely appear in any one of the $k$ patches. On this task, we show that CNNs trained using gradient descent require $\tilde{O}(k+d)$ samples, whereas LCNs require $\Omega(kd)$ samples to predict the label, establishing the statistical advantage of weight sharing in translation-invariant tasks. Additionally, LCNs need $\tilde{O}(k(k+d))$ samples, whereas FCNs need $\Omega(k^2d)$ samples, showcasing the benefit of locality in local tasks.
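To make the DSD description concrete, the following is a minimal sketch of a sampler for DSD-like data. It is an illustration under stated assumptions, not the paper's exact construction: the abstract only says that an image has $k$ patches of dimension $d$ and that the label-determining signal appears in one of them. The function name `sample_dsd`, the binary labels in {-1, +1}, the uniform choice of patch, and the optional Gaussian noise controlled by `noise_std` are all hypothetical choices made for this sketch.

```python
import numpy as np

def sample_dsd(n, k, d, signal, noise_std=0.0, rng=None):
    """Draw n DSD-like images of shape (k, d).

    Assumptions (not taken from the paper): labels are uniform in {-1, +1},
    the signal patch index is uniform over the k patches, and the background
    is i.i.d. Gaussian noise with standard deviation noise_std.
    """
    rng = np.random.default_rng(rng)
    signal = np.asarray(signal, dtype=float)
    assert signal.shape == (d,)

    labels = rng.choice([-1, 1], size=n)       # assumed binary labels
    positions = rng.integers(0, k, size=n)     # patch that carries the signal
    images = noise_std * rng.standard_normal((n, k, d))
    # Place the label-signed signal in exactly one patch per image.
    images[np.arange(n), positions] += labels[:, None] * signal
    return images, labels, positions
```

For example, `sample_dsd(100, k=8, d=16, signal=np.ones(16) / 4)` returns images in which the signal sits in a different patch from sample to sample. A model that shares weights across patches can reuse one $d$-dimensional filter for every position, which is the intuition behind the gap between $\tilde{O}(k+d)$ and $\Omega(kd)$ stated in the abstract.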
Keywords
Deep Learning Theory, Sample Complexity, Convolutional Neural Networks