Convex Formulation of Overparameterized Deep Neural Networks
IEEE Transactions on Information Theory (2022)
Abstract
The analysis of overparameterized neural networks has drawn significant attention in recent years. It has been shown that such systems behave like convex systems under various restricted settings, such as two-layer neural networks, or when learning is restricted locally to the so-called neural tangent kernel space around specialized initializations. However, powerful theoretical techniques for analyzing fully trained deep neural networks under general conditions have been lacking. This paper addresses this fundamental problem by investigating overparameterized deep neural networks when fully trained. Specifically, we characterize a deep neural network by the distributions of its features and propose a metric that intuitively measures the usefulness of feature representations. Under certain regularizers that bound this metric, we show that deep neural networks can be reformulated as a convex optimization problem, and that the resulting system guarantees effective feature representations in terms of the metric. Our new analysis is more consistent with the empirical observation that deep neural networks are capable of learning efficient feature representations. Empirical studies confirm that the predictions of our theory are consistent with results observed in practice.
Keywords
Deep learning, convex reformulation, feature representation