Minimum ℓ1-norm interpolators: Precise asymptotics and multiple descent

arXiv (2021)

Abstract
An evolving line of machine learning work observes empirical evidence suggesting that interpolating estimators (those that achieve zero training error) are not necessarily harmful. This paper pursues a theoretical understanding of an important type of interpolator: the minimum ℓ1-norm interpolator, motivated by the observation that several learning algorithms favor low-ℓ1-norm solutions in the over-parameterized regime. Concretely, we consider the noisy sparse regression model under Gaussian design, focusing on linear sparsity and high-dimensional asymptotics (so that both the number of features and the sparsity level scale proportionally with the sample size). We observe, and provide rigorous theoretical justification for, a curious multi-descent phenomenon: the generalization risk of the minimum ℓ1-norm interpolator undergoes multiple (possibly more than two) phases of descent and ascent as the model capacity increases. This phenomenon stems from the special structure of the minimum ℓ1-norm interpolator as well as the delicate interplay between the over-parameterization ratio and the sparsity, thus unveiling a fundamental geometric distinction from the minimum ℓ2-norm interpolator. Our finding is built upon an exact characterization of the risk behavior, which is governed by a system of two non-linear equations in two unknowns.