Information Complexity of Stochastic Convex Optimization: Applications to Generalization and Memorization
CoRR (2024)
Abstract
In this work, we investigate the interplay between memorization and learning
in the context of stochastic convex optimization (SCO). We define
memorization via the information a learning algorithm reveals about its
training data points. We then quantify this information using the framework of
conditional mutual information (CMI) proposed by Steinke and Zakynthinou
(2020). Our main result is a precise characterization of the tradeoff between
the accuracy of a learning algorithm and its CMI, answering an open question
posed by Livni (2023). We show that, in the L^2 Lipschitz-bounded setting
and under strong convexity, every learner with excess error ε
has CMI bounded below by Ω(1/ε^2) and Ω(1/ε),
respectively. We further demonstrate the essential role of memorization in
SCO by designing an adversary capable of accurately
identifying a significant fraction of the training samples in specific SCO
problems. Finally, we outline several implications of our results, such as a
limitation of generalization bounds based on CMI and the incompressibility of
samples in SCO problems.
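The tradeoff stated above can be sketched in the notation of Steinke and Zakynthinou's CMI framework; the symbols below (supersample Z̃, selector U, algorithm A) are a reconstruction from that framework, not notation taken verbatim from this paper:

```latex
% Conditional mutual information (CMI) of an algorithm A on n samples,
% following Steinke and Zakynthinou (2020):
% \tilde{Z} \in \mathcal{Z}^{n \times 2} is a "supersample" of 2n i.i.d. draws,
% and U \in \{0,1\}^n selects one sample per row to form the training set S_U.
\mathrm{CMI}_{\mathcal{D}}(A) = I\bigl(A(S_U);\, U \,\big|\, \tilde{Z}\bigr)

% Lower bounds from the abstract, for any learner with excess error \varepsilon:
\mathrm{CMI}_{\mathcal{D}}(A) = \Omega(1/\varepsilon^2)
  \quad \text{($L^2$ Lipschitz-bounded setting)}
\mathrm{CMI}_{\mathcal{D}}(A) = \Omega(1/\varepsilon)
  \quad \text{(strongly convex setting)}
```

Read together, these say that driving the excess error ε down forces the algorithm's output to reveal correspondingly more information about which training samples were used.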