A Direct Sum Result for the Information Complexity of Learning.

COLT 2018

Abstract
How many bits of information are required to PAC learn a class of hypotheses of VC dimension $d$? The mathematical setting we follow is that of Bassily et al. (2018), where the value of interest is the mutual information $\mathrm{I}(S;A(S))$ between the input sample $S$ and the hypothesis outputted by the learning algorithm $A$. We introduce a class of functions of VC dimension $d$ over the domain $\mathcal{X}$ with information complexity at least $\Omega\left(d \log\log \frac{|\mathcal{X}|}{d}\right)$ bits for any consistent and proper algorithm (deterministic or randomized). Bassily et al. proved a similar (but quantitatively weaker) result for the case $d=1$. The above result is in fact a special case of a more general phenomenon we explore. We define the notion of information complexity of a given class of functions $\mathcal{H}$. Intuitively, it is the minimum amount of information that an algorithm for $\mathcal{H}$ must retain about its input in order to ensure consistency and properness. We prove a direct sum result for information complexity in this context; roughly speaking, the information complexity sums when combining several classes.
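To make the quantity $\mathrm{I}(S;A(S))$ concrete, here is a minimal illustrative sketch (not from the paper): it assumes a tiny domain, the thresholds class of VC dimension 1, samples drawn uniformly, and the deterministic consistent proper learner "output the smallest consistent threshold", then computes the mutual information between the sample and the learner's output by exact enumeration. All of these modeling choices are assumptions made for illustration.

```python
import itertools
import math
from collections import Counter

# Illustrative toy setting (assumed, not from the paper): the domain is
# X = {0, 1, ..., n-1} and the class is thresholds h_t(x) = 1[x >= t],
# which has VC dimension 1.  Labels come from a fixed target threshold,
# the sample S consists of m points drawn uniformly from X, and the
# learner A is the deterministic consistent proper rule "output the
# smallest consistent threshold".  We compute I(S; A(S)) exactly by
# enumerating all samples.

n, m = 8, 3        # |X| and sample size, kept tiny so enumeration is exact
target_t = 4       # true threshold generating the labels

def label(x):      # target concept h_{target_t}
    return int(x >= target_t)

def learner(sample):
    # Smallest threshold consistent with the labeled sample; n encodes the
    # all-zeros hypothesis when the sample contains no positive point.
    positives = [x for x in sample if label(x) == 1]
    return min(positives) if positives else n

# Enumerate all equally likely samples S in X^m and record A(S).
outputs = [learner(s) for s in itertools.product(range(n), repeat=m)]

# Since A is deterministic, I(S; A(S)) = H(A(S)).
counts = Counter(outputs)
total = len(outputs)
h_output = -sum((c / total) * math.log2(c / total) for c in counts.values())
print(f"I(S; A(S)) = H(A(S)) = {h_output:.3f} bits over {total} samples")
```

The paper's lower bound concerns how large this quantity must be, in the worst case over targets and distributions, for any consistent and proper learner; the sketch only shows how the quantity itself is defined and measured for one fixed learner and target.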
Keywords
information complexity,learning,direct sum result