The complexity of estimating Rényi entropy
SODA (2015)
Abstract
It was recently shown that estimating the Shannon entropy $H(p)$ of a discrete $k$-symbol distribution $p$ requires $\Theta(k/\log k)$ samples, a number that grows near-linearly in the support size. In many applications $H(p)$ can be replaced by the more general Rényi entropy of order $\alpha$, $H_\alpha(p)$. We determine the number of samples needed to estimate $H_\alpha(p)$ for all $\alpha$, showing that $\alpha < 1$ requires super-linear, roughly $k^{1/\alpha}$ samples, noninteger $\alpha > 1$ requires near-linear, roughly $k$ samples, but integer $\alpha > 1$ requires only $\Theta(k^{1-1/\alpha})$ samples. In particular, estimating $H_2(p)$, which arises in security, DNA reconstruction, closeness testing, and other applications, requires only $\Theta(\sqrt{k})$ samples. The estimators achieving these bounds are simple and run in time linear in the number of samples.
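To illustrate why integer orders are comparatively easy, recall that $H_\alpha(p) = \frac{1}{1-\alpha}\log\sum_i p_i^\alpha$. For integer $\alpha \ge 2$ the power sum $\sum_i p_i^\alpha$ has an unbiased estimate built from falling factorials of the symbol counts, computable in one pass over the samples. The following is a minimal sketch under stated assumptions, not necessarily the exact estimator analyzed in the paper: it assumes a fixed number $n$ of i.i.d. samples, uses the natural logarithm, and the function names are hypothetical.

```python
import math
from collections import Counter


def falling_factorial(x: int, r: int) -> int:
    """Return x * (x - 1) * ... * (x - r + 1), the r-th falling factorial of x."""
    result = 1
    for j in range(r):
        result *= x - j
    return result


def renyi_entropy_integer_order(samples, alpha: int) -> float:
    """Estimate H_alpha (in nats) for integer alpha >= 2 from i.i.d. samples.

    Illustrative sketch: plugs a bias-corrected power-sum estimate
    into H_alpha = log(sum_i p_i^alpha) / (1 - alpha).
    """
    if alpha < 2:
        raise ValueError("alpha must be an integer >= 2")
    n = len(samples)
    if n < alpha:
        raise ValueError("need at least alpha samples")

    # One pass over the data: count occurrences of each symbol.
    counts = Counter(samples)

    # For a fixed sample size n, E[N_i falling alpha] = (n falling alpha) * p_i^alpha,
    # so this ratio is an unbiased estimate of sum_i p_i^alpha.
    numerator = sum(falling_factorial(c, alpha) for c in counts.values())
    power_sum = numerator / falling_factorial(n, alpha)

    if power_sum == 0:
        # No symbol appeared alpha times; the plug-in estimate is unbounded.
        return math.inf
    return math.log(power_sum) / (1 - alpha)
```

For $\alpha = 2$ the numerator counts ordered pairs of colliding samples, so the sketch reduces to a collision-based estimate of $H_2(p)$. For example, `renyi_entropy_integer_order(["a", "a", "b", "b"], 2)` yields a power-sum estimate of 4/12 and hence an entropy estimate of $\log 3$ nats.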