Sample-Optimal Locally Private Hypothesis Selection and the Provable Benefits of Interactivity
CoRR(2023)
摘要
We study the problem of hypothesis selection under the constraint of local
differential privacy. Given a class $\mathcal{F}$ of $k$ distributions and a
set of i.i.d. samples from an unknown distribution $h$, the goal of hypothesis
selection is to pick a distribution $\hat{f}$ whose total variation distance to
$h$ is comparable with the best distribution in $\mathcal{F}$ (with high
probability). We devise an $\varepsilon$-locally-differentially-private
($\varepsilon$-LDP) algorithm that uses $\Theta\left(\frac{k}{\alpha^2\min
\{\varepsilon^2,1\}}\right)$ samples to guarantee that $d_{TV}(h,\hat{f})\leq
\alpha + 9 \min_{f\in \mathcal{F}}d_{TV}(h,f)$ with high probability. This
sample complexity is optimal for $\varepsilon<1$, matching the lower bound of
Gopi et al. (2020). All previously known algorithms for this problem required
$\Omega\left(\frac{k\log k}{\alpha^2\min \{ \varepsilon^2 ,1\}} \right)$
samples to work.
Moreover, our result demonstrates the power of interaction for
$\varepsilon$-LDP hypothesis selection. Namely, it breaks the known lower bound
of $\Omega\left(\frac{k\log k}{\alpha^2\min \{ \varepsilon^2 ,1\}} \right)$ for
the sample complexity of non-interactive hypothesis selection. Our algorithm
breaks this barrier using only $\Theta(\log \log k)$ rounds of interaction.
To prove our results, we define the notion of \emph{critical queries} for a
Statistical Query Algorithm (SQA) which may be of independent interest.
Informally, an SQA is said to use a small number of critical queries if its
success relies on the accuracy of only a small number of queries it asks. We
then design an LDP algorithm that uses a smaller number of critical queries.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要