Why Does Differential Privacy with Large Epsilon Defend Against Practical Membership Inference Attacks?
CoRR (2024)
Abstract
For small privacy parameter ϵ, ϵ-differential privacy (DP)
provides a strong worst-case guarantee that no membership inference attack
(MIA) can succeed at determining whether a person's data was used to train a
machine learning model. The guarantee of DP is worst-case because: a) it holds
even if the attacker already knows the records of all but one person in the
data set; and b) it holds uniformly over all data sets. In practical
applications, such a worst-case guarantee may be overkill: practical attackers
may lack exact knowledge of (nearly all of) the private data, and our data set
might be easier to defend, in some sense, than the worst-case data set. Such
considerations have motivated the industrial deployment of DP models with large
privacy parameter (e.g. ϵ ≥ 7), and it has been observed
empirically that DP with large ϵ can successfully defend against
state-of-the-art MIAs. Existing DP theory cannot explain these empirical
findings: e.g., the theoretical privacy guarantees of ϵ ≥ 7 are
essentially vacuous. In this paper, we aim to close this gap between theory and
practice and understand why a large DP parameter can prevent practical MIAs. To
tackle this problem, we propose a new privacy notion called practical
membership privacy (PMP). PMP models a practical attacker's uncertainty about
the contents of the private data. The PMP parameter has a natural
interpretation in terms of the success rate of a practical MIA on a given data
set. We quantitatively analyze the PMP parameter of two fundamental DP
mechanisms: the exponential mechanism and Gaussian mechanism. Our analysis
reveals that a large DP parameter often translates into a much smaller PMP
parameter, which guarantees strong privacy against practical MIAs. Using our
findings, we offer principled guidance for practitioners in choosing the DP
parameter.
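To make the Gaussian mechanism mentioned above concrete, here is a minimal sketch of its classical (ϵ, δ)-DP calibration, in which Gaussian noise scaled to the query's sensitivity is added to a numeric result. Note that this standard noise-scale formula is only valid for ϵ < 1; the paper's analysis of large-ϵ regimes (e.g. ϵ ≥ 7) is not captured by this sketch, and the function names here are illustrative, not from the paper.

```python
import math
import random

def gaussian_sigma(sensitivity, epsilon, delta):
    # Classical (epsilon, delta)-DP noise scale (valid for epsilon < 1):
    # sigma = sqrt(2 ln(1.25/delta)) * sensitivity / epsilon
    return math.sqrt(2 * math.log(1.25 / delta)) * sensitivity / epsilon

def gaussian_mechanism(value, sensitivity, epsilon, delta, rng=random):
    # Release the true value perturbed by N(0, sigma^2) noise.
    return value + rng.gauss(0.0, gaussian_sigma(sensitivity, epsilon, delta))
```

For example, releasing a count query (sensitivity 1) with ϵ = 0.5 and δ = 1e-5 adds noise with standard deviation of roughly 9.7; the paper's point is that far smaller noise (large ϵ) can still defeat practical MIAs, even though this worst-case guarantee becomes vacuous.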