Towards more accurate and useful data anonymity vulnerability measures
arXiv (2024)
Abstract
The purpose of anonymizing structured data is to protect the privacy of
individuals in the data while retaining the statistical properties of the data.
There is a large body of work that examines anonymization vulnerabilities.
Focusing on strong anonymization mechanisms, this paper examines a number of
prominent attack papers and finds several problems, all of which lead to
overstating risk. First, some papers fail to establish a correct statistical
inference baseline (or any at all), leading to incorrect measures. Notably, the
reconstruction attack from the US Census Bureau that led to a redesign of its
disclosure method made this mistake. We propose the non-member framework, an
improved method for computing a more accurate inference baseline, and give
examples of its operation.
Second, some papers don't use a realistic membership base rate, leading to
incorrect precision measures if precision is reported. Third, some papers
unnecessarily report measures in such a way that it is difficult or impossible
to assess risk. Virtually the entire literature on membership inference
attacks, dozens of papers, makes one or both of these errors. We propose that
membership inference papers report precision/recall values using a
representative range of base rates.
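
To illustrate why the base rate matters, the following minimal sketch applies the standard Bayes relation between precision, true-positive rate, false-positive rate, and membership base rate. It is not taken from the paper: the function name `precision_at_base_rate` and the example operating point (TPR 0.5, FPR 0.01) are hypothetical, chosen only to show how the same attack can yield very different precision as the base rate falls.

```python
def precision_at_base_rate(tpr, fpr, base_rate):
    """Precision of a membership inference attack at a given base rate.

    tpr: fraction of true members the attack flags (this is also the recall)
    fpr: fraction of non-members the attack wrongly flags
    base_rate: fraction of candidate individuals who are actually members
    """
    true_pos = tpr * base_rate
    false_pos = fpr * (1.0 - base_rate)
    if true_pos + false_pos == 0:
        return float("nan")  # the attack flags nobody, so precision is undefined
    return true_pos / (true_pos + false_pos)


# Hypothetical operating point, for illustration only.
tpr, fpr = 0.5, 0.01

# Report precision/recall over a range of base rates, not only the
# balanced 50% setting.
for base_rate in (0.5, 0.1, 0.01, 0.001):
    precision = precision_at_base_rate(tpr, fpr, base_rate)
    print(f"base rate {base_rate}: precision {precision:.3f}, recall {tpr:.3f}")
```

Recall is unaffected by the base rate, while precision drops as members become rarer among the candidates, which is why a single precision figure computed at one base rate can misstate risk.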