Large Language Models are Advanced Anonymizers
CoRR (2024)
Abstract
Recent work in privacy research on large language models has shown that they
achieve near human-level performance at inferring personal data from real-world
online texts. With consistently increasing model capabilities, existing text
anonymization methods are currently lagging behind regulatory requirements and
adversarial threats. This raises the question of how individuals can
effectively protect their personal data when sharing texts online. In this work,
we take two steps to answer this question: We first present a new setting for
evaluating anonymizations in the face of adversarial LLM inferences, allowing
for a natural measurement of anonymization performance while remedying some of
the shortcomings of previous metrics. We then present our LLM-based adversarial
anonymization framework, which leverages the strong inferential capabilities of LLMs
to inform our anonymization procedure. In our experimental evaluation, we show
on real-world and synthetic online texts how adversarial anonymization
outperforms current industry-grade anonymizers both in terms of the resulting
utility and privacy.