Initial Exploration of Zero-Shot Privacy Utility Tradeoffs in Tabular Data Using GPT-4
arxiv(2024)
摘要
We investigate the application of large language models (LLMs), specifically
GPT-4, to scenarios involving the tradeoff between privacy and utility in
tabular data. Our approach entails prompting GPT-4 by transforming tabular data
points into textual format, followed by the inclusion of precise sanitization
instructions in a zero-shot manner. The primary objective is to sanitize the
tabular data in such a way that it hinders existing machine learning models
from accurately inferring private features while allowing models to accurately
infer utility-related attributes. We explore various sanitization instructions.
Notably, we discover that this relatively simple approach yields performance
comparable to more complex adversarial optimization methods used for managing
privacy-utility tradeoffs. Furthermore, while the prompts successfully obscure
private features from the detection capabilities of existing machine learning
models, we observe that this obscuration alone does not necessarily meet a
range of fairness metrics. Nevertheless, our research indicates the potential
effectiveness of LLMs in adhering to these fairness metrics, with some of our
experimental results aligning with those achieved by well-established
adversarial optimization techniques.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要