Collecting Preference Rankings Under Local Differential Privacy

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING (2023)

Abstract
With the deep penetration of the Internet and mobile devices, preference rankings are being collected on a massive scale by diverse data collectors for various business demands. However, users' preference rankings in many applications are highly sensitive. Without proper privacy protection mechanisms, collecting them either puts individual privacy in jeopardy or hampers business opportunities due to users' unwillingness to share their true rankings. In this paper, we initiate the study of collecting preference rankings under local differential privacy (LDP). The key technical challenge comes from the fact that the number of possible rankings could be large in practical settings, leading to excessive injected noise. To solve this problem, we present a novel approach, SAFARI, whose main idea is to collect a set of distributions over small domains, carefully chosen based on the riffle independent (RI) model, to approximate the overall distribution of users' rankings, and then to generate a synthetic ranking dataset from the obtained distributions. By working on small domains instead of a large domain, SAFARI can significantly reduce the magnitude of added noise. In SAFARI, we design two transformation rules, namely Rule I and Rule II, that instruct users how to transform their data to provide information about the distributions over the small domains. In particular, we propose a method called LADE to precisely estimate the distributions required for the structure learning of the RI model. We also propose a new LDP method called SAFA for frequency estimation over multiple attributes that have small domains. We formally prove that SAFARI guarantees $\varepsilon$-local differential privacy. Extensive experiments on real datasets confirm the effectiveness of SAFARI.
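To make the domain-size challenge concrete, the following is a minimal sketch of generalized randomized response (GRR), a standard LDP frequency oracle. It is not SAFARI, LADE, or SAFA, and the function names (`grr_perturb`, `grr_estimate`, `grr_variance`) are hypothetical, chosen only for illustration. The sketch assumes each ranking is encoded as an integer index in a domain of size `d`, so a naive encoding of rankings over `k` items has `d = k!`.

```python
import math
import random
from collections import Counter


def grr_perturb(value, domain_size, epsilon, rng):
    """Report the true value with probability p, else a uniformly random other value."""
    e = math.exp(epsilon)
    p = e / (e + domain_size - 1)
    if rng.random() < p:
        return value
    # Sample uniformly from the domain_size - 1 values other than `value`.
    other = rng.randrange(domain_size - 1)
    return other if other < value else other + 1


def grr_estimate(reports, domain_size, epsilon):
    """Unbiased frequency estimates recovered from the perturbed reports."""
    n = len(reports)
    e = math.exp(epsilon)
    p = e / (e + domain_size - 1)
    q = 1.0 / (e + domain_size - 1)
    counts = Counter(reports)
    return [(counts.get(v, 0) / n - q) / (p - q) for v in range(domain_size)]


def grr_variance(domain_size, epsilon, n):
    """Approximate per-value variance of a GRR frequency estimate
    (the usual approximation for values with small true frequency)."""
    e = math.exp(epsilon)
    return (domain_size - 2 + e) / (n * (e - 1) ** 2)


if __name__ == "__main__":
    n, epsilon = 100_000, 1.0
    for k in (3, 5, 7):
        d = math.factorial(k)  # number of full rankings over k items
        print(k, d, grr_variance(d, epsilon, n))
```

Under this approximation the variance grows linearly in the domain size, and the domain size itself grows factorially in the number of ranked items, which is one way to see why estimating over several small domains can inject far less noise than estimating over the full ranking domain.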
Keywords
Data models,Differential privacy,Transforms,Estimation,Machine learning,Frequency estimation,Business,Preference rankings,data collection,riffle independent model,local differential privacy