Fairness and Cost Constrained Privacy-Aware Record Linkage

IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY(2022)

引用 4|浏览30
暂无评分
摘要
Record linkage algorithms match and link records from different databases that refer to the same real-world entity based on direct and/or quasi-identifiers, such as name, address, age, and gender, available in the records. Since these identifiers generally contain personal identifiable information (PII) about the entities, record linkage algorithms need to be developed with privacy constraints. Known as privacy-preserving record linkage (PPRL), many research studies have been conducted to perform the linkage on encoded and/or encrypted identifiers. Differential privacy (DP) combined with computationally efficient encoding methods, e.g. Bloom filter encoding, has been used to develop PPRL with provable privacy guarantees. The standard DP notion does not however address other constraints, among which the most important ones are fairness-bias and cost of linkage in terms of number of record pairs to be compared. In this work, we propose new notions of fairness-constrained DP and fairness and cost-constrained DP for PPRL and develop a framework for PPRL with these new notions of DP combined with Bloom filter encoding. We provide theoretical proofs for the new DP notions for fairness and cost-constrained PPRL and experimentally evaluate them on two datasets containing person-specific data. Our experimental results show that with these new notions of DP, PPRL with better performance (compared to the standard DP notion for PPRL) can be achieved with regard to privacy, cost and fairness constraints.
更多
查看译文
关键词
Couplings, Costs, Privacy, Encoding, Databases, Computational efficiency, Differential privacy, Differential privacy, bloom filter encoding, record linkage, data matching, fairness, cost
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要