Sequence Data Matching And Beyond: New Privacy-Preserving Primitives Based On Bloom Filters

IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY (2020)

Abstract
Bloom filter encoding has been widely used as an efficient masking technique for privacy-preserving matching functions. Existing matching techniques, however, are limited to relatively simple data types such as strings, categorical values, and single numerical values. In this paper, we propose a new scheme that significantly extends the class of matching primitives based on the privacy-preserving Bloom filter mechanism. These primitives include sequence data matching and popular distance-based machine learning algorithms such as KNN and SVM. Our scheme hash-maps a sequence data vector into the Bloom filter space by adding a timestamp (bit) for each element in the data, represented together with its neighboring values, so that the similarity of data points can be checked efficiently with negligible utility loss. Furthermore, it applies a Laplace-like perturbation method to the constructed Bloom filters to address the weakness of determinism introduced by the encoding techniques. As a result, the proposed work guarantees that private data records are difficult to distinguish, owing to both hash collisions and differential privacy. Experimental results on three datasets based on real scenarios show that our method achieves a significantly better trade-off between utility and privacy than the state-of-the-art differential-privacy-based method, which adds Laplace noise to the data directly.
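The general idea of encoding a sequence into a Bloom filter and comparing the encodings can be illustrated with a minimal sketch. This is not the authors' scheme: the token layout (index, value, next value), the filter length `M`, the hash count `K`, the Dice similarity, and the helper names are all assumptions chosen for illustration. In particular, `flip_bits` uses simple random bit flipping as a stand-in for the paper's Laplace-like perturbation and makes no claim about its privacy calibration.

```python
import hashlib
import random

M = 256  # filter length in bits (assumed for illustration)
K = 3    # number of hash positions per token (assumed for illustration)

def _positions(token, m=M, k=K):
    """Derive k bit positions for a token from salted SHA-256 digests."""
    out = []
    for i in range(k):
        digest = hashlib.sha256(f"{i}|{token}".encode()).digest()
        out.append(int.from_bytes(digest[:8], "big") % m)
    return out

def encode_sequence(seq, m=M, k=K):
    """Encode a sequence, one token per (index, value, next value) triple.

    The index plays the role of a timestamp and the next value stands in
    for the neighboring values; both are illustrative assumptions.
    """
    bits = [0] * m
    for t, value in enumerate(seq):
        nxt = seq[t + 1] if t + 1 < len(seq) else None
        for pos in _positions((t, value, nxt), m, k):
            bits[pos] = 1
    return bits

def dice(a, b):
    """Dice coefficient between two equal-length bit lists."""
    inter = sum(x & y for x, y in zip(a, b))
    total = sum(a) + sum(b)
    return 2 * inter / total if total else 1.0

def flip_bits(bits, p, rng):
    """Flip each bit independently with probability p.

    A stand-in perturbation, not the paper's Laplace-like method.
    """
    return [b ^ 1 if rng.random() < p else b for b in bits]

if __name__ == "__main__":
    rng = random.Random(0)
    a = encode_sequence([1, 2, 3, 4, 5, 6, 7, 8])
    b = encode_sequence([1, 2, 3, 4, 5, 6, 7, 9])
    c = encode_sequence([9, 8, 7, 6, 5, 4, 3, 2])
    print("similar:   ", dice(a, b))
    print("dissimilar:", dice(a, c))
    print("perturbed: ", dice(flip_bits(a, 0.05, rng), flip_bits(b, 0.05, rng)))
```

In this sketch, sequences that share most of their positioned elements set mostly the same bits and therefore score a higher Dice coefficient, while the bit flipping trades some of that similarity signal for randomness in the published filter.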
Keywords
Sequence data matching, encoding, bloom filter, privacy-preserving data publishing, differential privacy