Locally Private Set-valued Data Analyses: Distribution and Heavy Hitters Estimation

Shaowei Wang, Yuntong Li, Yusen Zhong, Kongyang Chen, Xianmin Wang, Zhili Zhou, Fei Peng, Yuqiu Qian, Jiachun Du, Wei Yang

IEEE Transactions on Mobile Computing (2023)

Abstract

In many mobile applications, user-generated data are presented as set-valued data. To tackle potential privacy threats in analyzing these valuable data, local differential privacy has been attracting substantial attention. However, existing approaches provide only sub-optimal utility and are expensive in computation and communication for set-valued data distribution estimation and heavy-hitter identification. In this paper, we propose a utility-optimal and efficient set-valued data publication method (i.e., the Wheel mechanism). On the user side, the computational complexity is only $O(\min \lbrace m\log m, m e^\epsilon \rbrace )$ and the communication cost is $O(\epsilon +\log m)$ bits, where $m$ is the number of items, $d$ is the domain size, and $\epsilon$ is the privacy budget, while existing approaches usually depend on $O(d)$ or $O(\log d)$ ($d \gg m$). Our theoretical analyses reveal that the estimation error is reduced from the previously known $O(\frac{m^{2} d}{n\epsilon ^{2}})$ to the optimal rate $O(\frac{m d}{n\epsilon ^{2}})$. Additionally, for heavy-hitter identification, we present a variant of the Wheel mechanism as an efficient frequency oracle, entailing only $O(\sqrt{n})$ computational complexity. This heavy-hitter protocol achieves an identification bar of $\tilde{O}(\frac{1}{\epsilon }\sqrt{\frac{m}{n} \log d})$, a reduction by a factor of $\sqrt{m}$ relative to existing protocols. Extensive experiments demonstrate that our methods are 3-100x faster than existing approaches and have optimized statistical efficiency.
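To make the asymptotic claims in the abstract concrete, the following minimal sketch evaluates the stated bound expressions with all hidden constants (and the logarithmic factors hidden by $\tilde{O}$) dropped. It does not implement the Wheel mechanism. The function names, the parameter values, and the reading of $n$ as the number of users are assumptions made for illustration only.

```python
import math

# Bound expressions copied from the abstract, with constants dropped.
# These are NOT measured errors; they only show how the rates scale.

def prior_error_rate(m, d, n, eps):
    """Previously known estimation error rate: m^2 * d / (n * eps^2)."""
    return m**2 * d / (n * eps**2)

def wheel_error_rate(m, d, n, eps):
    """Optimal rate stated for the Wheel mechanism: m * d / (n * eps^2)."""
    return m * d / (n * eps**2)

def user_compute_rate(m, eps):
    """User-side computation: min(m * log m, m * e^eps)."""
    return min(m * math.log(m), m * math.exp(eps))

def identification_bar_rate(m, d, n, eps):
    """Heavy-hitter identification bar: (1/eps) * sqrt((m/n) * log d)."""
    return (1.0 / eps) * math.sqrt(m / n * math.log(d))

# Hypothetical parameters: m items per user, domain size d, n users (assumed).
m, d, n, eps = 8, 1024, 100_000, 1.0

improvement = prior_error_rate(m, d, n, eps) / wheel_error_rate(m, d, n, eps)
print("error-rate improvement factor:", improvement)  # equals m by construction
print("user-side compute rate:", user_compute_rate(m, eps))
print("identification bar rate:", identification_bar_rate(m, d, n, eps))
```

With constants ignored, the ratio of the two error rates is exactly $m$, and none of the user-side quantities depend on the domain size $d$, which is the point of the comparison with $O(d)$ or $O(\log d)$ approaches.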

Keywords

local differential privacy, frequency estimation, heavy-hitter identification, distributed data aggregation
