Node-Differentially Private Estimation of the Number of Connected Components

Proceedings of the 42nd ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, PODS 2023 (2023)

Abstract
We design the first node-differentially private algorithm for approximating the number of connected components in a graph. Given a database representing an $n$-vertex graph $G$ and a privacy parameter $\varepsilon$, our algorithm runs in polynomial time and, with probability $1 - o(1)$, has additive error $\tilde{O}(\Delta^* \ln\ln n / \varepsilon)$, where $\Delta^*$ is the smallest possible maximum degree of a spanning forest of $G$. Node-differentially private algorithms are known only for a small number of database analysis tasks. A major obstacle for designing such an algorithm for the number of connected components is that this graph statistic is not robust to adding one node with arbitrary connections (a change that node-differential privacy is designed to hide): every graph is a neighbor of a connected graph. We overcome this by designing a family of efficiently computable Lipschitz extensions of the number of connected components or, equivalently, the size of a spanning forest. The construction of the extensions, which is at the core of our algorithm, is based on the forest polytope of $G$. We prove several combinatorial facts about spanning forests, in particular, that a graph with no induced $\Delta$-stars has a spanning forest of degree at most $\Delta$. With this fact, we show that our Lipschitz extensions for the number of connected components equal the true value of the function for the largest possible monotone families of graphs. More generally, on all monotone sets of graphs, the $\ell_\infty$ error of our Lipschitz extensions is nearly optimal.
Keywords
Differential Privacy,Graph Databases,Spanning Forest
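
Two facts stated in the abstract can be made concrete with a small sketch: the number of connected components equals the number of vertices minus the size of a spanning forest, and adding a single node with arbitrary connections can collapse any graph into one component. The Python below is an illustration only and is not the paper's algorithm. The names `spanning_forest`, `num_components`, and `add_hub` are hypothetical, and the forest it builds is an arbitrary one, so it makes no attempt to minimize the maximum degree or to compute the Lipschitz extensions.

```python
from typing import Iterable, List, Tuple

Edge = Tuple[int, int]


def spanning_forest(n: int, edges: Iterable[Edge]) -> List[Edge]:
    """Return the edges of one (arbitrary) spanning forest on vertices 0..n-1."""
    parent = list(range(n))

    def find(v: int) -> int:
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    forest: List[Edge] = []
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:  # the edge joins two different trees, so keep it
            parent[ru] = rv
            forest.append((u, v))
    return forest


def num_components(n: int, edges: Iterable[Edge]) -> int:
    """Number of connected components = n - (size of a spanning forest)."""
    return n - len(spanning_forest(n, edges))


def add_hub(n: int, edges: Iterable[Edge]) -> Tuple[int, List[Edge]]:
    """Build a node-neighbor: add vertex n adjacent to every existing vertex."""
    return n + 1, list(edges) + [(n, v) for v in range(n)]


# A 6-vertex graph with components {0,1}, {2,3}, {4}, {5}.
n, edges = 6, [(0, 1), (2, 3)]
before = num_components(n, edges)

# One added node connects everything, so the statistic drops to 1.
n_hub, edges_hub = add_hub(n, edges)
after = num_components(n_hub, edges_hub)
```

Here `before` is 4 and `after` is 1. On an edgeless graph with $n$ vertices the same change moves the count from $n$ to 1, which is why noise calibrated to the worst-case change of the raw statistic would be too large to be useful, and why the abstract turns to Lipschitz extensions instead.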