Brief Announcement: Scalable Diversity Maximization via Small-size Composable Core-sets

The 31st ACM Symposium on Parallelism in Algorithms and Architectures (2019)

Abstract
In this paper, we study the diversity maximization problem (a.k.a. the maximum dispersion problem), in which, given a set of n objects in a metric space, one wants to find a subset of k distinct objects with the maximum sum of pairwise distances. We address this problem using the distributed framework known as randomized composable core-sets [3]. Unlike previous work, we study small-size core-set algorithms that allow the minimum possible intermediate output size (and hence achieve a large speed-up in the computation and increased parallelism), while at the same time improving significantly over the approximation guarantees of state-of-the-art core-set-based algorithms. In particular, we present a simple distributed algorithm that achieves an almost optimal communication complexity and asymptotically achieves an approximation factor of 1/2, matching the best known global approximation factor for this problem. Our algorithms are scalable and practical, as shown by our extensive empirical evaluation on large datasets, and they can be easily used in major distributed computing systems such as MapReduce. Furthermore, we show empirically that, on real-life instances, using small-size core-set algorithms allows speed-ups of more than 68x in running time relative to large-size core-sets, while achieving close-to-optimal solutions with an approximation factor above 90%.
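The composable core-set paradigm the abstract describes can be illustrated with a minimal single-process sketch: randomly partition the input, run a local algorithm on each part to extract a size-k core-set, take the union of the core-sets, and run the same algorithm on the union to pick the final k points. The greedy local rule and all function names below are illustrative assumptions for exposition, not the authors' exact algorithm.

```python
import random
from itertools import combinations


def dist(p, q):
    # Euclidean distance; any metric would do in this framework.
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5


def diversity(S):
    # Objective: sum of pairwise distances of the selected subset.
    return sum(dist(p, q) for p, q in combinations(S, 2))


def greedy_subset(points, k):
    # Illustrative local algorithm (assumption): greedily add the point
    # with the largest total distance to the points chosen so far.
    chosen = [points[0]]
    while len(chosen) < k:
        candidates = [p for p in points if p not in chosen]
        if not candidates:
            break
        chosen.append(max(candidates,
                          key=lambda p: sum(dist(p, c) for c in chosen)))
    return chosen


def composable_coreset_diversity(points, k, machines=4, seed=0):
    # Step 1: random partition across simulated machines.
    rng = random.Random(seed)
    parts = [[] for _ in range(machines)]
    for p in points:
        parts[rng.randrange(machines)].append(p)
    # Step 2: each machine emits a small core-set (here, size k).
    union = []
    for part in parts:
        if part:
            union.extend(greedy_subset(part, k))
    # Step 3: aggregate and solve on the union of core-sets.
    return greedy_subset(union, k)
```

In the distributed setting, step 2 runs in parallel (e.g., as MapReduce mappers) and only the small core-sets, not the raw partitions, are communicated to the aggregator, which is the source of the communication savings the paper emphasizes.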
Keywords
distributed algorithms, diversity maximization, randomized composable core-sets, small-size core-sets