Probabilistic clustering of high dimensional norms.
SODA (2017)
Abstract
Separating decompositions of metric spaces are an important randomized clustering paradigm that was formulated by Bartal in [Bar96] and is defined as follows. Given a metric space (X, d_X), its modulus of separated decomposability, denoted SEP(X, d_X), is the infimum over those σ ∈ (0, ∞] such that for every finite subset S ⊆ X and every Δ > 0 there exists a distribution over random partitions P of S into sets of diameter at most Δ such that for every x, y ∈ S the probability that x and y do not fall into the same cluster of the random partition P is at most σ d_X(x, y)/Δ. Here we obtain new bounds on SEP(X, ‖·‖_X) when (X, ‖·‖_X) is a finite dimensional normed space, yielding, as a special case, that [EQUATION] for every n ∈ ℕ. More generally, [EQUATION] for every p ∈ [2, ∞]. This improves over the work [CCG+98] of Charikar, Chekuri, Goel, Guha, and Plotkin, who obtained this bound when p = 2, yet for p ∈ (2, ∞] they obtained the asymptotically weaker estimate SEP(ℓ_p^n) ≲ n^{1−1/p}. One should note that it was claimed in [CCG+98] that the bound SEP(ℓ_p^n) ≲ n^{1−1/p} is sharp for every p ∈ [2, ∞], and in particular it was claimed in [CCG+98] that SEP(ℓ_∞^n) ≍ n. However, the above results show that this claim of [CCG+98] is incorrect for every p ∈ (2, ∞]. Our new bounds on the modulus of separated decomposability rely on extremal results for orthogonal hyperplane projections of convex bodies, specifically using the work [BN02] of Barthe and the author. This yields additional refined estimates, an example of which is that for every n ∈ ℕ and k ∈ {1, ..., n} we have [EQUATION], where (ℓ_2^n)_{⩽k} denotes the subset of ℝ^n consisting of all those vectors that have at most k nonzero entries, equipped with the Euclidean metric. The above statements have implications for the Lipschitz extension problem through its connection to random partitions that was developed by Lee and the author in [LN04, LN05].
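To make the definition of SEP concrete, the following is a minimal Python sketch of the elementary randomly shifted grid partition of ℝ^n with the ℓ_∞ metric. It is not the construction of this paper; it only illustrates the easy estimate SEP(ℓ_∞^n) ≲ n that the results above improve upon. Each cluster is an axis-parallel cube of side Δ, so its ℓ_∞ diameter is at most Δ, and a union bound over the coordinates shows that x and y are separated with probability at most n‖x − y‖_∞/Δ. The function name `separation_probability` and its parameters are hypothetical, chosen for illustration only.

```python
import numpy as np

def separation_probability(x, y, delta, trials=20000, seed=0):
    """Monte Carlo estimate of P[x and y land in different cubes].

    Illustrative assumption: the random partition is the grid of
    axis-parallel cubes of side `delta`, translated by a shift that is
    uniform on [0, delta)^n. Each cube has l_infinity diameter <= delta.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # One independent uniform shift vector per trial.
    shifts = rng.uniform(0.0, delta, size=(trials, x.size))
    # Integer cell coordinates of x and y under each shifted grid.
    cells_x = np.floor((x + shifts) / delta)
    cells_y = np.floor((y + shifts) / delta)
    # x and y are separated when their cells differ in some coordinate.
    separated = np.any(cells_x != cells_y, axis=1)
    return float(separated.mean())

if __name__ == "__main__":
    n, delta = 4, 1.0
    x = np.zeros(n)
    y = np.full(n, 0.1)
    estimate = separation_probability(x, y, delta)
    bound = n * np.max(np.abs(x - y)) / delta  # sigma = n in the definition
    print(f"estimated separation probability: {estimate:.3f}")
    print(f"union bound n * ||x - y||_inf / delta: {bound:.3f}")
```

For this grid scheme the coordinates are cut independently, so when every |x_i − y_i| ≤ Δ the exact separation probability is 1 − ∏_i (1 − |x_i − y_i|/Δ), which the estimate should approach as `trials` grows.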
Given a metric space (X, d_X), let e(X) denote the infimum over those K ∈ (0, ∞] such that for every Banach space Y and every subset S ⊆ X, every 1-Lipschitz function f : S → Y has a K-Lipschitz extension to all of X. Johnson, Lindenstrauss and Schechtman proved in [JLS86] that e(X) ≲ dim(X) for every finite dimensional normed space (X, ‖·‖_X). It is a longstanding open problem to determine the correct asymptotic dependence on dim(X) in this context, with the best known lower bound, due to Johnson and Lindenstrauss [JL84], being that the quantity e(X) must sometimes be at least a constant multiple of [EQUATION]. In particular, the previously best known upper bound on e(ℓ_∞^n) was the O(n) estimate of [JLS86]. It is shown here that for every n ∈ ℕ we have [EQUATION], thus answering (up to logarithmic factors) a question that was posed by Brudnyi and Brudnyi in [BB05, Problem 2]. More generally, [EQUATION] for every p ∈ [2, ∞], thus resolving (negatively) a conjecture of Brudnyi and Brudnyi in [BB05, Conjecture 5].