Efficient Density Evaluation for Smooth Kernels

2018 IEEE 59th Annual Symposium on Foundations of Computer Science (FOCS)(2018)

Abstract
Given a kernel function k(·,·) and a dataset P ⊂ R^d, the kernel density function of P at a point x ∈ R^d is KDF_P(x) := (1/|P|) Σ_{y∈P} k(x, y). Kernel density evaluation has numerous applications in scientific computing, statistics, computer vision, machine learning, and other fields. In all of them it is necessary to evaluate KDF_P(x) quickly, often for many inputs x and large point sets P. In this paper we present a collection of algorithms for efficient KDF evaluation under the assumption that the kernel k is "smooth", i.e., its value changes at most polynomially with the distance. This assumption is satisfied by several well-studied kernels, including the (generalized) t-student kernel and the rational quadratic kernel. For smooth kernels, we give a data structure that, after O(dn log(Φn)/ε^2) preprocessing, estimates KDF_P(x) up to a factor of 1 ± ε in O(d log(Φn)/ε^2) time, where Φ is the aspect ratio. The log(Φn) term can be replaced by log n under an additional decay condition on k, which is satisfied by the aforementioned examples. We further extend the results in two ways. First, we use low-distortion embeddings to extend the results to kernels defined over spaces other than ℓ_2. The key feature of this reduction is that the distortion of the embedding affects only the running time of the algorithm, not the accuracy of the estimation. As a result, we obtain (1+ε)-approximate estimation algorithms for kernels over other ℓ_p norms, Earth-Mover Distance, and other metric spaces. Second, for smooth kernels that are decreasing with distance, we present a general reduction from density estimation to approximate near neighbor search in the underlying space. This allows us to construct algorithms for general doubling metrics, as well as alternative algorithms for ℓ_p norms and other spaces.
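For orientation, the definition of KDF_P(x) above can be evaluated directly by summing over all of P. The sketch below is only that naive baseline, costing O(d·|P|) time per query; it is not the paper's data structure. The kernel form 1/(1 + ||x − y||²)^β, the parameter name beta, and the function names are assumptions chosen for illustration.

```python
def rational_quadratic(x, y, beta=1.0):
    """One common rational quadratic form (assumed here):
    k(x, y) = 1 / (1 + ||x - y||_2^2)^beta."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return 1.0 / (1.0 + sq_dist) ** beta


def kdf_exact(points, x, kernel=rational_quadratic):
    """Exact KDF_P(x) = (1/|P|) * sum over y in P of k(x, y).

    This is the brute-force definition, linear in |P| per query.
    """
    if not points:
        raise ValueError("P must be non-empty")
    return sum(kernel(x, y) for y in points) / len(points)


# Tiny illustrative dataset in R^2 (made up for this example).
P = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
density = kdf_exact(P, (0.0, 0.0))  # kernel values 1, 1/2, 1/5 averaged
```

With this kernel the value decays only polynomially in the distance, which is the kind of "smooth" behavior the abstract refers to; a (1 ± ε)-approximate method would aim to return a value close to `density` without touching every point of P.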
Keywords
Heavy-tailed kernels, Dimensionality Reduction, Quad-Trees, Hashing, Approximate Nearest Neighbor Search