Bridging Speed and Accuracy to Approximate K-Nearest Neighbor Search
arXiv (2024)
Abstract
Approximate K-Nearest Neighbor (AKNN) search in high-dimensional spaces is a
critical yet challenging problem. The efficiency of AKNN search largely depends
on the computation of distances, a process that significantly affects the
runtime. To improve computational efficiency, existing work often opts for
estimating approximate distances rather than computing exact distances, at the
cost of reduced AKNN search accuracy. The recent method of ADSampling has
attempted to mitigate this problem by using random projection for distance
approximations and adjusting these approximations based on error bounds to
improve accuracy. However, ADSampling faces limitations in effectiveness and
generality, mainly due to the suboptimality of its distance approximations and
its heavy reliance on random projection matrices to obtain error bounds. In
this study, we propose a new method that uses an optimal orthogonal projection
instead of random projection, thereby providing improved distance
approximations. Moreover, our method uses error quantiles instead of error
bounds for approximation adjustment, and the derivation of error quantiles can
be made independent of the projection matrix, thus extending the generality of
our approach. Extensive experiments confirm the superior efficiency and
effectiveness of the proposed method. In particular, compared to the
state-of-the-art method of ADSampling, our method achieves a speedup of 1.6 to
2.1 times on real datasets with almost no loss of accuracy.
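
The two ideas in the abstract, an orthogonal projection for distance approximation and a quantile-based adjustment, can be illustrated with a minimal sketch. It is not the authors' algorithm. It assumes that PCA serves as the orthogonal projection (the abstract does not say how the "optimal" projection is obtained) and that the error quantile is estimated empirically from sampled pairs of data points. All function names and parameters (`fit_orthogonal_projection`, `ratio_quantile`, `knn_with_early_reject`, `d`, `level`) are hypothetical.

```python
import numpy as np


def fit_orthogonal_projection(data):
    """Assumption: PCA stands in for the paper's orthogonal projection.

    Returns the data mean and a D x D matrix whose columns are orthonormal
    directions ordered by decreasing variance (requires n_points >= D).
    """
    mean = data.mean(axis=0)
    _, _, vt = np.linalg.svd(data - mean, full_matrices=False)
    return mean, vt.T


def ratio_quantile(rotated, d, level, n_pairs, rng):
    """Empirical quantile of partial_sq_dist / exact_sq_dist over random pairs.

    Assumption: the quantile is estimated by sampling, not derived in closed form.
    """
    i = rng.integers(0, len(rotated), size=n_pairs)
    j = rng.integers(0, len(rotated), size=n_pairs)
    diff = rotated[i] - rotated[j]
    exact = np.einsum("ij,ij->i", diff, diff)
    partial = np.einsum("ij,ij->i", diff[:, :d], diff[:, :d])
    valid = exact > 0  # drops self-pairs and exact duplicates
    return float(np.quantile(partial[valid] / exact[valid], level))


def knn_with_early_reject(query, rotated, mean, basis, k, d, ratio):
    """Linear scan that rejects candidates using only the first d rotated dims.

    The partial squared distance never exceeds the exact one, so ratio = 1.0
    gives exact pruning. A ratio below 1.0 inflates the estimate and prunes
    more aggressively, at the risk of occasionally missing a true neighbor.
    """
    q = (query - mean) @ basis
    best = []  # (squared distance, index) pairs, ascending, at most k entries
    for idx, x in enumerate(rotated):
        head = q[:d] - x[:d]
        partial = float(head @ head)
        if len(best) == k and partial / ratio > best[-1][0]:
            continue  # estimated distance already exceeds the current k-th best
        tail = q[d:] - x[d:]
        exact = partial + float(tail @ tail)
        if len(best) < k or exact < best[-1][0]:
            best.append((exact, idx))
            best.sort()
            best = best[:k]
    return [idx for _, idx in best]
```

To use the sketch, rotate the dataset once with `rotated = (data - mean) @ basis`, estimate `ratio` with `ratio_quantile` at a level such as 0.99, and pass both to `knn_with_early_reject`. The level sets the trade-off: a lower level yields a smaller ratio and more pruning, but a higher chance of discarding a true neighbor.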