To Reveal the Performance Secrets of the Newest NN Searching Algorithm

msra(2008)

Abstract
Nearest neighbor (NN) search is widely used in spatial and multimedia databases. The incremental NN (INN) search algorithm is regarded as the optimal NN search algorithm because it accesses the minimum number of nodes, and it can be used whether or not the number of objects to be retrieved is fixed in advance. This paper presents an analytical model for estimating the performance of the INN search algorithm. For the first time, the model takes m (the number of neighbor objects finally reported), n (the cardinality of the database), and d (the dimensionality) as parameters, and it focuses on the number of node accesses (not only the number of accessed leaf nodes) and the length of the priority queue. Using the model, the curse of dimensionality is revealed mathematically for an arbitrary number of retrieved NN objects. The model also does two things for the first time: (1) the two key factors d_m (the distance from the m-th NN object to the query point) and σ_h (the side length of each node) are estimated using both their upper bounds and their lower bounds, which improves the effectiveness of the model, especially in high-dimensional spaces; (2) the possible difference in fanout among the leaf nodes, the root node, and the other nodes is taken into account. The theoretical analysis is verified by experiments.
Keywords
priority queue, nearest neighbor, lower bound, search algorithm, spatial database, upper bound
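
The abstract names two cost measures for INN search: the number of node accesses and the length of the priority queue. As a rough illustration of what those quantities are, the sketch below runs a generic best-first incremental search over a tiny hand-built tree and records both. It is not the paper's model or implementation. The Node class, min_dist, inn_search, the stats dictionary, and the sample points are all hypothetical names and data chosen for this example, and Euclidean distance over axis-aligned bounding boxes is an assumption.

```python
import heapq
import itertools
import math


class Node:
    """Hypothetical tree node: a leaf holds points, an inner node holds children."""

    def __init__(self, points=(), children=()):
        self.points = list(points)
        self.children = list(children)
        # The bounding box is derived from the points, or from the child boxes.
        corners = self.points or [c for ch in self.children for c in (ch.lo, ch.hi)]
        dims = range(len(corners[0]))
        self.lo = tuple(min(p[i] for p in corners) for i in dims)
        self.hi = tuple(max(p[i] for p in corners) for i in dims)


def min_dist(q, node):
    """Smallest Euclidean distance from q to the node's bounding box."""
    return math.sqrt(sum(max(lo - x, 0.0, x - hi) ** 2
                         for x, lo, hi in zip(q, node.lo, node.hi)))


def inn_search(root, q, stats):
    """Yield (distance, point) pairs in non-decreasing distance from q."""
    tie = itertools.count()  # tie-breaker so the heap never compares payloads
    heap = [(min_dist(q, root), next(tie), root)]
    stats["node_accesses"] = 0
    stats["max_queue_len"] = 1
    while heap:
        dist, _, item = heapq.heappop(heap)
        if not isinstance(item, Node):
            yield dist, item  # a point at the head of the queue is the next NN
            continue
        stats["node_accesses"] += 1  # counts leaf and non-leaf nodes alike
        for p in item.points:
            heapq.heappush(heap, (math.dist(q, p), next(tie), p))
        for child in item.children:
            heapq.heappush(heap, (min_dist(q, child), next(tie), child))
        stats["max_queue_len"] = max(stats["max_queue_len"], len(heap))


# Usage: m does not have to be fixed in advance; the caller stops when done.
leaves = [Node(points=[(0.0, 0.0), (1.0, 1.0)]),
          Node(points=[(5.0, 5.0), (6.0, 5.0)]),
          Node(points=[(9.0, 0.0), (9.0, 1.0)])]
root = Node(children=leaves)
stats = {}
first_two = list(itertools.islice(inn_search(root, (0.9, 0.9), stats), 2))
print(first_two, stats)
```

Because the search is written as a generator, the caller can request one more neighbor at any time, which mirrors the abstract's point that the number of objects to retrieve need not be known beforehand.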