Rank-Based Similarity Search: Reducing the Dimensional Dependence

IEEE Transactions on Pattern Analysis and Machine Intelligence (2015)

Abstract
This paper introduces a data structure for k-NN search, the Rank Cover Tree (RCT), whose pruning tests rely solely on the comparison of similarity values; other properties of the underlying space, such as the triangle inequality, are not employed. Objects are selected according to their ranks with respect to the query object, allowing much tighter control on the overall execution costs. A formal theoretical analysis shows that with very high probability, the RCT returns a correct query result in time that depends very competitively on a measure of the intrinsic dimensionality of the data set. The experimental results for the RCT show that non-metric pruning strategies for similarity search can be practical even when the representational dimension of the data is extremely high. They also show that the RCT is capable of meeting or exceeding the level of performance of state-of-the-art methods that make use of metric pruning or other selection tests involving numerical constraints on distance values.
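The idea of selecting objects by rank using only comparisons of similarity values can be illustrated with a minimal Python sketch. This is not the Rank Cover Tree itself: it shows a single selection step over a flat candidate list, and the names `rank_select`, `abs_diff`, and `keep` are hypothetical, introduced here for illustration only.

```python
from functools import cmp_to_key
import heapq


def rank_select(query, candidates, dist, keep):
    """Return the `keep` candidates ranked closest to `query`.

    Candidates are ordered purely by comparing distance values
    against one another; no bound derived from the triangle
    inequality (or any other metric property) is used.
    """
    def compare(a, b):
        da, db = dist(query, a), dist(query, b)
        # Standard three-way comparison: -1, 0, or 1.
        return (da > db) - (da < db)

    # nsmallest returns the `keep` best-ranked items in ascending rank order.
    return heapq.nsmallest(keep, candidates, key=cmp_to_key(compare))


def abs_diff(x, y):
    """Toy dissimilarity on numbers (an assumption for this example)."""
    return abs(x - y)


points = [9.0, 2.5, 4.0, 7.5, 1.0, 3.2]
nearest = rank_select(3.0, points, abs_diff, keep=3)
print(nearest)  # [3.2, 2.5, 4.0]
```

Because the number of objects retained is fixed by `keep` rather than by a distance threshold, the work done in such a step is bounded in advance, which is the kind of cost control the abstract attributes to rank-based selection.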
Keywords
pattern classification, probability, query processing, search problems, tree data structures, RCT, data structure, formal theoretical analysis, intrinsic dimensionality, k-NN search, k-nearest-neighbor classification, nonmetric pruning strategies, overall execution costs, pruning tests, query object, rank cover tree, rank-based similarity search, nearest neighbor search, rank-based search, measurement, data mining, indexes, navigation