Efficiently Supporting Multiple Similarity Queries for Mining in Metric Databases

ICDE(2000)

Abstract
Metric databases are databases in which a metric distance function is defined for pairs of database objects. In such databases, similarity queries in the form of range queries or k-nearest neighbor queries are the most important queries. In traditional query processing, single queries are issued independently by different users. In many data mining applications, however, the database is typically explored by iteratively asking similarity queries for the answers of previous similarity queries. In this paper, we introduce a generic scheme for such data mining algorithms and we investigate two orthogonal approaches, reducing I/O cost as well as CPU cost, to speed up the processing of multiple similarity queries. The proposed techniques apply to any type of similarity query and to an implementation based on an index or using a sequential scan. Parallelization yields an additional impressive speed-up. An extensive performance evaluation confirms the efficiency of our approach.
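
The two cost reductions named in the abstract can be illustrated with a minimal sketch. It assumes a sequential scan, a batch of range queries sharing one radius, and pruning based on the triangle inequality, which holds for any metric. The abstract does not spell out the paper's exact techniques, so the function name `multi_range_query`, its parameters, and the pruning rule below are hypothetical illustrations, not the authors' implementation.

```python
import math


def euclidean(a, b):
    # Euclidean distance between two equal-length coordinate tuples.
    return math.dist(a, b)


def multi_range_query(database, queries, eps, dist=euclidean):
    """Answer several range queries in one pass over the database.

    Hypothetical sketch: each object is read once for all queries
    (shared I/O), and the triangle inequality is used to skip
    distance computations where possible (reduced CPU cost).
    Returns (answers, number_of_object_query_distances_computed).
    """
    m = len(queries)
    # Distances between query objects, computed once up front.
    qq = [[dist(queries[i], queries[j]) for j in range(m)] for i in range(m)]
    answers = [[] for _ in range(m)]
    computed = 0

    for obj in database:  # single scan serves the whole batch
        known = []  # (query index i, d(obj, q_i)) already computed for obj
        for j in range(m):
            lower, upper = 0.0, math.inf
            for i, d_i in known:
                # |d(o,q_i) - d(q_i,q_j)| <= d(o,q_j) <= d(o,q_i) + d(q_i,q_j)
                lower = max(lower, abs(d_i - qq[i][j]))
                upper = min(upper, d_i + qq[i][j])
            if lower > eps:
                continue  # obj cannot be within eps of q_j
            if upper <= eps:
                answers[j].append(obj)  # obj is certainly within eps of q_j
                continue
            d_j = dist(obj, queries[j])
            computed += 1
            known.append((j, d_j))
            if d_j <= eps:
                answers[j].append(obj)
    return answers, computed
```

Calling `multi_range_query(points, query_points, eps)` returns one answer list per query plus the number of object-to-query distances actually evaluated. The savings grow when the query objects lie close together, which is the typical situation when new queries are issued for the answers of previous ones.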
Keywords
database object,similarity query,multiple similarity query,metric databases,additional impressive speed-up,I/O cost,CPU cost,multiple similarity queries,data mining algorithm,previous similarity query,data mining application,range queries,Euclidean distance,k nearest neighbor,distance function,histograms,range query,data mining,indexation