Top-k spatial distance joins

GeoInformatica (2020)

Abstract
Top-k joins have been extensively studied for the case where numerical-valued attributes are joined on an equality predicate. Other types of join attributes and predicates have received little to no attention. In this paper, we consider spatial objects that are assigned a score (e.g., a ranking). Given two collections R, S of such objects and a spatial distance threshold 𝜖, we introduce the top-k spatial distance join (k-SDJoin), which identifies the k pairs of objects that have the highest combined score (based on an aggregate function γ) among all object pairs in R × S with a spatial distance of at most 𝜖. State-of-the-art methods for relational top-k joins can be adapted for k-SDJoin, but their focus is on minimizing the number of objects accessed from the inputs; when spatial objects are joined, however, the computational cost can easily become the bottleneck. In view of this, we propose a novel evaluation algorithm that greatly reduces the computational cost without compromising the access cost. The main idea is to access and efficiently join blocks of objects from each collection, using appropriate bounds to avoid computing the entire spatial 𝜖-distance join. As the performance of our solution relies heavily on the size of the input blocks, we devise an approach for automated block size tuning, enhanced by a novel generic model for estimating the number of objects to be accessed from each input. Contrary to previous efforts, our model employs cheap-to-compute statistics and requires no prior knowledge of the data distribution. Our extensive experimental analysis demonstrates the efficiency of our algorithm compared to methods based on existing literature that prioritize either the ranking or the spatial join component of k-SDJoin queries.
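To make the query semantics concrete, the following Python sketch gives a brute-force reference for k-SDJoin and a simple block-wise variant that stops early once a score bound is met. It is an illustration only and not the algorithm proposed in the paper: the (x, y, score) tuple layout, Euclidean distance, the function names, the round-robin block access, the default γ = sum, and the HRJN-style threshold used as the bound are all assumptions made here. The early stop is valid only when γ is monotone in both scores.

```python
import heapq
import math
from itertools import product

# Assumed object layout for illustration: (x, y, score).

def within(r, s, eps):
    """True if the Euclidean distance between r and s is at most eps."""
    return math.hypot(r[0] - s[0], r[1] - s[1]) <= eps

def ksdjoin_naive(R, S, eps, k, gamma=lambda a, b: a + b):
    """Reference semantics: score every pair within eps, keep the k best."""
    pairs = ((gamma(r[2], s[2]), r, s)
             for r, s in product(R, S) if within(r, s, eps))
    return heapq.nlargest(k, pairs, key=lambda t: t[0])

def ksdjoin_blocked(R, S, eps, k, block=2, gamma=lambda a, b: a + b):
    """Hypothetical block-wise variant: read both inputs in descending
    score order, one block at a time, and stop once no unseen pair can
    beat the current k-th best score. Assumes gamma is monotone."""
    if k <= 0 or not R or not S:
        return []
    R = sorted(R, key=lambda o: o[2], reverse=True)
    S = sorted(S, key=lambda o: o[2], reverse=True)
    seen_r = seen_s = 0
    top = []        # min-heap of (score, tie_breaker, r, s)
    counter = 0     # tie breaker so the heap never compares objects
    turn_r = True
    while seen_r < len(R) or seen_s < len(S):
        # Alternate between inputs, skipping an exhausted one.
        if seen_r >= len(R):
            turn_r = False
        elif seen_s >= len(S):
            turn_r = True
        if turn_r:
            new = R[seen_r:seen_r + block]
            candidates = [(r, s) for r in new for s in S[:seen_s]]
            seen_r += len(new)
        else:
            new = S[seen_s:seen_s + block]
            candidates = [(r, s) for s in new for r in R[:seen_r]]
            seen_s += len(new)
        turn_r = not turn_r
        # Join the new block against everything seen from the other input.
        for r, s in candidates:
            if within(r, s, eps):
                item = (gamma(r[2], s[2]), counter, r, s)
                counter += 1
                if len(top) < k:
                    heapq.heappush(top, item)
                elif item[0] > top[0][0]:
                    heapq.heapreplace(top, item)
        # Upper bound on the score of any pair with an unseen object.
        bounds = []
        if seen_r < len(R):
            bounds.append(gamma(R[seen_r][2], S[0][2]))
        if seen_s < len(S):
            bounds.append(gamma(R[0][2], S[seen_s][2]))
        if len(top) == k and (not bounds or top[0][0] >= max(bounds)):
            break
    ranked = sorted(top, key=lambda t: t[0], reverse=True)
    return [(score, r, s) for score, _, r, s in ranked]

# Small usage example with made-up data.
R = [(0, 0, 9), (5, 5, 8), (1, 1, 3)]
S = [(0.5, 0, 7), (5, 5.5, 6), (9, 9, 10)]
for score, r, s in ksdjoin_blocked(R, S, eps=1.0, k=2):
    print(score, r, s)
```

The blocked variant still tests every pair among the objects it has read, so it mainly illustrates how a score bound limits the number of objects accessed; the block size parameter here is fixed by hand, whereas the abstract describes tuning it automatically.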
Keywords
Top-k join, Spatial join