Distributed Spatio-Temporal k Nearest Neighbors Join

Geographic Information Systems (2021)

Abstract
The rapid development of positioning technology produces an extremely large volume of spatio-temporal data with various geometry types, such as points, line strings, polygons, or a mix of them. As one of the most basic but time-consuming operations, the k nearest neighbors join (kNN join) has attracted much attention. However, most existing work on kNN join either ignores temporal information or considers point data only. This paper introduces a new problem, the ST-kNN join, which considers both spatial closeness and temporal concurrency. To support ST-kNN join efficiently over a huge amount of spatio-temporal data of any geometry type, we propose a novel distributed solution based on Apache Spark. Specifically, our method adopts a two-round join framework. In the first round, we propose a new spatio-temporal partitioning method that achieves spatio-temporal locality and load balance at the same time. We also propose a lightweight index structure, the Time Range Count Index (TRC-index), to enable efficient ST-kNN join. In the second round, to reduce data transmission among machines, we remove duplicates based on spatio-temporal reference points before shuffling local results. Extensive experiments on three large real-world datasets show that our method is much more scalable than the baselines and runs 9X faster. A demonstration system is deployed and the source code is released.
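To make the problem statement concrete, the following is a minimal single-machine sketch of an ST-kNN join, not the distributed Spark method of the paper. It rests on assumptions the abstract does not spell out: objects are reduced to points, "temporal concurrency" is taken to mean that two closed time ranges overlap (optionally after widening by a hypothetical threshold delta), and "spatial closeness" is plain Euclidean distance. The names STPoint, time_overlaps, and st_knn_join are illustrative only.

```python
from dataclasses import dataclass
from math import hypot
from typing import Dict, List


@dataclass(frozen=True)
class STPoint:
    """Hypothetical record: a point with a closed time range [t_start, t_end]."""
    oid: str
    x: float
    y: float
    t_start: float
    t_end: float


def time_overlaps(r: STPoint, s: STPoint, delta: float = 0.0) -> bool:
    # Assumed notion of temporal concurrency: the two closed time ranges
    # intersect after r's range is widened by delta on both sides.
    return s.t_start <= r.t_end + delta and r.t_start - delta <= s.t_end


def st_knn_join(R: List[STPoint], S: List[STPoint], k: int,
                delta: float = 0.0) -> Dict[str, List[str]]:
    """Brute-force join: for each r in R, return the ids of the k spatially
    nearest objects of S among those that are temporally concurrent with r."""
    result: Dict[str, List[str]] = {}
    for r in R:
        # Temporal filter first, then rank the survivors by distance.
        candidates = [s for s in S if time_overlaps(r, s, delta)]
        # Ties in distance are broken by id so the output is deterministic.
        candidates.sort(key=lambda s: (hypot(s.x - r.x, s.y - r.y), s.oid))
        result[r.oid] = [s.oid for s in candidates[:k]]
    return result


# Made-up sample data for illustration only.
R = [STPoint("r1", 0.0, 0.0, 10.0, 20.0)]
S = [
    STPoint("s1", 1.0, 0.0, 15.0, 25.0),  # overlaps r1 in time, distance 1
    STPoint("s2", 0.5, 0.0, 30.0, 40.0),  # spatially closest, but later in time
    STPoint("s3", 3.0, 0.0, 0.0, 12.0),   # overlaps r1 in time, distance 3
    STPoint("s4", 2.0, 0.0, 18.0, 19.0),  # overlaps r1 in time, distance 2
]

print(st_knn_join(R, S, k=2))              # s2 is excluded by the time filter
print(st_knn_join(R, S, k=2, delta=10.0))  # widening the range admits s2
```

This brute-force version compares every pair, costing O(|R| x |S|) comparisons, which is the cost that the partitioning and indexing steps described in the abstract are meant to avoid.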