SQUID: subtrajectory query in trillion-scale GPS database

VLDB Journal (2023)

Abstract
Subtrajectory query is a fundamental operator in mobility data management and is useful in applications such as trajectory clustering, co-movement pattern mining, and contact tracing in epidemiology. In this paper, we make the first attempt to study subtrajectory query in trillion-scale GPS databases, so as to support applications with urban-scale moving users and weeks-long historical data. We develop SQUID, a distributed subtrajectory query processing engine on Spark, with three technical contributions. First, we propose compact index and storage layers to handle massive trajectory datasets with trillion-scale GPS points. Second, we leverage hybrid partitioning, together with disk-I/O-friendly local indexes, to facilitate pruning. Third, we devise a novel filter-and-refine query processing framework that effectively reduces the number of trajectories requiring verification. Our experiments are conducted on huge trajectory datasets with up to 520 billion GPS points. The results validate the compactness of the storage mechanism and the scalability of the distributed query processing framework.
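The filter-and-refine idea named in the abstract can be illustrated with a minimal single-machine sketch. This is not SQUID's algorithm: the abstract does not define the similarity measure, index, or partitioning scheme, so everything below is an assumption made for illustration. In particular, a "match" is assumed to mean a contiguous window of a data trajectory whose points each lie within Euclidean distance `eps` of the corresponding query point, and the filter is a plain bounding-box test. The names `mbr`, `may_contain`, `matches_window`, and `subtrajectory_query` are hypothetical.

```python
from math import hypot


def mbr(points):
    """Minimum bounding rectangle (min_x, min_y, max_x, max_y) of a point list."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))


def may_contain(traj_mbr, query_mbr, eps):
    """Filter step (assumed): the query MBR must fit inside the trajectory MBR
    expanded by eps. Any trajectory with a matching window passes this test,
    so the filter never discards a true result."""
    tx0, ty0, tx1, ty1 = traj_mbr
    qx0, qy0, qx1, qy1 = query_mbr
    return (tx0 - eps <= qx0 and ty0 - eps <= qy0
            and qx1 <= tx1 + eps and qy1 <= ty1 + eps)


def matches_window(window, query, eps):
    """Refine step (assumed measure): every aligned point pair is within eps."""
    return all(hypot(a[0] - b[0], a[1] - b[1]) <= eps
               for a, b in zip(window, query))


def subtrajectory_query(database, query, eps):
    """Return (trajectory_id, start_index) of the first matching window
    for each trajectory that survives the filter."""
    q_mbr = mbr(query)
    m = len(query)
    results = []
    for traj_id, traj in database.items():
        # Filter: cheap checks that prune trajectories before verification.
        if len(traj) < m or not may_contain(mbr(traj), q_mbr, eps):
            continue
        # Refine: exact verification over every contiguous window.
        for start in range(len(traj) - m + 1):
            if matches_window(traj[start:start + m], query, eps):
                results.append((traj_id, start))
                break
    return results


# Toy data, invented for illustration only.
database = {
    "a": [(0, 0), (1, 0), (2, 0), (3, 0)],
    "b": [(10, 10), (11, 10), (12, 10)],
    "c": [(0, 5), (1, 5)],
}
query = [(1, 0.1), (2, 0.1)]
print(subtrajectory_query(database, query, eps=0.2))
```

In this sketch the filter touches only one bounding box per trajectory, while the refine step scans every window, so the cost saved grows with the share of trajectories pruned. A distributed engine would presumably apply the filter through partition-level and local indexes rather than a linear scan.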
Keywords
Subtrajectory query, Subtrajectory join, Distributed computing, Trillion-scale GPS databases