A Comparative Study of Log-Structured Merge-Tree-Based Spatial Indexes for Big Data

2017 IEEE 33rd International Conference on Data Engineering (ICDE) (2017)

Abstract
The proliferation of GPS-enabled mobile devices has generated geo-tagged data at an unprecedented rate over the past decade. Data-processing systems that aim to ingest, store, index, and analyze Big Data must deal with such geo-tagged data efficiently. In this paper, we take representative disk-resident spatial indexing methods that have been adopted by major SQL and NoSQL systems and implement five variants of them as Log-Structured Merge-tree-based (LSM) spatial indexes, in order to evaluate their pros and cons for dynamic geo-tagged Big Data. We have implemented the alternatives, including LSM-based B-tree, R-tree, and inverted index variants, in Apache AsterixDB, an open source Big Data management system. This implementation enables comparison in terms of real end-to-end performance, including logging and locking overheads, in a full-function, query-based system setting. Our evaluation includes both static and dynamic workloads, ranging from a "load once, query many" case to a case where continuous concurrent incremental inserts are mixed with concurrent queries. Based on the results, we discuss the pros and cons of the five index variants.
Keywords
log-structured merge-tree-based spatial indexes, GPS-enabled mobile device proliferation, data-processing systems, Big Data analysis, Big Data indexing, Big Data storage, Big Data ingestion, disk-resident spatial indexing methods, NoSQL system, LSM-spatial indexes, dynamic geo-tagged Big Data, LSM-based B-tree index, R-tree index, inverted index, Apache AsterixDB, open source Big Data management system, logging overhead, locking overhead, full-function-query-based system, static workloads, dynamic workloads, load once-query many method, continuous concurrent incremental inserts, concurrent queries