MapReduce-Based D_ELT Framework to Address the Challenges of Geospatial Big Data

ISPRS International Journal of Geo-Information (2019)

Abstract
The conventional extracting-transforming-loading (ETL) system typically runs on a single machine and cannot handle huge volumes of geospatial big data. To deal with the considerable amount of big data in the ETL process, we propose D_ELT (delayed extracting-loading-transforming), which utilizes MapReduce-based parallelization. Among the various kinds of big data, we concentrate on geospatial big data generated by sensors using Internet of Things (IoT) technology. In the IoT environment, the update latency for sensor big data is typically short and old data are not worth further analysis, so the speed of data preparation matters even more. We conducted several experiments measuring the overall performance of D_ELT and compared it with both traditional ETL and extracting-loading-transforming (ELT) systems, using different data sizes and different levels of analysis complexity. The experimental results show that D_ELT outperforms both ETL and ELT. In addition, the larger the amount of data or the higher the complexity of the analysis, the greater the parallelization effect of the transform step in D_ELT, leading to better performance than the traditional ETL and ELT approaches.
Keywords
ETL, ELT, big data, sensor data, IoT, geospatial big data, MapReduce
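
The abstract describes D_ELT only at a high level, so the following is a minimal sketch of one plausible reading, not the authors' implementation. It assumes raw sensor lines are loaded untransformed and that the transform is delayed until analysis time, where it runs inside the map step of a MapReduce-style job. The record layout (`sensor_id,lat,lon,temperature`), the sample data, and all function names are hypothetical. The job is simulated sequentially in plain Python; on a real cluster each split would be handled by a separate map task in parallel.

```python
from collections import defaultdict

# Hypothetical raw sensor records, loaded as-is (no transform at load time).
RAW_RECORDS = [
    "s1,37.50,127.00,21.5",
    "s2,35.10,129.00,19.0",
    "s1,37.50,127.00,22.5",
    "bad-line",  # malformed input that the delayed transform must skip
]


def transform(line):
    """Parse one raw line into (sensor_id, temperature), or None if malformed."""
    parts = line.split(",")
    if len(parts) != 4:
        return None
    sensor_id, _lat, _lon, temp = parts
    try:
        return sensor_id, float(temp)
    except ValueError:
        return None


def map_phase(lines):
    """Delayed transform: applied inside the map step, at analysis time."""
    for line in lines:
        record = transform(line)
        if record is not None:
            yield record


def reduce_phase(pairs):
    """Group values by sensor id and compute the mean temperature per sensor."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return {key: sum(vals) / len(vals) for key, vals in groups.items()}


def run_job(splits):
    """Run the map step over each input split, then reduce the combined output."""
    mapped = []
    for split in splits:  # each split would be an independent map task
        mapped.extend(map_phase(split))
    return reduce_phase(mapped)


# Two input splits stand in for blocks of a distributed file.
SPLITS = [RAW_RECORDS[:2], RAW_RECORDS[2:]]
RESULT = run_job(SPLITS)
print(RESULT)
```

Because `transform` touches one record at a time and keeps no shared state, the map step can be spread across as many splits as there are workers, which is consistent with the abstract's observation that the benefit grows with data size and analysis complexity.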