A comparative study of cluster-based Big Data Cube implementations

André Francisco Morielo Caetano,Celso Massaki Hirata,Rodrigo Rocha Silva

Future Generation Computer Systems(2022)

引用 0|浏览26
暂无评分
摘要
Research on Data Cubes scalability is extensive, yet sparse. Scalable design patterns for Data Cube implementations are a trend as the technology shifts from centralized and fully materialized models to distributed and partially materialized ones. The implementations explore cheaper and upgraded hardware in clusters of computer nodes. It is a common understanding that the parallel and distributed hardware enables to handle large amounts of multidimensional data for online analytical processing, up to billions of tuples or more, with increased performance and fault tolerance. However, the number of research works and their heterogeneity may overwhelm new initiatives in this field, as there is little discussion regarding the state-of-the-art and ways for improvement. Moreover, the baseline for comparison in most works is often too limited and requires that the reader crosscheck the information among many articles to identify possible gaps. In order to help identifying these gaps, we analyzed the works on Data Cube scalability and elaborated a comparative study that provides directions for new research on the parallel and distributed implementations of data cubes. We identified some features for comparison that include cube function, implementation technology, cube storage type, and various experiments information. We expect that the features and comparisons help researchers to identify research gaps and pave ways for future works on the field.
更多
查看译文
关键词
Datacube,OLAP,Cloud,Big Data,Survey,Distributed,Parallel
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要