GENUS: An ETL tool treating the Big Data Variety

Salwa Souissi, Mounir BenAyed

2016 IEEE/ACS 13th International Conference of Computer Systems and Applications (AICCSA), 2016

Abstract
The data warehouse is the most important component supplying a Business Intelligence system and sits at the core of the Decision Support System. It integrates data from different sources, often scattered and heterogeneous, to help managers in their decision-making. Building a data warehouse therefore requires executing the Extraction-Transformation-Load (ETL) process. In recent years, ETL has been affected by the emergence of Big Data. Some ETL studies have addressed this type of data set in terms of its three Vs, namely Volume, Velocity, and Variety. However, these studies do not treat the Variety of data types. In this paper, we therefore introduce a new ETL tool that treats this aspect. GENUS, our proposed tool, extracts data from different document types (text, image, and video), transforms them, and loads them into a document data warehouse. GENUS is implemented and validated in a commercial case study.
Keywords
ETL tool, Big Data, extraction-transformation-load process, volume-velocity-variety, GENUS, data extraction, document data warehouse