Use of Apache Flume in the Big Data Environment for Processing and Evaluation of the Data Quality of the Twitter Social Network

INFORMATION AND COMMUNICATION TECHNOLOGIES OF ECUADOR (TIC.EC)(2019)

引用 1|浏览1
暂无评分
摘要
The present work uses Hadoop as the core processing in the Big Data environment. There are several open sources tools from the Hadoop ecosystem that facilitate the processing of enormous volumes of data. In this paper, we have worked with Apache Flume and Apache Hive tools for the study case of the 2017 presidential elections in Ecuador. The analysis of data generated from Twitter social network focuses mainly in the first round of balloting of Ecuador's 2017 presidential election. These generated data have been obtained, stored, processed and analyzed to comply with the characteristics of the information that is considered Big Data. The selected tools have been evaluated in their architecture, installation, and use. Finally, the data have been evaluated under certain quality criteria or dimensions.
更多
查看译文
关键词
Big Data,Twitter,Apache Flume
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要