Survey on Preprocessing Techniques for Big Data Projects

Engineering Proceedings (2021)

Abstract
In the era of big data, vast amounts of data are being produced. This raises two main issues when trying to discover knowledge from these data: much of the information is not relevant to the problem to be solved, and the data contain many imperfections and errors. Preprocessing is therefore a key step before applying any kind of learning algorithm. Two of the main preprocessing techniques are feature selection, which reduces the number of features to a relevant subset, and discretisation, which reduces the possible values of continuous variables. This paper reviews different methods for these two steps, focusing on the big data context and giving examples of projects where they have been applied.
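
As a rough illustration of the two techniques named in the abstract, the sketch below applies a variance-threshold filter (one very simple form of feature selection) and equal-width binning (one very simple form of discretisation) to a toy table. These particular methods, the function names, and the data are assumptions chosen for brevity; they are not necessarily the methods the survey covers, and the code makes no attempt at big data scale.

```python
from statistics import pvariance


def equal_width_bins(values, n_bins):
    """Map each continuous value to a bin index in [0, n_bins - 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: everything falls in a single bin.
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # min(...) keeps the maximum value inside the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def select_by_variance(columns, threshold):
    """Return names of columns whose population variance exceeds threshold."""
    return [name for name, col in columns.items() if pvariance(col) > threshold]


# Hypothetical toy data: one column carries no information at all.
data = {
    "age": [22.0, 35.0, 47.0, 58.0, 63.0],
    "constant": [1.0, 1.0, 1.0, 1.0, 1.0],
    "income": [18.0, 32.0, 41.0, 75.0, 90.0],
}

selected = select_by_variance(data, threshold=0.0)   # drops "constant"
binned_age = equal_width_bins(data["age"], n_bins=3)  # three age groups
```

The filter here ignores the target variable entirely, so it can only remove uninformative columns; supervised selection methods would also need the labels.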