Fahes: Detecting Disguised Missing Values

2018 IEEE 34TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE)(2018)

引用 12|浏览83
暂无评分
摘要
It is well established that missing values, if not dealt with properly, may lead to poor data analytics models, misleading conclusions, and limitation in the generalization of findings. A key challenge in detecting these missing values is when they manifest themselves in a form that is otherwise valid, making it hard to distinguish them from other legitimate values. We propose to demonstrate FAHES, a system for detecting different types of disguised missing values (DMVs) which often occur in real world data. FAHES consists of several components, namely a profiler to generate rules for detecting repeated patterns, an outlier detection module, and a module to detect values that are used repeatedly in random records. Using several real world datasets, we will demonstrate how FAHES can easily catch DMVs.
更多
查看译文
关键词
disguised missing values
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要