Truth discovery on multi-dimensional properties of data sources

Proceedings of the ACM Turing Celebration Conference - China (2019)

Abstract
In the era of information explosion, data fusion has attracted increasing attention from researchers because it plays an important part in data applications. However, resolving inconsistencies in the information generated by various data sources, i.e., truth discovery, poses great challenges to data fusion. Although existing truth discovery methods mainly focus on source quality, they derive the truth iteratively based only on the accuracy of sources and ignore their recall, which leads to poor precision. Moreover, the assumption that sources are independent no longer matches reality, where sources are correlated in diverse ways. This paper proposes the Gaussian Truth Finder with Correlations (GTFC) algorithm, which iteratively derives the truth together with the accuracy and recall of sources. The empirical results demonstrate that GTFC can significantly outperform state-of-the-art algorithms.
Keywords
data fusion, Gaussian distribution, numeric attributes, source embedding