Truth Discovery In Material Science Databases

DATABASES THEORY AND APPLICATIONS (2015)

Abstract
Instead of performing expensive experiments, it is common in industry to predict important material properties from existing experimental results. Databases of experimental observations are therefore widely used in the field of Material Science Engineering. However, these databases are expected to be noisy, both because they rely on human measurements and because they amalgamate various independent sources (research papers). As a result, different sources can contain conflicting information. In this paper, we introduce a novel truth discovery approach that reduces the amount of noise and filters out the incorrect conflicting information hidden in scientific databases. Our method ranks the data sources by considering the relationships between them, i.e., the amount of conflicting information and the amount of agreement, and then eliminates the conflicting information. The scalable Gaussian process interpolation technique (SGP) is then applied to the cleaned dataset to predict material properties. A comprehensive performance study has been carried out on a real-life scientific database. With our new approach, we are able to substantially improve the accuracy of SGP predictions and provide a more reliable database.
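The abstract does not give the ranking procedure in detail, so the following is only a minimal sketch of the general idea it describes: score each source by how often its values agree or conflict with other sources on the same measurement, then drop values that conflict with the most trusted source. It is not the paper's algorithm. The function names `rank_sources` and `resolve_conflicts`, the relative tolerance `tol`, and the agreement-ratio score are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def _close(v1, v2, tol):
    """Two values 'agree' if they differ by at most tol (relative). Assumed rule."""
    scale = max(abs(v1), abs(v2), 1e-12)
    return abs(v1 - v2) <= tol * scale


def _group_by_key(observations):
    """Group (source, key, value) triples by the measured quantity `key`."""
    by_key = defaultdict(list)
    for source, key, value in observations:
        by_key[key].append((source, value))
    return by_key


def rank_sources(observations, tol=0.05):
    """Score each source as agreements / (agreements + conflicts).

    Hypothetical scoring rule: a source never compared with another
    source receives a neutral score of 0.5.
    """
    agree = defaultdict(int)
    conflict = defaultdict(int)
    for entries in _group_by_key(observations).values():
        for (s1, v1), (s2, v2) in combinations(entries, 2):
            if s1 == s2:
                continue  # a source is not compared with itself
            counter = agree if _close(v1, v2, tol) else conflict
            counter[s1] += 1
            counter[s2] += 1
    scores = {}
    for source in {s for s, _, _ in observations}:
        total = agree[source] + conflict[source]
        scores[source] = agree[source] / total if total else 0.5
    return scores


def resolve_conflicts(observations, scores, tol=0.05):
    """Keep, per key, only values consistent with the top-scoring source."""
    cleaned = []
    for key, entries in _group_by_key(observations).items():
        _, best_value = max(entries, key=lambda e: scores[e[0]])
        for source, value in entries:
            if _close(value, best_value, tol):
                cleaned.append((source, key, value))
    return cleaned


# Toy data: sources A and B mostly agree; C conflicts on measurement k1.
observations = [
    ("A", "k1", 1.00), ("B", "k1", 1.01), ("C", "k1", 1.50),
    ("A", "k2", 2.00), ("B", "k2", 2.01), ("C", "k2", 2.00),
    ("A", "k3", 3.00), ("B", "k3", 3.00),
]
scores = rank_sources(observations)
cleaned = resolve_conflicts(observations, scores)
```

In this toy example, source C ends up with a lower score than A and B, and its conflicting value for `k1` is removed. The cleaned observations would then be passed to the interpolation step, which is SGP in the paper.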
Keywords
Noisy Data, Interpolation Technique, Truth Discovery, Gaussian Process Regression, Training Database