Automated data verification in a format-free environment

ACM SIGSOFT Software Engineering Notes (2006)

Abstract
Data collection and interpretation are vital for innumerable purposes, both commercial and academic. Sifting through vast mountains of data to separate correct information from incorrect can be expensive in terms of both money and time. Automating as much of this process as possible is the key to collecting useful information in an efficient and timely manner. This paper discusses a system designed to automate the comparison of raw collected data to a store of previously verified data. This comparison can be used to estimate both the accuracy and the value of the collected data. In addition, it makes it possible to gauge the efficacy of various collection methods. In this system, special attention was paid to accepting a wide range of document formats and to properly handling data sets whose attribute types might be organized differently from those in the reference data.
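
The comparison described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation, which the abstract does not detail. It assumes that records have already been parsed into dictionaries, that a hand-written attribute map (hypothetical) renames collected attributes to the reference schema, and that one attribute serves as a lookup key. The names `normalize`, `align_record`, and `estimate_accuracy` are illustrative only.

```python
from typing import Dict, List

Record = Dict[str, str]


def normalize(value: str) -> str:
    """Lower-case a value and collapse whitespace so trivial differences are ignored."""
    return " ".join(value.strip().lower().split())


def align_record(record: Record, attribute_map: Dict[str, str]) -> Record:
    """Rename collected attributes to reference names; unmapped attributes are dropped."""
    return {attribute_map[k]: v for k, v in record.items() if k in attribute_map}


def estimate_accuracy(
    collected: List[Record],
    reference: List[Record],
    attribute_map: Dict[str, str],
    key: str,
) -> float:
    """Return the fraction of comparable attribute values that agree with the reference.

    Assumption: collected records with no matching reference key are skipped,
    since they can be neither confirmed nor refuted.
    """
    ref_index = {normalize(r[key]): r for r in reference}
    checked = 0
    matched = 0
    for raw in collected:
        rec = align_record(raw, attribute_map)
        ref = ref_index.get(normalize(rec.get(key, "")))
        if ref is None:
            continue
        for attr, value in rec.items():
            if attr == key or attr not in ref:
                continue
            checked += 1
            if normalize(value) == normalize(ref[attr]):
                matched += 1
    return matched / checked if checked else 0.0


if __name__ == "__main__":
    # Made-up example data, for illustration only.
    reference = [{"name": "Ada Lovelace", "city": "London", "year": "1815"}]
    collected = [{"Full Name": " ada lovelace", "Town": "london", "Born": "1816"}]
    mapping = {"Full Name": "name", "Town": "city", "Born": "year"}
    print(estimate_accuracy(collected, reference, mapping, key="name"))
```

Running the same function over data gathered by different collection methods would give one rough way to compare their efficacy, under the same assumptions.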
Keywords
data collection, data mining, automated data verification, mining methods and algorithms, attribute type, useful information, various collection method, system special attention, document analysis, reference data, verification, format-free environment, correct information, document format, innumerable purpose, system design