Crowdsourcing-Based Data Extraction From Visualization Charts

2020 IEEE 36th International Conference on Data Engineering (ICDE 2020)

Abstract
Visualization charts are widely used to present structured data. In many cases, people want to explore the data in charts collected from various sources, such as papers and websites, in order to analyze the data further or create new charts. However, existing automatic and semi-automatic approaches are not always effective because charts vary widely. In this paper, we introduce a crowdsourcing approach that leverages human ability to extract data from visualization charts. There are several challenges. The first is how to avoid tedious human interaction with charts and design simple crowdsourcing tasks. The second is how to evaluate worker quality for truth inference, because workers may not only provide inaccurate values but also assign values to the wrong data series. To address these challenges, we design an effective crowdsourcing task scheme that splits a chart into simple micro-tasks. We introduce a novel worker quality model that considers both worker accuracy and task difficulty. We also devise an effective early-stopping mechanism to reduce cost. We have conducted experiments on a real crowdsourcing platform, and the results show that our framework outperforms state-of-the-art approaches in both cost and quality.
Keywords
visualization charts, structured data, semiautomatic approaches, crowdsourcing approach, design simple crowdsourcing tasks, wrong data series, effective crowdsourcing task scheme, data extraction
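
The abstract mentions an early-stopping mechanism for saving cost but does not describe how it works. As a rough illustration of the general idea only, the sketch below stops requesting answers for one micro-task (for example, reading a single data point off a chart) once the collected values agree closely. The agreement rule, the median aggregation, and every name and parameter here (`collect_with_early_stopping`, `min_answers`, `max_answers`, `tolerance`) are assumptions made for this example; they are not the paper's mechanism, and the paper's worker quality model is not represented.

```python
from statistics import median


def agree(values, tolerance):
    """Return True if every value lies within a relative tolerance of the median."""
    m = median(values)
    scale = max(abs(m), 1e-9)  # avoid a zero scale when the median is 0
    return all(abs(v - m) <= tolerance * scale for v in values)


def collect_with_early_stopping(answer_stream, min_answers=3, max_answers=7,
                                tolerance=0.05):
    """Consume worker answers one at a time and stop as soon as they agree.

    Hypothetical stand-in for an early-stopping rule: it returns the median
    of the collected answers and the number of answers that were paid for.
    """
    collected = []
    for value in answer_stream:
        collected.append(value)
        if len(collected) >= max_answers:
            break  # budget for this micro-task is exhausted
        if len(collected) >= min_answers and agree(collected, tolerance):
            break  # answers are consistent, so further answers add little
    return median(collected), len(collected)


if __name__ == "__main__":
    # Three consistent answers arrive first, so the remaining two are never requested.
    estimate, used = collect_with_early_stopping(iter([10.0, 10.2, 9.9, 50.0, 10.1]))
    print(estimate, used)
```

In this sketch, the cost saved is simply the number of answers left unread in the stream. A quality-aware variant would presumably weight each answer by an estimate of the worker's reliability rather than treating all answers equally.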