Crowdsourcing Human Annotation on Web Page Structure: Infrastructure Design and Behavior-Based Quality Control.

ACM TIST (2016)

Abstract
Parsing the semantic structure of a web page is a key component of web information extraction. Successful extraction algorithms usually require large-scale training and evaluation datasets, which are difficult to acquire. Recently, crowdsourcing has proven to be an effective method of collecting large-scale training data in domains that do not require much domain knowledge. For more complex domains, researchers have proposed sophisticated quality control mechanisms that replicate tasks in parallel or sequentially and then aggregate the responses from multiple workers. Conventional annotation integration methods often put more trust in workers with high historical performance; thus, they are called performance-based methods. Recently, Rzeszotarski and Kittur have demonstrated that behavioral features are also highly correlated with annotation quality in several crowdsourcing applications. In this article, we present a new crowdsourcing system, called Wernicke, to provide annotations for web information extraction. Wernicke collects a wide set of behavioral features and, based on these features, predicts annotation quality for a challenging task domain: annotating web page structure. We evaluate the effectiveness of quality control using behavioral features through a case study in which 32 workers annotate 200 Q&A web pages from five popular websites. In doing so, we discover several things: (1) Many behavioral features are significant predictors of crowdsourcing quality. (2) The behavioral-feature-based method outperforms performance-based methods in recall prediction, while performing equally well in precision prediction. In addition, using behavioral features is less vulnerable to the cold-start problem, and the corresponding prediction model generalizes better for recall than for precision in cross-website quality analysis. (3) One can effectively combine workers’ behavioral information and historical performance information to further reduce prediction errors.
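The contrast between performance-based and behavior-based quality prediction can be illustrated with a minimal Python sketch. It is not Wernicke's implementation: the function names, the single behavioral feature (seconds spent on a page), and the one-variable least-squares model are all assumptions chosen for illustration, since the abstract does not specify the features or the model.

```python
from statistics import mean


def precision_recall(annotated, gold):
    """Precision and recall of one worker's annotated elements against a gold set."""
    annotated, gold = set(annotated), set(gold)
    true_positives = len(annotated & gold)
    precision = true_positives / len(annotated) if annotated else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


def performance_based(history):
    """Predict quality as the worker's mean past score.

    Returns None for a new worker with no history (the cold-start case).
    """
    return mean(history) if history else None


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x with one feature (illustrative only)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def behavior_based(model, feature):
    """Predict quality from a behavioral feature, clamped to the range [0, 1]."""
    intercept, slope = model
    return min(1.0, max(0.0, intercept + slope * feature))


# Hypothetical training data: seconds spent on a page vs. observed recall.
model = fit_line([10, 20, 30], [0.2, 0.4, 0.6])

# A new worker has no history, so only the behavior-based predictor applies.
new_worker_history = []
print(performance_based(new_worker_history))  # no prediction available
print(behavior_based(model, 25))              # prediction from behavior alone
```

In this sketch the behavior-based predictor needs only the current task's interaction trace, which is why it can score a worker who has no track record. A combined predictor, as in finding (3), would use both the historical mean and the behavioral features as inputs to one model.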
Keywords
Design, Algorithms, Performance, Crowdsourcing, quality control, behavioral features, worker performance