Human-centred design on crowdsourcing annotation towards improving active learning model performance

JOURNAL OF INFORMATION SCIENCE (2023)

Abstract
Active learning in machine learning is an effective approach to reducing the cost of human effort for generating labels. The iterative process of active learning involves a human annotation step, during which crowdsourcing can be leveraged. For organisations adopting the active learning method, obtaining high model performance is essential. This study aims to identify effective crowdsourcing interaction designs that promote the quality of human annotations and, in turn, the performance of natural language processing (NLP)-based machine learning models. Specifically, the study experimented with four human-centred design techniques: highlight, guidelines, validation and text amount. Based on different combinations of these four design elements, the study developed 15 annotation interfaces and recruited crowd workers to annotate texts with them. The annotated data from each design were used separately to iteratively train a machine learning model. The results show that the highlight and guideline techniques play an essential role in improving the quality of human labels and therefore the performance of active learning models, whereas the impact of validation and text amount on model performance is positive in some cases and negative in others. The 'simple' designs (i.e. D1, D2, D7 and D14), which use only a few design techniques, yield the top-performing models. The results provide practical implications for the design of crowdsourcing labelling systems used for active learning.
Keywords
Active learning, annotation cost, crowdsourcing, ground truth labels, human annotations, human-centred design
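For illustration, the following is a minimal sketch of the pool-based active learning loop the abstract describes, in which crowdsourced labels are repeatedly folded into the training set. The paper does not specify the underlying classifier or query strategy; the logistic regression model, TF-IDF features, uncertainty sampling and the crowd_annotate() placeholder (standing in for the 15 annotation interface designs) are all assumptions made for this sketch.

```python
import numpy as np
from scipy.sparse import vstack
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression


def crowd_annotate(texts):
    # Placeholder for the crowdsourcing step: in the study, crowd workers
    # label the queried texts through one of the annotation interfaces.
    return [len(t) % 2 for t in texts]  # dummy labels, illustration only


# Hypothetical labelled seed set and unlabelled pool.
seed_texts, seed_labels = ["great product", "terrible service"], [1, 0]
pool_texts = ["sample document %d" % i for i in range(200)]

vectorizer = TfidfVectorizer()
X_all = vectorizer.fit_transform(seed_texts + pool_texts)
X_labelled, y_labelled = X_all[:2], list(seed_labels)
X_pool = X_all[2:]
remaining = list(range(len(pool_texts)))

model = LogisticRegression(max_iter=1000)
for _ in range(5):  # active learning iterations
    model.fit(X_labelled, y_labelled)
    probs = model.predict_proba(X_pool[remaining])
    # Uncertainty sampling: query the texts the model is least confident about.
    uncertainty = 1.0 - probs.max(axis=1)
    picked = [remaining[i] for i in np.argsort(-uncertainty)[:10]]
    labels = crowd_annotate([pool_texts[i] for i in picked])
    # Fold the newly annotated texts into the labelled set and shrink the pool.
    X_labelled = vstack([X_labelled, X_pool[picked]])
    y_labelled.extend(labels)
    remaining = [i for i in remaining if i not in picked]
```

In the study, the quality of the labels returned at the annotation step, and hence the resulting model performance, is what varies across the interface designs; the loop structure itself stays the same.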