Multi-label and Multi-target Sampling of Machine Annotation for Computational Stance Detection.
CoRR (2023)
Abstract
Data collection from manual labeling provides domain-specific and
task-aligned supervision for data-driven approaches, and a critical mass of
well-annotated resources is required to achieve reasonable performance in
natural language processing tasks. However, manual annotation is often
challenging to scale up in terms of time and budget, especially when the task
requires domain knowledge, attention to subtle semantic features, and reasoning steps.
In this paper, we investigate the efficacy of leveraging large language models
on automated labeling for computational stance detection. We empirically
observe that while large language models show strong potential as an
alternative to human annotators, their sensitivity to task-specific
instructions and their intrinsic biases pose intriguing yet unique challenges
in machine annotation. We introduce a multi-label and multi-target sampling
strategy to optimize the annotation quality. Experimental results on the
benchmark stance detection corpora show that our method can significantly
improve performance and learning efficacy.
Keywords
machine annotation, detection, multi-label, multi-target