Active² Learning: Actively reducing redundancies in Active Learning methods for Sequence Tagging and Machine Translation

arXiv (2021)

Abstract
While deep learning is a powerful tool for natural language processing (NLP) problems, successful solutions to these problems rely heavily on large amounts of annotated samples. However, manually annotating data is expensive and time-consuming. Active Learning (AL) strategies reduce the need for huge volumes of labeled data by iteratively selecting a small number of examples for manual annotation based on their estimated utility in training the given model. In this paper, we argue that since AL strategies choose examples independently, they may potentially select similar examples, all of which may not contribute significantly to the learning process. Our proposed approach, Active² Learning (A²L), actively adapts to the deep learning model being trained to eliminate such redundant examples chosen by an AL strategy. We show that A²L is widely applicable by using it in conjunction with several different AL strategies and NLP tasks. We empirically demonstrate that the proposed approach is able to further reduce the data requirements of state-of-the-art AL strategies by approximately 3-25% on an absolute scale on multiple NLP tasks while achieving the same performance with virtually no additional computation overhead.
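The redundancy-elimination idea in the abstract can be illustrated with a minimal sketch: rank candidates by an AL utility score, then greedily drop any candidate whose representation is too close to one already picked. The function name select_batch, the cosine-similarity criterion, and the 0.9 threshold below are illustrative assumptions on my part; the paper itself adapts the redundancy measure to the deep learning model being trained rather than using a fixed threshold.

```python
import numpy as np

def select_batch(utility, embeddings, batch_size, sim_threshold=0.9):
    """Greedily pick the highest-utility candidates, skipping any whose
    embedding is too similar (by cosine) to an already-selected example.

    Assumed inputs: utility scores from some AL strategy (e.g. model
    uncertainty) and per-example embeddings from the model being trained.
    """
    # Normalize rows so a dot product equals cosine similarity.
    normed = embeddings / (np.linalg.norm(embeddings, axis=1, keepdims=True) + 1e-12)
    selected = []
    for i in np.argsort(-utility):  # candidates ranked by AL utility
        if all(float(normed[i] @ normed[j]) <= sim_threshold for j in selected):
            selected.append(int(i))  # keep only non-redundant candidates
        if len(selected) == batch_size:
            break
    return selected

# Toy usage: 5 candidates, one of which is a near-duplicate of another.
rng = np.random.default_rng(0)
emb = rng.normal(size=(5, 8))
emb[3] = emb[0] + 0.01 * rng.normal(size=8)      # near-duplicate of example 0
utility = np.array([0.9, 0.2, 0.5, 0.88, 0.7])   # e.g. predictive uncertainty
print(select_batch(utility, emb, batch_size=3))  # example 3 is filtered out
```

Greedy filtering adds only O(batch_size × candidates) similarity checks per round, which is consistent with the abstract's claim of virtually no additional computation overhead.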
Keywords
sequence tagging, active² learning methods, active² learning, machine translation