TestAug: A Framework for Augmenting Capability-based NLP Tests.

International Conference on Computational Linguistics (2022)

Abstract
The recently proposed capability-based NLP testing allows model developers to test the functional capabilities of NLP models, revealing functional failures in models with good held-out evaluation scores. However, existing work on capability-based testing requires the developer to compose each individual test template from scratch. Such an approach requires extensive manual effort and is less scalable. In this paper, we investigate a different approach that requires the developer to annotate only a few test templates, while leveraging the GPT-3 engine to generate the majority of test cases. While our approach saves manual effort by design, it guarantees the correctness of the generated suites with a validity checker. Moreover, our experimental results show that the test suites generated by GPT-3 are more diverse than the manually created ones; they also detect more errors than their manually created counterparts. Our test suites can be downloaded at https://anonymous-researcher-nlp.github.io/testaug/.
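The generate-then-filter idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the function `augment_suite`, the toy generator, and the toy validity rule are all hypothetical stand-ins, and a real setup would replace them with a call to a language model such as GPT-3 and with the paper's validity checker.

```python
from typing import Callable, Iterable, List


def augment_suite(
    seeds: List[str],
    generate: Callable[[List[str]], Iterable[str]],
    is_valid: Callable[[str], bool],
) -> List[str]:
    """Return the seed cases plus generated candidates that pass the validity check."""
    suite = list(seeds)
    seen = set(seeds)
    for candidate in generate(seeds):
        # Keep a candidate only if it is new and the checker accepts it.
        if candidate not in seen and is_valid(candidate):
            seen.add(candidate)
            suite.append(candidate)
    return suite


# Hypothetical stand-in for a language-model call: returns fixed candidates.
def toy_generate(seeds: List[str]) -> List[str]:
    return [
        "I did not like the food.",
        "the food",
        "I did not like the food.",  # duplicate, should be dropped
        "I do not enjoy the service.",
    ]


# Hypothetical rule for a negation capability: the case must contain the word "not".
def toy_is_valid(case: str) -> bool:
    return " not " in f" {case} "


suite = augment_suite(["I do not like the movie."], toy_generate, toy_is_valid)
print(suite)
```

In this toy run, one manually written seed is extended with two generated cases, while the duplicate and the candidate that fails the rule are discarded. Passing the generator and checker as arguments keeps the filtering loop independent of any particular model or checking strategy.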
Keywords
testaug,tests,capability-based