Impact of data quality for automatic issue classification using pre-trained language models

Journal of Systems and Software (2024)

Abstract
Issue classification aims to recognize whether an issue reports a bug, requests an enhancement, or asks for support. In this paper, we use pre-trained language models to classify issues automatically and investigate how data quality affects classifier performance. Although we applied several data quality filters, none of them had a significant effect on model quality. As the root cause, we identify a threat to construct validity underlying the issue labeling. Editor's note: Open Science material was validated by the Journal of Systems and Software Open Science Board.
Keywords
GitHub, Issue trackers, Issue labeling, BERT, Model quality, Label correctness