Incremental Relational Topic Model for Duplicate Bug Report Detection

2022 29th Asia-Pacific Software Engineering Conference (APSEC) (2022)

Abstract
In software development, bug fixing is a time-consuming yet unavoidable task. A bug is occasionally reported by more than one reporter, resulting in duplicate bug reports. Detecting duplicate bug reports is crucial because it reduces developers' maintenance effort and provides more information for the bug-fixing process. In this paper, we propose an automatic approach to this problem. In our approach, a bug report is considered a textual document describing one or more technical aspects of a software system, some of which might be erroneously implemented. Reports that describe the same erroneous technical aspects are considered duplicates. We use the Relational Topic Model (RTM), a probabilistic generative topic model, to formulate the probabilistic structure of technical aspects in a collection of bug reports and the duplication indicators among them. Trained on historical data that includes identified duplicate reports, the model can be used to detect other, not-yet-identified duplicates. To support software evolution, we extend RTM into incremental RTM (iRTM), in which the trained model can be updated quickly, without a time-consuming complete re-training, when new reports are filed or additional duplication information becomes available. Our empirical evaluation on large systems shows that iRTM outperforms state-of-the-art approaches, achieving up to 90% top-10 accuracy while updating its internal model up to 8 times faster as new reports arrive.
Keywords
Relational Topic Model, Duplicate Bug Reports
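
The abstract reports results as top-10 accuracy without defining the metric. A minimal sketch follows, assuming the definition commonly used in duplicate-detection studies: the fraction of query reports for which at least one known duplicate appears among the first k ranked candidates. The function name `top_k_accuracy`, the data layout, and the report IDs are hypothetical and not taken from the paper.

```python
from typing import Dict, List, Set


def top_k_accuracy(
    ranked: Dict[str, List[str]],
    duplicates: Dict[str, Set[str]],
    k: int = 10,
) -> float:
    """Fraction of query reports with at least one true duplicate in the top k.

    ranked:     query report ID -> candidate report IDs, best match first
    duplicates: query report ID -> IDs of its known duplicates
    (Both layouts are assumptions made for this illustration.)
    """
    if not ranked:
        return 0.0
    hits = sum(
        1
        for report_id, candidates in ranked.items()
        if duplicates.get(report_id, set()) & set(candidates[:k])
    )
    return hits / len(ranked)


# Invented example data: two query reports with ranked candidate lists.
ranked = {
    "BUG-3": ["BUG-1", "BUG-2"],  # true duplicate ranked first
    "BUG-4": ["BUG-2", "BUG-1"],  # true duplicate ranked second
}
duplicates = {"BUG-3": {"BUG-1"}, "BUG-4": {"BUG-1"}}

print(top_k_accuracy(ranked, duplicates, k=1))
print(top_k_accuracy(ranked, duplicates, k=2))
```

Under this definition, increasing k can only keep the score the same or raise it, which is why results are usually quoted together with the value of k.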