Investigating the Effectiveness of Clustering for Story Point Estimation

2022 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), 2022

Abstract
Automated techniques to estimate Story Points (SP) for user stories in agile software development came to the fore a decade ago. Yet the accuracy of state-of-the-art estimation techniques still has room for improvement. In this paper, we present a new approach for SP estimation, based on analysing textual features of software issues by employing latent Dirichlet allocation (LDA) and clustering. We first use LDA to represent issue reports in a new space of generated topics. We then use hierarchical clustering to agglomerate issues into clusters based on their topic similarities. Next, we build an estimation model from the issues in each cluster. Finally, we find the cluster closest to a new incoming issue and use that cluster's model to estimate its SP. Our approach is evaluated on a dataset of 26 open source projects with a total of 31,960 issues and compared against both baselines and state-of-the-art SP estimation techniques. The results show that the estimation performance of our proposed approach is as good as the state-of-the-art. However, none of these approaches is statistically significantly better than more naive estimators in all cases, so their additional complexity is not justified. We therefore encourage future work to develop alternative strategies for story point estimation. The experimental data and scripts we used in this work are publicly available to allow for replication and extension.
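
The pipeline described in the abstract (topic representation, hierarchical clustering, one model per cluster, nearest-cluster lookup) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the LDA step has already produced a topic-proportion vector for each issue, uses a naive centroid-linkage merge and Euclidean distance as the clustering rule, and uses the cluster median as the per-cluster estimation model; the abstract specifies none of these choices. All function names and the toy data are hypothetical.

```python
from statistics import median


def distance(a, b):
    """Euclidean distance between two topic-proportion vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def centroid(vectors):
    """Component-wise mean of a list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def agglomerate(vectors, n_clusters):
    """Naive agglomerative clustering (centroid linkage, an assumption).

    Starts with one cluster per issue and repeatedly merges the two
    clusters whose centroids are closest until n_clusters remain.
    Returns a list of clusters, each a list of issue indices.
    """
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > n_clusters:
        best = None  # (distance, index_i, index_j) of the closest pair
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                ci = centroid([vectors[k] for k in clusters[i]])
                cj = centroid([vectors[k] for k in clusters[j]])
                d = distance(ci, cj)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because j > i
    return clusters


def fit(vectors, story_points, n_clusters):
    """Build one (centroid, median SP) pair per cluster.

    The median is a stand-in for the per-cluster estimation model.
    """
    clusters = agglomerate(vectors, n_clusters)
    return [
        (
            centroid([vectors[k] for k in members]),
            median(story_points[k] for k in members),
        )
        for members in clusters
    ]


def estimate(model, new_vector):
    """Return the SP estimate of the cluster closest to the new issue."""
    _, sp = min(model, key=lambda entry: distance(entry[0], new_vector))
    return sp


# Toy data, purely illustrative: two-topic vectors and their story points.
topics = [
    [0.90, 0.10], [0.80, 0.20], [0.85, 0.15],
    [0.10, 0.90], [0.20, 0.80], [0.15, 0.85],
]
points = [1, 2, 3, 8, 13, 8]

model = fit(topics, points, n_clusters=2)
print(estimate(model, [0.82, 0.18]))  # issue resembling the first group
print(estimate(model, [0.05, 0.95]))  # issue resembling the second group
```

In a realistic setting the topic vectors would come from an LDA model fitted on the issue text, and the clustering would typically be delegated to a library rather than the quadratic loop shown here.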
Keywords
Software Effort Estimation,Story Point Estimation,Latent Dirichlet Allocation,Hierarchical Clustering