Evaluating classifiers in SE research: the ECSER pipeline and two replication studies

EMPIRICAL SOFTWARE ENGINEERING(2022)

引用 3|浏览10
暂无评分
摘要
Context Automated classifiers, often based on machine learning (ML), are increasingly used in software engineering (SE) for labelling previously unseen SE data. Researchers have proposed automated classifiers that predict if a code chunk is a clone, if a requirement is functional or non-functional, if the outcome of a test case is non-deterministic, etc. Objective The lack of guidelines for applying and reporting classification techniques for SE research leads to studies in which important research steps may be skipped, key findings might not be identified and shared, and the readers may find reported results (e.g., precision or recall above 90%) that are not a credible representation of the performance in operational contexts. The goal of this paper is to advance ML4SE research by proposing rigorous ways of conducting and reporting research. Results We introduce the ECSER (Evaluating Classifiers in Software Engineering Research) pipeline, which includes a series of steps for conducting and evaluating automated classification research in SE. Then, we conduct two replication studies where we apply ECSER to recent research in requirements engineering and in software testing. Conclusions In addition to demonstrating the applicability of the pipeline, the replication studies demonstrate ECSER ’s usefulness: not only do we confirm and strengthen some findings identified by the original authors, but we also discover additional ones. Some of these findings contradict the original ones.
更多
查看译文
关键词
Automated classification,Machine learning,Software engineering,Replication study
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要