Comparing Word-Based and AST-Based Models for Design Pattern Recognition
PROMISE 2023: Proceedings of the 19th International Conference on Predictive Models and Data Analytics in Software Engineering(2023)
摘要
Design patterns (DPs) provide reusable and general solutions for frequently encountered problems. Patterns are important to maintain the structure and quality of software products, in particular
in large and distributed systems like automotive software. Modern language models (like Code2Vec or Word2Vec) indicate a deep understanding of programs, which has been shown to help in such
tasks as program repair or program comprehension, and therefore show promise for DPR in industrial contexts. The models are trained in a self-supervised manner, using a large unlabelled code
base, which allows them to quantify such abstract concepts as programming styles, coding guidelines, and, to some extent, the semantics of programs. This study demonstrates how two language
models—Code2Vec and Word2Vec, trained on two public automotive repositories, can show the separation of programs containing specific DPs. The results show that the Code2Vec and Word2Vec
produce average F1-scores of 0.781 and 0.690 on open-source Java
programs, showing promise for DPR in practice.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要