FCCA: Hybrid Code Representation for Functional Clone Detection Using Attention Networks

IEEE Transactions on Reliability(2021)

引用 45|浏览36
暂无评分
摘要
Code cloning, which reuses a fragment of source code via copy-and-paste with or without modifications, is a common way for code reuse and software prototyping. However, the duplicated code fragments often affect software quality, resulting in high maintenance cost. The existing clone detectors using shallow textual or syntactical features to identify code similarity are still ineffective in accurately finding sophisticated functional code clones in real-world code bases. This article proposes functional code clone detector using attention (FCCA), a deep-learning-based code clone detection approach on top of a hybrid code representation by preserving multiple code features, including unstructured (code in the form of sequential tokens) and structured (code in the form of abstract syntax trees and control-flow graphs) information. Multiple code features are fused into a hybrid representation, which is equipped with an attention mechanism that pays attention to important code parts and features that contribute to the final detection accuracy. We have implemented and evaluated FCCA using 275 777 real-world code clone pairs written in Java. The experimental results show that FCCA outperforms several state-of-the-art approaches for detecting functional code clones in terms of accuracy, recall, and F1 score.
更多
查看译文
关键词
Attention mechanism,code clone detection,code representation,deep neural network (DNN)
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要