CoCoFuzzing: Testing Neural Code Models With Coverage-Guided Fuzzing
IEEE Transactions on Reliability (2023)
Abstract
Deep learning (DL)-based code processing models have demonstrated good performance on tasks such as method name prediction, program summarization, and comment generation. Despite these advances, however, DL models are frequently susceptible to adversarial attacks, which threaten their robustness and generalizability by causing them to misclassify unexpected inputs. Numerous DL testing approaches have been proposed to address this issue, but they primarily target DL applications in domains such as image, audio, and text analysis, and cannot be directly applied to neural models for code due to the unique properties of programs. In this article, we propose a coverage-based fuzzing framework, CoCoFuzzing, for testing DL-based code processing models. In particular, we first propose 10 mutation operators to automatically generate valid and semantically preserving source code examples as tests, followed by a neuron coverage (NC)-based approach for guiding test generation. We evaluate the performance of CoCoFuzzing on three state-of-the-art neural code models: NeuralCodeSum, CODE2SEQ, and CODE2VEC. Our experimental results indicate that CoCoFuzzing can generate valid and semantically preserving source code examples that test the robustness and generalizability of these models and enhance NC. Furthermore, these tests can be used for adversarial retraining to improve the performance of neural code models.
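To make the two core ideas concrete, the following is a minimal sketch, not the paper's implementation: a hypothetical semantics-preserving mutation operator (dead-code insertion, one of the kinds of operators CoCoFuzzing describes) and the standard neuron coverage metric (fraction of neurons whose activation exceeds a threshold). All function names and parameters here are illustrative assumptions.

```python
import random

def insert_dead_code(method_lines, seed=0):
    """Hypothetical semantics-preserving mutation operator: insert an
    unused local variable declaration at a random statement boundary
    inside the method body. The program's behavior is unchanged, but the
    token sequence seen by a neural code model differs."""
    rng = random.Random(seed)
    dead_stmt = "int unused_%d = 0;" % rng.randint(0, 9999)
    # Insert after the method signature (line 0) and before the last line.
    pos = rng.randrange(1, len(method_lines))
    return method_lines[:pos] + [dead_stmt] + method_lines[pos:]

def neuron_coverage(activations, threshold=0.0):
    """Neuron coverage (NC) in its usual definition: the fraction of
    neurons whose activation on the given input exceeds the threshold.
    `activations` is a list of per-layer activation lists."""
    total = sum(len(layer) for layer in activations)
    covered = sum(1 for layer in activations
                  for a in layer if a > threshold)
    return covered / total if total else 0.0
```

A coverage-guided loop would then keep a mutated test only when it raises NC, e.g. mutate a seed method, run the model, recompute `neuron_coverage`, and retain the mutant if coverage increased.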
Keywords
Code model, deep learning (DL), fuzzy logic, language model, robustness