Understanding Large Language Model Based Fuzz Driver Generation

CoRR(2023)

引用 0|浏览52
暂无评分
摘要
Fuzz drivers are a necessary component of API fuzzing. However, automatically generating correct and robust fuzz drivers is a difficult task. Compared to existing approaches, LLM-based (Large Language Model) generation is a promising direction due to its ability to operate with low requirements on consumer programs, leverage multiple dimensions of API usage information, and generate human-friendly output code. Nonetheless, the challenges and effectiveness of LLM-based fuzz driver generation remain unclear. To address this, we conducted a study on the effects, challenges, and techniques of LLM-based fuzz driver generation. Our study involved building a quiz with 86 fuzz driver generation questions from 30 popular C projects, constructing precise effectiveness validation criteria for each question, and developing a framework for semi-automated evaluation. We designed five query strategies, evaluated 36,506 generated fuzz drivers. Furthermore, the drivers were compared with manually written ones to obtain practical insights. Our evaluation revealed that: while the overall performance was promising (passing 91% of questions), there were still practical challenges in filtering out the ineffective fuzz drivers for large scale application; basic strategies achieved a decent correctness rate (53%), but struggled with complex API-specific usage questions. In such cases, example code snippets and iterative queries proved helpful; while LLM-generated drivers showed competent fuzzing outcomes compared to manually written ones, there was still significant room for improvement, such as incorporating semantic oracles for logical bugs detection.
更多
查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要