Just another copy and paste? Comparing the security vulnerabilities of ChatGPT generated code and StackOverflow answers
arXiv (2024)
Abstract
Sonatype's 2023 report found that 97% of developers and security leads
integrate generative Artificial Intelligence (AI), particularly Large Language
Models (LLMs), into their development process. Concerns about the security
implications of this trend have been raised. Developers are now weighing the
benefits and risks of LLMs against other relied-upon information sources, such
as StackOverflow (SO), requiring empirical data to inform their choice. In this
work, our goal is to raise software developers' awareness of the security
implications when selecting code snippets by empirically comparing the
vulnerabilities of ChatGPT and StackOverflow. To achieve this, we used an
existing Java dataset from SO with security-related questions and answers.
Then, we asked ChatGPT the same SO questions, gathering the generated code for
comparison. After curating the dataset, we analyzed the number and types of
Common Weakness Enumeration (CWE) vulnerabilities of 108 snippets from each
platform using CodeQL. ChatGPT-generated code contained 248 vulnerabilities
compared to the 302 vulnerabilities found in SO snippets, producing 20% fewer
vulnerabilities with a statistically significant difference. Additionally,
ChatGPT generated 19 types of CWE, fewer than the 22 found in SO. Our findings
suggest developers are under-educated on insecure code propagation from both
platforms, as we found 274 unique vulnerabilities and 25 types of CWE. Any code
copied and pasted, created by AI or humans, cannot be trusted blindly,
requiring good software engineering practices to reduce risk. Future work can
help minimize insecure code propagation from any platform.