CSEPrompts: A Benchmark of Introductory Computer Science Prompts
arxiv(2024)
摘要
Recent advances in AI, machine learning, and NLP have led to the development
of a new generation of Large Language Models (LLMs) that are trained on massive
amounts of data and often have trillions of parameters. Commercial applications
(e.g., ChatGPT) have made this technology available to the general public, thus
making it possible to use LLMs to produce high-quality texts for academic and
professional purposes. Schools and universities are aware of the increasing use
of AI-generated content by students and they have been researching the impact
of this new technology and its potential misuse. Educational programs in
Computer Science (CS) and related fields are particularly affected because LLMs
are also capable of generating programming code in various programming
languages. To help understand the potential impact of publicly available LLMs
in CS education, we introduce CSEPrompts, a framework with hundreds of
programming exercise prompts and multiple-choice questions retrieved from
introductory CS and programming courses. We also provide experimental results
on CSEPrompts to evaluate the performance of several LLMs with respect to
generating Python code and answering basic computer science and programming
questions.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要