Generative AI to Generate Test Data Generators
CoRR (2024)
Abstract
Generating fake data is an essential dimension of modern software testing, as
demonstrated by the number and significance of data faking libraries. Yet,
developers of faking libraries cannot keep up with the wide range of data to be
generated for different natural languages and domains. In this paper, we assess
the ability of generative AI to generate test data in different domains. We
design three types of prompts for Large Language Models (LLMs), which perform
test data generation tasks at different levels of integrability: 1) raw test
data generation, 2) synthesizing programs in a specific language that generate
useful test data, and 3) producing programs that use state-of-the-art faker
libraries. We evaluate our approach by prompting LLMs to generate test data for
11 domains. The results show that LLMs can successfully generate realistic test
data generators in a wide range of domains at all three levels of
integrability.
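
The three prompt types can be pictured as three templates over the same domain description. The following is a minimal sketch, not the prompts used in the paper: the wording of each template, the names `Level` and `build_prompt`, and the default values for `language`, `library`, and `count` are all assumptions made for illustration.

```python
from enum import Enum


class Level(Enum):
    """The three levels of integrability named in the abstract."""
    RAW_DATA = 1           # ask the LLM for the test data itself
    GENERATOR_PROGRAM = 2  # ask for a program that generates the data
    FAKER_PROGRAM = 3      # ask for a program built on a faker library


def build_prompt(level, domain, language="Python", library="Faker", count=10):
    """Return a prompt string for the given level.

    The template wording is hypothetical; the abstract does not give
    the actual prompts.
    """
    if level is Level.RAW_DATA:
        return f"Generate {count} realistic examples of {domain}, one per line."
    if level is Level.GENERATOR_PROGRAM:
        return (f"Write a {language} program that prints {count} realistic, "
                f"randomly generated examples of {domain}.")
    if level is Level.FAKER_PROGRAM:
        return (f"Write a {language} program that uses the {library} library "
                f"to generate {count} realistic examples of {domain}.")
    raise ValueError(f"unknown level: {level!r}")


if __name__ == "__main__":
    # Print one prompt per level for an illustrative domain.
    for level in Level:
        print(level.name, "->", build_prompt(level, "Portuguese street addresses"))
```

Keeping the domain as a free-text parameter reflects the motivation stated above: the same three templates can be reused for any natural language or domain that an existing faking library does not yet cover.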