Prompting Datasets: Data Discovery with Conversational Agents
CoRR (2023)
Abstract
Can large language models assist in data discovery? Data discovery
predominantly happens via search on a data portal or the web, followed by
assessment of the dataset to ensure it is fit for the intended purpose. The
ability of conversational generative AI (CGAI) to support recommendations with
reasoning implies it can suggest datasets to users, explain why it has done so,
and provide information akin to documentation regarding the dataset in order to
support a use decision. We hold three workshops with data users and find that,
despite limitations around web capabilities, CGAIs are able to suggest relevant
datasets and provide many of the required sensemaking activities, as well as
support dataset analysis and manipulation. However, CGAIs may also suggest
fictional datasets and perform inaccurate analysis. We identify emerging
practices in data discovery and present a model of these to inform future
research directions and data prompt design.