PURPLE: Making a Large Language Model a Better SQL Writer
arXiv (2024)
Abstract
Large Language Model (LLM) techniques play an increasingly important role in
Natural Language to SQL (NL2SQL) translation. LLMs trained on extensive corpora
have strong natural language understanding and basic SQL generation abilities
without additional tuning specific to NL2SQL tasks. Existing LLM-based NL2SQL
approaches try to improve the translation by enhancing the LLMs with an
emphasis on user intention understanding. However, LLMs sometimes fail to
generate appropriate SQL because they lack the knowledge needed to organize
complex compositions of logical operators. A promising remedy is to prompt the
LLMs with demonstrations, i.e., known NL2SQL translations from various
databases, from which LLMs can learn how to organize operator compositions
for the given task. In this paper, we propose PURPLE (Pre-trained models
Utilized to Retrieve Prompts for Logical Enhancement), which improves accuracy
by retrieving demonstrations containing the requisite logical operator
composition for the NL2SQL task at hand, thereby guiding LLMs to produce better
SQL translation. PURPLE achieves a new state-of-the-art performance of 80.5%
exact-set match accuracy and 87.8% execution match accuracy on the validation
set of the popular NL2SQL benchmark Spider. PURPLE maintains high accuracy
across diverse benchmarks, budgetary constraints, and various LLMs, showing
robustness and cost-effectiveness.
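To make the retrieval idea concrete, the following is a minimal Python sketch of demonstration retrieval for few-shot NL2SQL prompting. It is an illustration under stated assumptions, not the paper's implementation: PURPLE predicts the requisite logical operator composition with a trained model and uses its own retrieval strategy and prompt design, whereas this sketch approximates operator composition with a simple SQL-keyword signature and takes the predicted operator set as given. All names (Demonstration, operator_signature, retrieve, build_prompt) are hypothetical.

```python
# A minimal sketch of demonstration retrieval for few-shot NL2SQL prompting,
# in the spirit of PURPLE. NOT the paper's implementation: operator
# composition is approximated by a crude SQL-keyword signature, and the
# predicted operator set is assumed to be given.

from dataclasses import dataclass

SQL_OPERATORS = {"SELECT", "WHERE", "JOIN", "GROUP BY", "HAVING",
                 "ORDER BY", "LIMIT", "UNION", "INTERSECT", "EXCEPT"}

@dataclass
class Demonstration:
    question: str
    schema: str
    sql: str

def operator_signature(sql: str) -> frozenset:
    """Approximate a query's logical operator composition by the set of
    clause keywords it contains."""
    upper = sql.upper()
    return frozenset(op for op in SQL_OPERATORS if op in upper)

def retrieve(pool, predicted_ops, k=4):
    """Rank demonstrations by overlap between their operator signature and
    the operator composition predicted for the new question."""
    ranked = sorted(pool,
                    key=lambda d: len(operator_signature(d.sql) & predicted_ops),
                    reverse=True)
    return ranked[:k]

def build_prompt(demos, schema, question):
    """Assemble a few-shot prompt from the retrieved demonstrations."""
    parts = ["Translate each question into SQL for the given schema.\n"]
    for d in demos:
        parts.append(f"Schema: {d.schema}\nQuestion: {d.question}\nSQL: {d.sql}\n")
    parts.append(f"Schema: {schema}\nQuestion: {question}\nSQL:")
    return "\n".join(parts)

if __name__ == "__main__":
    pool = [
        Demonstration("How many singers are there?",
                      "singer(id, name)",
                      "SELECT COUNT(*) FROM singer"),
        Demonstration("List stadium names ordered by capacity.",
                      "stadium(id, name, capacity)",
                      "SELECT name FROM stadium ORDER BY capacity"),
    ]
    predicted_ops = frozenset({"SELECT", "ORDER BY"})
    demos = retrieve(pool, predicted_ops, k=1)
    print(build_prompt(demos, "concert(id, year)",
                       "List concert years in ascending order."))
```

Running the example prints a prompt in which the ORDER BY demonstration is ranked first, illustrating how operator-aware retrieval steers the LLM toward the clause structure the target query needs.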