RRGcode: Deep hierarchical search-based code generation

JOURNAL OF SYSTEMS AND SOFTWARE(2024)

引用 0|浏览20
暂无评分
摘要
Retrieval-augmented code generation strengthens the generation model by using a retrieval model to select relevant code snippets from a code corpus. The synergy between retrieval and generation ensures that the generated code closely corresponds to the intended functionality. Existing methods simply feed the retrieved results to the generation model. However, if the retrieval corpus contains erroneous or sub-optimal code examples, there is a risk that the model may replicate these mistakes in the generated code. To tackle these problems, we propose RRGcode (Retrieval, Re-ranking, and Generation for code generation), a deep hierarchical search-based code generation framework that fine-tunes initial retrieved code rankings, reducing the risk of replicating errors from the retrieval corpus and enhancing the generation of higher-quality, more reliable code. Specifically, it first retrieves relevant code candidates from a large code corpus. Then, a reranking model reconstructs the search space through a detailed semantic comparison between code candidates and the query, ensuring that only the most relevant and accurate candidates are considered. Finally, the reranked top -K codes, along with the query, serve as input for the code generation model. Extensive experiments are conducted to evaluate the effectiveness of generated code by RRGcode, demonstrating state-of-the-art performance in code generation tasks.
更多
查看译文
关键词
Hierarchical search,Re-ranking,Semantic comparison,Code generation
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要