Graph Embedding Based Code Search In Software Project

INTERNETWARE'18: PROCEEDINGS OF THE TENTH ASIA-PACIFIC SYMPOSIUM ON INTERNETWARE (2018)

Abstract
Source code search is one of the most important ways to study and reuse software projects. Natural-language-based code search currently faces two main challenges: 1) more accurate search results are required as software projects become more heterogeneous and complex; 2) the semantic relationships between code elements (classes, methods, etc.) need to be shown so that developers can better understand their usage scenarios. To address these issues, we propose a novel approach to searching a software project's source code based on graph embedding. First, we automatically build a code graph from the project's source code and represent each code element in the graph with a graph embedding. Second, we search the code graph with natural language questions and return, as the best answer to each question, the subgraph composed of the relevant code elements and the relationships between them. In our experiments, we use two well-known open source projects, Apache Lucene and POI, as examples for source code search tasks. The results show that our approach improves the F1-score by 10% over the existing shortest-path-based code graph search approach, while reducing the average response time by a factor of about 60.
Keywords
Software reuse, Code search, Code graph, Graph embedding
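
The two steps described in the abstract (embed the elements of a code graph, then answer a question with a subgraph of related elements) can be illustrated with a minimal sketch. It is not the authors' implementation. The element names, the edge labels, and the 2-D vectors are all hypothetical stand-ins for what a real graph-embedding model would learn, and the selection rule (one candidate per query term, chosen to minimize total pairwise embedding distance) is an assumption made only for illustration.

```python
from itertools import combinations, product
import math

# Toy code graph as (source, relation, target) triples.
# All element names and relations are hypothetical.
EDGES = [
    ("sheet.Row", "contains", "sheet.Cell"),
    ("doc.Table", "contains", "doc.Cell"),
]

# Made-up 2-D vectors standing in for learned graph embeddings:
# elements that are close in the graph get nearby vectors.
EMBEDDINGS = {
    "sheet.Row": (1.0, 0.0),
    "sheet.Cell": (0.9, 0.1),
    "doc.Table": (0.0, 1.0),
    "doc.Cell": (0.1, 0.9),
}


def candidates(term):
    """Code elements whose simple name equals the query term (case-insensitive)."""
    return [n for n in EMBEDDINGS if n.split(".")[-1].lower() == term.lower()]


def spread(nodes):
    """Sum of pairwise Euclidean distances between the embeddings of the nodes."""
    return sum(
        math.dist(EMBEDDINGS[a], EMBEDDINGS[b]) for a, b in combinations(nodes, 2)
    )


def search(query):
    """Return (nodes, edges) of the subgraph chosen for a natural language query."""
    # Keep only the query terms that match at least one code element.
    terms = [t for t in query.split() if candidates(t)]
    if not terms:
        return set(), []
    # Pick one candidate per term so that the chosen elements lie close
    # together in embedding space (assumed selection rule, see lead-in).
    best = min(product(*(candidates(t) for t in terms)), key=spread)
    chosen = set(best)
    # The answer subgraph keeps the relationships among the chosen elements.
    edges = [e for e in EDGES if e[0] in chosen and e[2] in chosen]
    return chosen, edges


if __name__ == "__main__":
    print(search("row cell"))
```

In this toy setup the term "cell" is ambiguous between two elements, and the embedding distance to the element matched by "row" is what resolves it. Comparing vectors this way avoids traversing the graph at query time, which is in the spirit of the response-time comparison with shortest-path search mentioned in the abstract.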