A novel framework for the efficient evaluation of hybrid tree-pattern queries on large data graphs.

Inf. Syst.(2023)

引用 0|浏览7
暂无评分
摘要
Graph pattern matching is a fundamental operation for exploring and analyzing large graphs, which are widely used to represent entities and their relationships in a plethora of applications. However, existing algorithms do not perform satisfactorily as they generate a large number of intermediate results. In this paper, we introduce a novel framework to address the problem of efficiently finding homomorphic matches of hybrid tree-patterns over large graphs. These are patterns that can include two types of edges: child edges (which are mapped to edges in the data graph) and/or descendant edges (which are mapped to paths in the data graph), thus allowing for higher expressiveness and flexibility in query formulation. We introduce the concept of answer graph to compactly represent the query results and exploit computation sharing. Leveraging the answer graph, our approach is able to avoid the generation of intermediate results not participating in the query answer, and can produce the query answer in time linear on the size of the input data graph and the output. Moreover, the answer graph allows query counting without explicitly enumerating query results. We design a bottom-up algorithm (BUP) and a simulation-based algorithm (SIM) for building the answer graph, each having their own strengths in different cases. We also present two algorithms for enumerating the results using the answer graph. Our extensive experimental evaluation on real and synthetic datasets shows that our proposed techniques greatly outperform the state-of-the-art graph matching algorithms as well as a popular graph DBMS on evaluating various hybrid tree-pattern queries. Our source code, datasets and queries are publicly available at https://github.com/wuxyng/TwigMatchOnGraph. & COPY; 2023 Elsevier Ltd. All rights reserved.
更多
查看译文
关键词
queries,tree-pattern
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要