Graph summarization for indexing paths in graph-structured data (2003)

Abstract
With the rapidly increasing popularity of XML for data representation, there is considerable interest in query processing over data that conforms to a labeled-tree or labeled-graph data model. An important part of querying such data is traversing the labeled graph by means of path expressions. This thesis studies auxiliary data structures known as path indexes that are intended to speed up the evaluation of path expressions. The approach to path indexing adopted in this thesis is to construct a smaller summary graph and then evaluate path expressions over this smaller graph. We first describe formally why the problem of path indexing differs from the traditional indexing problem in database systems. Next, we propose an index specification framework that can be used to define a wide variety of path indexes, each covering a different set of path expressions. For a large class of index specifications, the techniques used in this framework yield the smallest path index that is suitable for the respective set of path expressions. These techniques are based on the notion of graph bisimilarity. We then study how XML queries can be processed in a native XML database management system using path indexes in conjunction with inverted lists. We also analyze how this integration of path indexing and inverted lists can be used to answer information-retrieval-style, relevance-based queries. The algorithms we obtain in this context have the property of instance optimality, a notion of optimality recently introduced by Fagin et al. Finally, in the last part of this thesis, we study how the path indexes we propose can be maintained as the underlying data changes.
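To make the summary-graph idea concrete, the following is a minimal sketch, not the thesis's algorithm. It assumes a small labeled graph, groups nodes by backward bisimilarity (same label, and parents drawn from the same groups) using naive iterative refinement, and then evaluates a simple label path over the resulting summary. All names (`backward_bisimulation`, `build_summary`, `eval_path`) and the sample data are hypothetical.

```python
from collections import defaultdict

def backward_bisimulation(labels, edges):
    """Partition nodes so that nodes in one block share a label and have
    parents in the same set of blocks (naive iterative refinement)."""
    parents = defaultdict(set)
    for src, dst in edges:
        parents[dst].add(src)
    # Start from the partition by label.
    block = dict(labels)
    while True:
        # Signature: the node's current block plus the blocks of its parents.
        sig = {n: (block[n], frozenset(block[p] for p in parents[n]))
               for n in labels}
        ids = {}
        new_block = {n: ids.setdefault(sig[n], len(ids)) for n in labels}
        # Each round only splits blocks, so an unchanged count means stable.
        if len(set(new_block.values())) == len(set(block.values())):
            return new_block
        block = new_block

def build_summary(labels, edges):
    """Return (extent, block_label, summary_edges) for the summary graph."""
    block = backward_bisimulation(labels, edges)
    extent = defaultdict(set)
    for node, b in block.items():
        extent[b].add(node)
    block_label = {b: labels[next(iter(ns))] for b, ns in extent.items()}
    summary_edges = {(block[s], block[d]) for s, d in edges}
    return extent, block_label, summary_edges

def eval_path(path, extent, block_label, summary_edges):
    """Evaluate a label path (e.g. ["person", "name"]) that may start at any
    node, using only the summary graph, and return matching data nodes."""
    current = {b for b, lab in block_label.items() if lab == path[0]}
    for label in path[1:]:
        current = {d for s, d in summary_edges
                   if s in current and block_label[d] == label}
    return set().union(*(extent[b] for b in current))

# Hypothetical sample data: node id -> label, plus parent-child edges.
labels = {1: "site", 2: "person", 3: "person", 4: "name",
          5: "name", 6: "item", 7: "name"}
edges = [(1, 2), (1, 3), (2, 4), (3, 5), (1, 6), (6, 7)]

extent, block_label, summary_edges = build_summary(labels, edges)
person_names = eval_path(["person", "name"], extent, block_label, summary_edges)
item_names = eval_path(["item", "name"], extent, block_label, summary_edges)
```

In this sample the seven data nodes collapse into five summary nodes: the two `person` nodes share one block, and the `name` nodes under `person` are kept apart from the `name` node under `item`. Because nodes in a block have the same incoming label paths, the path query can be answered on the summary alone.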
Keywords
graph-structured data, resulting path index, inverted list, path indexing, labeled-graph data model, path index, path expression, index specification, indexing path, graph summarization, underlying data change, thesis studies auxiliary data, data representation, indexation, structured data