Provenance Data Storage

Peter Macko, Nicolas Ward

mag(2013)

Abstract
Provenance research has generally focused on issues with data collection and organization. Most approaches represent stored provenance data as a directed acyclic graph (DAG), where objects such as files and processes are nodes in the graph and directed edges specify ancestry relationships between them. While there has been some work addressing logical compression of these provenance graphs, efficient physical storage of provenance data remains unaddressed. In approaching this problem, we implemented and evaluated several techniques tailored for provenance storage, which were inspired by existing representations of general semi-structured data. We considered variants of vertical partitioning, PASS, and RDF, varying two kinds of compression. We compared query runtime, disk usage, and data load time across these storage methods. Our results indicate that vertical partitioning performs best in most cases, while the benefit of compression varies by query.
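
To make the storage idea concrete, here is a minimal sketch of a provenance DAG stored with vertical partitioning, that is, one two-column (subject, value) table per property. It is an illustration under stated assumptions, not the authors' implementation: the node names, the property names ("type", "input", "generated_by"), and the helper functions are all hypothetical, and no compression is applied.

```python
from collections import defaultdict

# Hypothetical provenance records as (subject, property, value) triples.
# Node and property names are invented for this sketch.
TRIPLES = [
    ("proc:sort", "type", "process"),
    ("file:in.txt", "type", "file"),
    ("file:out.txt", "type", "file"),
    ("proc:sort", "input", "file:in.txt"),
    ("file:out.txt", "generated_by", "proc:sort"),
]

# Properties treated as ancestry edges (an assumption of this sketch).
ANCESTRY_PROPERTIES = ("input", "generated_by")


def vertically_partition(triples):
    """Group triples into one two-column (subject, value) table per property."""
    tables = defaultdict(list)
    for subject, prop, value in triples:
        tables[prop].append((subject, value))
    # Sorting each table by subject keeps the rows for one node adjacent.
    return {prop: sorted(rows) for prop, rows in tables.items()}


def ancestors(tables, node):
    """Return every node reachable from `node` along ancestry edges."""
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        for prop in ANCESTRY_PROPERTIES:
            for subject, value in tables.get(prop, []):
                if subject == current and value not in seen:
                    seen.add(value)
                    stack.append(value)
    return seen


tables = vertically_partition(TRIPLES)
result = ancestors(tables, "file:out.txt")
```

An ancestry query only reads the tables for the edge properties it follows, so the "type" table is never scanned. Touching fewer tables per query is the usual argument for vertical partitioning over a single wide triple table.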