An open dataset of data lineage graphs for data governance research

Visual Informatics (2024)

Abstract
Data have become valuable assets for enterprises. Data governance aims to manage and reuse data assets to facilitate enterprise management and product innovation. A data lineage graph (DLG) is an abstracted collection of data assets and the data lineages among them. Analyzing DLGs can provide rich insights for data governance. However, progress in data governance technologies is hindered by a shortage of open DLG datasets. This paper introduces an open dataset of DLGs, covering the DLG model, the dataset construction process, and application areas. The real-world dataset is sourced from Huawei Cloud Computing Technology Company Limited and contains 18 DLGs with three types of data assets and two types of relations. To the best of our knowledge, it is the first open dataset of DLGs for data governance. The dataset can also support work in other application areas, such as graph analytics and visualization.
Keywords
Data asset, Data governance, Data lineage, Graph, Open dataset
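
The DLG model summarized in the abstract (typed data assets connected by typed relations) can be pictured as a small typed directed graph. The following is a minimal sketch under stated assumptions, not the dataset's actual schema: the abstract gives only the counts (three asset types, two relation types), so the type labels below are placeholders, and the class name `LineageGraph`, its methods, and the edge direction (source feeds target) are all hypothetical.

```python
from collections import defaultdict, deque
from dataclasses import dataclass, field

# Placeholder labels (assumption): the abstract states only that there are
# three asset types and two relation types, not what they are called.
ASSET_TYPES = {"asset_type_1", "asset_type_2", "asset_type_3"}
RELATION_TYPES = {"relation_type_1", "relation_type_2"}


@dataclass
class LineageGraph:
    """Hypothetical container: typed assets joined by typed, directed relations."""

    assets: dict = field(default_factory=dict)  # asset id -> asset type
    edges: list = field(default_factory=list)   # (source, target, relation type)

    def add_asset(self, asset_id, asset_type):
        if asset_type not in ASSET_TYPES:
            raise ValueError(f"unknown asset type: {asset_type}")
        self.assets[asset_id] = asset_type

    def add_relation(self, source, target, relation_type):
        if relation_type not in RELATION_TYPES:
            raise ValueError(f"unknown relation type: {relation_type}")
        if source not in self.assets or target not in self.assets:
            raise KeyError("both endpoints must be added as assets first")
        # Assumed direction: data flows from source to target.
        self.edges.append((source, target, relation_type))

    def upstream(self, asset_id):
        """Return all assets that asset_id depends on, directly or transitively."""
        parents = defaultdict(set)
        for source, target, _ in self.edges:
            parents[target].add(source)
        seen, queue = set(), deque([asset_id])
        while queue:  # breadth-first walk against the edge direction
            for parent in parents[queue.popleft()]:
                if parent not in seen:
                    seen.add(parent)
                    queue.append(parent)
        return seen


# Usage: a three-asset chain a -> b -> c.
graph = LineageGraph()
graph.add_asset("a", "asset_type_1")
graph.add_asset("b", "asset_type_2")
graph.add_asset("c", "asset_type_3")
graph.add_relation("a", "b", "relation_type_1")
graph.add_relation("b", "c", "relation_type_2")
print(sorted(graph.upstream("c")))
```

Tracing upstream assets is one typical lineage query; the same structure could be handed to a general graph library for the graph analytics and visualization uses the abstract mentions.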