Ranking on Very Large Knowledge Graphs

Proceedings of the 30th ACM Conference on Hypertext and Social Media (2019)

Abstract
Ranking plays a central role in a large number of applications driven by RDF knowledge graphs. Over the last years, many popular RDF knowledge graphs have grown so large that rankings for the facts they contain cannot be computed directly on the currently common 64-bit platforms. In this paper, we tackle two problems: computing ranks on such large knowledge bases efficiently, and computing them incrementally. First, we present D-HARE, a distributed approach for computing ranks on very large knowledge graphs. D-HARE assumes the random surfer model and relies on data partitioning to compute matrix multiplications and transpositions on disk for matrices of arbitrary size. Moreover, the data partitioning underlying D-HARE allows most of its steps to be executed in parallel. As very large knowledge graphs are often updated periodically, we tackle the incremental computation of ranks on large knowledge bases as a second problem. We address this problem by presenting I-HARE, an approximation technique for calculating the overall ranking scores of a knowledge graph without the need to recalculate the ranking from scratch at each new revision. We evaluate our approaches by calculating ranks on the $3 \times 10^9$ and $2.4 \times 10^9$ triples from Wikidata and LinkedGeoData, respectively. Our evaluation demonstrates that D-HARE is the first holistic approach for computing ranks on very large RDF knowledge graphs. In addition, our incremental approach achieves a root mean squared error of less than $10^{-7}$ in the best case. Both D-HARE and I-HARE are open-source and are available at https://github.com/dice-group/incrementalHARE.
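
The abstract names three ingredients: the random surfer model, matrix products computed over data partitions, and a root mean squared error between rankings. The following is a minimal sketch of how these fit together on a toy graph, not the authors' implementation. The function names, the damping factor of 0.85, the in-memory row blocks (standing in for on-disk partitions), and the 3-node transition matrix are all assumptions made for illustration.

```python
from math import sqrt


def partition_rows(matrix, block_size):
    """Split a row-major matrix into consecutive row blocks (hypothetical helper)."""
    return [matrix[i:i + block_size] for i in range(0, len(matrix), block_size)]


def blockwise_matvec(blocks, vector):
    """Multiply a partitioned matrix by a vector, one block at a time.

    Each block only needs the vector and its own rows, so in a larger
    setting the blocks could be read from disk or handled by separate workers.
    """
    result = []
    for block in blocks:
        for row in block:
            result.append(sum(a * x for a, x in zip(row, vector)))
    return result


def random_surfer_ranks(transition, damping=0.85, block_size=2,
                        tol=1e-12, max_iter=1000):
    """Power iteration on a column-stochastic matrix (assumed input format)."""
    n = len(transition)
    blocks = partition_rows(transition, block_size)
    ranks = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        walked = blockwise_matvec(blocks, ranks)
        # Follow a link with probability `damping`, otherwise jump uniformly.
        updated = [damping * w + (1.0 - damping) / n for w in walked]
        if max(abs(u - r) for u, r in zip(updated, ranks)) < tol:
            return updated
        ranks = updated
    return ranks


def rmse(a, b):
    """Root mean squared error between two rank vectors of equal length."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


# Toy 3-node graph: column j holds the outgoing probabilities of node j.
TRANSITION = [
    [0.0, 0.5, 1.0],
    [0.5, 0.0, 0.0],
    [0.5, 0.5, 0.0],
]

exact = random_surfer_ranks(TRANSITION, block_size=2)
rough = random_surfer_ranks(TRANSITION, block_size=1, max_iter=5)
print("ranks:", exact)
print("RMSE of 5-iteration estimate:", rmse(exact, rough))
```

The block size changes only how the product is scheduled, not the resulting ranks, which is what makes partitioning attractive when the matrix does not fit in memory. The RMSE here compares a truncated run against a converged one; it only illustrates the error measure and does not reproduce the incremental technique itself.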
Keywords
knowledge graphs, random surfer model, ranking RDF