Plogs: Materializing Datalog Programs With MapReduce For Scalable Reasoning
2016 INT IEEE CONFERENCES ON UBIQUITOUS INTELLIGENCE & COMPUTING, ADVANCED & TRUSTED COMPUTING, SCALABLE COMPUTING AND COMMUNICATIONS, CLOUD AND BIG DATA COMPUTING, INTERNET OF PEOPLE, AND SMART WORLD CONGRESS (UIC/ATC/SCALCOM/CBDCOM/IOP/SMARTWORLD)(2016)
Abstract
With the rapid growth of semantic data, scalable reasoning has attracted increasing attention. However, most existing work on scalable reasoning focuses only on RDFS or OWL ter Horst semantics, which are small fragments of OWL 2 RL and are limited in expressivity. Because OWL 2 RL semantics extended with SWRL rules can be expressed in Datalog, materialization of Datalog programs is widely adopted in traditional reasoners. In this paper, we propose a dependency-aware approach to the parallel materialization of Datalog programs for scalable reasoning. We first present an algorithm that automates the translation of a Datalog rule execution into MapReduce jobs, and we make several optimizations to the algorithm to speed up rule evaluation. Because dependencies among rules make the rule execution order significant for reasoning performance, we then propose a sampling-based method to capture rule dependencies and design a dependency-aware strategy to schedule rule evaluation. Finally, we build a system to evaluate the proposed approach with a series of semantic rule sets on large synthetic and real knowledge bases. The experimental results show that the proposed optimizations are effective and that our system achieves approximately linear scalability.
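To make the idea of running one Datalog rule as a MapReduce job concrete, the following is a minimal in-memory sketch. It is not the paper's translation algorithm and includes none of its optimizations or dependency-aware scheduling. It assumes a single illustrative rule, `path(X, Z) :- edge(X, Y), path(Y, Z)`, and simulates the shuffle with a dictionary. All function names (`map_phase`, `reduce_phase`, `run_job`, `materialize`) are hypothetical.

```python
from collections import defaultdict

# Illustrative rule (an assumption, not taken from the paper):
#   path(X, Z) :- edge(X, Y), path(Y, Z).
# Facts are tuples of the form (predicate, arg1, arg2).

def map_phase(facts):
    """Emit (join_key, tagged_value) pairs keyed on the shared variable Y."""
    for pred, a, b in facts:
        if pred == "edge":
            yield b, ("edge", a)   # edge(X, Y): key is Y, value keeps X
        elif pred == "path":
            yield a, ("path", b)   # path(Y, Z): key is Y, value keeps Z

def reduce_phase(values):
    """Join the two sides that share a key and emit head facts path(X, Z)."""
    xs = [v for tag, v in values if tag == "edge"]
    zs = [v for tag, v in values if tag == "path"]
    for x in xs:
        for z in zs:
            yield ("path", x, z)

def run_job(facts):
    """Simulate one MapReduce job: map, group by key (shuffle), reduce."""
    groups = defaultdict(list)
    for key, value in map_phase(facts):
        groups[key].append(value)
    derived = set()
    for values in groups.values():
        derived.update(reduce_phase(values))
    return derived

def materialize(edges):
    """Repeat the job until no new facts appear (naive fixpoint)."""
    facts = {("edge", a, b) for a, b in edges}
    facts |= {("path", a, b) for a, b in edges}  # base rule: path(X, Y) :- edge(X, Y)
    while True:
        new_facts = run_job(facts) - facts
        if not new_facts:
            return facts
        facts |= new_facts
```

Calling `materialize([("a", "b"), ("b", "c"), ("c", "d")])` derives the transitive `path` facts over the chain. Each loop iteration corresponds to one job, which is why the number of jobs and the order in which rules are evaluated matter when several dependent rules are involved.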
Keywords
Semantic Web, Datalog, MapReduce, Parallel Inference