Reasoning and querying web-scale open data based on DL-Lite_A in a divide-and-conquer way

Journal of Web Semantics (2019)

Abstract
We propose to use DL-Lite_A techniques to reason over and query Web-scale Open Data (knowledge bases) described by Semantic Web standards such as RDF and OWL, owing to the low reasoning complexity and suitable expressivity of the language. At real-life scale, however, reasoning and query answering may become infeasible for two reasons. First, for both satisfiability checking and conjunctive query answering, the number of queries that must be answered over the data layer of a knowledge base (KB) may be polynomial in the size of the KB's schema knowledge. Second, for KBs with massive numbers of individual assertions, evaluating even a single query over the data layer may be highly time-consuming. This motivates a divide-and-conquer reasoning and query answering approach for DL-Lite_A. The basic idea is to partition both KBs and queries into smaller chunks and to decompose the original reasoning and query answering tasks into a group of independent sub-tasks, so that overall performance can be improved through parallelization and distribution. The challenge in designing such an approach lies in carrying out the partitioning and the reasoning reduction in a sound and complete way. Motivated by hash partitioning of RDF graphs, we expect the smaller KB chunks to have the local feature for both satisfiability checking and simple-query answering. Here, simple-queries are conjunctive queries whose query atoms share a common variable or individual. For query answering, we expect to partition a query into smaller simple-queries and evaluate them over the smaller KB chunks. Under these expectations, we construct our divide-and-conquer approach from both theoretical and practical perspectives.
Theoretically, we present definitions of KB partitions and query partitions and identify necessary and sufficient conditions for determining whether a KB partition has the desired features. Practically, based on the theoretical results, we describe concrete ways of partitioning KBs and queries and of evaluating query partitions over KB partitions. Moreover, we provide a strategy for optimizing the evaluation of query partitions over KB partitions to improve overall query answering performance. To verify our approach, we chose two Web-scale open datasets, DBpedia and the BTC 2012 dataset. The empirical results indicate that the approach opens new possibilities for realizing performance-critical applications on the Web with both high expressivity and scalability.
Keywords
open data, web-scale, DL-Lite_A, divide-and-conquer
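
The abstract names two building blocks, hash partitioning of RDF graphs and simple-queries (conjunctive queries whose atoms share a common variable or individual). The sketch below only illustrates those two notions and is not the paper's algorithm: the triple and atom representations, the subject-based hashing, the greedy splitting rule, and all function names are assumptions made for this example.

```python
import zlib
from collections import defaultdict


def hash_partition(triples, n_chunks):
    """Assign each (subject, predicate, object) triple to a chunk by hashing
    its subject, so all triples about one subject land in the same chunk.
    Hashing on the subject is an assumption of this sketch."""
    chunks = defaultdict(list)
    for s, p, o in triples:
        chunks[zlib.crc32(s.encode("utf-8")) % n_chunks].append((s, p, o))
    return dict(chunks)


def is_simple_query(atoms):
    """An atom is (predicate, term, ...). The query is a simple-query if at
    least one term (variable or individual) occurs in every atom."""
    if not atoms:
        return False
    common = set(atoms[0][1:])
    for atom in atoms[1:]:
        common &= set(atom[1:])
    return bool(common)


def split_into_simple_queries(atoms):
    """Greedy, hypothetical splitting rule: add each atom to the first group
    with which it still shares a term, otherwise start a new group."""
    groups = []
    for atom in atoms:
        terms = set(atom[1:])
        for group in groups:
            shared = group["common"] & terms
            if shared:
                group["common"] = shared
                group["atoms"].append(atom)
                break
        else:
            groups.append({"common": terms, "atoms": [atom]})
    return [g["atoms"] for g in groups]


# Illustrative data only; these names do not come from DBpedia or BTC 2012.
query = [
    ("Person", "?x"),
    ("worksFor", "?x", "?y"),
    ("locatedIn", "?y", "Berlin"),
]
parts = split_into_simple_queries(query)
```

Here `query` is not itself a simple-query, because no single term occurs in all three atoms, while each group in `parts` is. How the answers of the sub-queries are combined, and which conditions make a KB partition sound and complete for this purpose, are the subject of the paper and are not modeled here.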