Querying Semantic Knowledge Bases with SQL-on-Hadoop.

BeyondMR@SIGMOD (2017)

Abstract
The constant growth of semantically annotated data and an increasing interest in cross-domain knowledge bases raise the need for expressive RDF query languages and for novel approaches that enable their evaluation at web scale. However, SPARQL, the W3C standard query language for RDF, has only limited capability to express navigational queries. More expressive languages have been studied theoretically but not implemented. In this paper, we continue our work on TRIAL-QL, an expressive (SQL-like) RDF query language based on the Triple Algebra with Recursion [31]. We present a new version of our TRIAL-QL processor, which takes advantage of the current momentum in in-memory SQL-on-Hadoop solutions and is built on top of Impala and SPARK while using a single unified data store. We use our system to study the application of multiple evaluation algorithms, storage strategies, and optimizations on Impala and SPARK, highlighting their properties. Comprehensive experiments compare the performance of our system with that of other competitive RDF management systems. The results demonstrate its suitability for querying semantic knowledge bases: it provides interactive query response times for selective queries on datasets with more than one billion triples. More data-intensive use cases that produce, e.g., over 25 billion results finished on the order of minutes.