Semantic query transformations for increased parallelization in distributed knowledge graph query processing

Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (2019)

Abstract
Ontologies have become an increasingly popular semantic layer for integrating multiple heterogeneous datasets. However, significant challenges remain in supporting efficient and scalable processing of queries over data linked to ontologies (ontological queries). Processing ontological queries requires that explicitly defined query patterns be expanded to capture implicit ones, based on the available ontology inference axioms. In practice, for example in the biomedical domain, the complexity of the ontological axioms results in very large query expansions that present-day query processing infrastructure cannot support. In particular, it remains unclear how to parallelize such queries effectively. In this paper, we propose data and query transformations that enable inter-operator parallelism for ontological queries on Hadoop platforms. Our transformation techniques exploit ontological axioms, second-order data types, and operator rewritings to eliminate expensive query substructures and thereby increase parallelizability. Comprehensive experiments conducted on benchmark datasets show up to a 25X performance improvement over existing approaches.
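
To make the notion of query expansion concrete, the following is a minimal sketch of how a single explicit type pattern grows once subclass axioms are taken into account. The toy ontology, the class names, and the helper functions `descendants` and `expand_type_pattern` are all hypothetical and chosen only for illustration; this is not the transformation technique proposed in the paper.

```python
from collections import deque

# Hypothetical toy ontology: each class maps to its direct subclasses
# (subclass axioms stored as parent -> children). Not taken from the paper.
SUBCLASSES = {
    "Disease": ["InfectiousDisease", "GeneticDisease"],
    "InfectiousDisease": ["ViralInfection"],
}


def descendants(cls, subclasses):
    """Return cls followed by every class reachable through subclass axioms."""
    seen = [cls]
    queue = deque([cls])
    while queue:
        current = queue.popleft()
        for child in subclasses.get(current, []):
            if child not in seen:
                seen.append(child)
                queue.append(child)
    return seen


def expand_type_pattern(var, cls, subclasses):
    """Rewrite '?var rdf:type :cls' into one explicit pattern per implied class."""
    return [f"?{var} rdf:type :{c}" for c in descendants(cls, subclasses)]


# One explicit pattern becomes four once implicit subclasses are included.
patterns = expand_type_pattern("x", "Disease", SUBCLASSES)
for p in patterns:
    print(p)
```

Even in this small case one pattern turns into four alternatives. With deep class hierarchies and several expandable patterns per query, the number of combinations can grow multiplicatively, which illustrates the kind of large expansion the abstract refers to.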
Keywords
algorithms, and data management, and resource management including dynamic resource provisioning, cloud workflow, data, data analytics and frameworks supporting data analytics, graph and network algorithms, improved models, metadata, namespaces, performance or scalability of specific applications and respective software, scalable storage