SimE4KG: Distributed and Explainable Multi-Modal Semantic Similarity Estimation for Knowledge Graphs

2022 IEEE Fifth International Conference on Artificial Intelligence and Knowledge Engineering (AIKE)(2023)

引用 0|浏览30
暂无评分
摘要
In recent years, exciting sources of data have been modeled as knowledge graphs (KGs). This modeling represents both structural relationships and the entity-specific multi-modal data in KGs. In various data analytics pipelines and machine learning (ML), the task of semantic similarity estimation plays a significant role. Assigning similarity values to entity pairs is needed in recommendation systems, clustering, classification, entity matching/disambiguation and many others. Efficient and scalable frameworks are needed to handle the quadratic complexity of all-pair semantic similarity on Big Data KGs. Moreover, heterogeneous KGs demand multi-modal semantic similarity estimation to cover the versatile contents like categorical relations between classes or their attribute literals like strings, timestamps or numeric data. In this paper, we propose the SimE4KG framework as a resource providing generic open-source modules that perform semantic similarity estimation in multi-modal KGs. To justify the computational costs of similarity estimation, the SimE4KG generates reproducible, reusable and explainable results. The pipeline results are a native semantic RDF KG, including the experiment results, hyper-parameter setup and explanation of the results, like the most influential features. For fast and scalable execution in memory, we implemented the distributed approach using Apache Spark. The entire development of this framework is integrated into the holistic distributed Semantic ANalytics StAck (SANSA).
更多
查看译文
关键词
Semantic similarity,knowledge graphs,distributed computing,explainable artificial intelligence,scalable semantic processing,RDF,Apache Spark,machine learning
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要