kgbench: A Collection of Knowledge Graph Datasets for Evaluating Relational and Multimodal Machine Learning

SEMANTIC WEB, ESWC 2021 (2021)

Abstract
Graph neural networks and other machine learning models offer a promising direction for machine learning on relational and multimodal data. Until now, however, progress in this area has been difficult to gauge. This is primarily due to the limited number of datasets with (a) enough labeled nodes in the test set for precise measurement of performance, and (b) a rich enough variety of multimodal information to learn from. We introduce a set of new benchmark tasks for node classification on RDF-encoded knowledge graphs. We focus primarily on node classification, since this setting cannot be solved purely by node embedding models. For each dataset, we provide test and validation sets of at least 1000 instances, with some over 10000. Each task can be performed in a purely relational manner, or with multimodal information. All datasets are packaged in a CSV format that is easily consumable in any machine learning environment, together with the original source data in RDF and the pre-processing code for full provenance. We provide code for loading the data into NumPy and PyTorch, and we report the performance of several baseline models.
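To illustrate what a CSV-packaged node classification task could look like in practice, here is a minimal sketch using only the Python standard library. The column names, the integer-indexed triple layout, and the helper names `read_int_rows` and `neighbours` are assumptions made for illustration; they are not the actual kgbench file format or loading API.

```python
import csv
import io

# Hypothetical in-memory stand-ins for CSV files. The column layout is an
# assumption for illustration, not the actual kgbench file format.
TRIPLES_CSV = """subject,predicate,object
0,0,1
1,1,2
2,0,0
"""

TRAINING_CSV = """node,label
0,1
2,0
"""


def read_int_rows(text):
    """Parse a CSV with a header row into a list of integer tuples."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row
    return [tuple(int(cell) for cell in row) for row in reader if row]


def neighbours(triples, node):
    """Collect (predicate, object) pairs for all triples leaving `node`."""
    return [(p, o) for s, p, o in triples if s == node]


triples = read_int_rows(TRIPLES_CSV)
training = read_int_rows(TRAINING_CSV)

# Each labeled node can be paired with its outgoing edges, which is the
# purely relational input a node classifier would see.
for node, label in training:
    print(node, label, neighbours(triples, node))
```

Lists of integer tuples like these convert directly into array or tensor form, which is presumably what makes a plain CSV packaging convenient across machine learning environments.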
Keywords
Knowledge graphs, Machine learning, Message passing models, Multimodal learning