Toward FAIR Semantic Publishing of Research Dataset Metadata in the Open Research Knowledge Graph
arXiv (2024)
Abstract
Search engines these days can serve datasets as search results. Datasets get
picked up by search technologies based on structured descriptions on their
official web pages, informed by metadata ontologies such as the Dataset content
type of schema.org. Despite this promotion of the content type dataset as a
first-class citizen of search results, a vast proportion of datasets,
particularly research datasets, are still not discoverable and
therefore remain largely unused. This is due to the sheer volume of datasets
released every day and the inability of metadata to reflect a dataset's content
and context accurately. This work seeks to improve this situation for a
specific class of datasets, namely research datasets, which are the result of
research endeavors and are accompanied by a scholarly publication. We propose
the ORKG-Dataset content type, a specialized branch of the Open Research
Knowledge Graph (ORKG) platform, which provides descriptive information and a
semantic model for research datasets, integrating them with their accompanying
scholarly publications. This work aims to establish a standardized framework
for recording and reporting research datasets within the ORKG-Dataset content
type. This, in turn, increases research dataset transparency on the web for
their improved discoverability and applied use. In this paper, we present a
proposal – the minimum FAIR, comparable, semantic description of research
datasets in terms of salient properties of their supporting publication. We
design a specific application of the ORKG-Dataset semantic model based on 40
diverse research datasets on scientific information extraction.