Can Automated Metadata Extraction Make Scientific Data More Navigable?

2023 IEEE 19th International Conference on e-Science (e-Science) (2023)

Abstract
FAIR principles require that scientific data be findable, discoverable, and reusable by users. To enable FAIRness, practitioners of a science repository will often construct a rich, searchable index of metadata derived from the data. Unfortunately, manual metadata annotation methods do not scale to the many data files generated by many projects. Instead, automated extraction systems are needed to scalably parse these files (which often have nonstandard schemas requiring specialized parsing strategies) and deposit representative metadata into a search index. In this work, we evaluate whether, and to what extent, automatically extracted metadata make research repositories more navigable. We present a two-part user study conducted with scientists at two U.S. national laboratories from projects spanning spectroscopy and battery modeling. We constructed research indexes automatically by using the Xtract metadata extraction system. In the first part of our study, we learned about each user's role and identified key navigation concerns for scientists. We found that participants wished to navigate for purposes of discovery, retrieval, and organization. In the second part, participants completed simulated research data navigation tasks crafted to reflect real-world navigability concerns. We found that regardless of the interface used, participants consistently solved navigation tasks with high degrees of confidence and correctness, and significantly faster ($1.2\times$ to $50\times$) than with their alternative methods (e.g., manual directory scans or designing a customized navigational tool).
Keywords
metadata,file storage,information extraction