EXSCLAIM!: Harnessing materials science literature for self-labeled microscopy datasets

Patterns (2023)

Abstract
This work introduces the EXSCLAIM! toolkit for the automatic extraction, separation, and caption-based natural language annotation of images from scientific literature. EXSCLAIM! is used to show how rule-based natural language processing and image recognition can be leveraged to construct an electron microscopy dataset containing thousands of keyword-annotated nanostructure images. Moreover, it is demonstrated how a combination of statistical topic modeling and semantic word similarity comparisons can be used to increase the number and variety of keyword annotations beyond the standard annotations from EXSCLAIM!. With large-scale imaging datasets constructed from scientific literature, users are well positioned to train neural networks for classification and recognition tasks specific to microscopy, tasks that are often otherwise inhibited by a lack of sufficient annotated training data.
Keywords
materials science literature, self-labeled
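
The abstract describes rule-based, caption-driven keyword annotation of separated subfigures. The sketch below is not the EXSCLAIM! implementation; it only illustrates the general idea under stated assumptions: subfigure labels look like "(a)", "(b)", and the keyword vocabulary (`KEYWORDS`) and the function names `split_caption` and `annotate` are hypothetical, chosen here for illustration.

```python
import re

# Hypothetical keyword vocabulary (an assumption for this sketch);
# the abstract does not list the toolkit's actual rules or terms.
KEYWORDS = ("nanoparticle", "nanowire", "nanorod", "TEM", "SEM")

# Matches subfigure labels such as "(a)" or "(b)".
LABEL = re.compile(r"\(([a-z])\)")


def split_caption(caption):
    """Map each subfigure label to the caption text that follows it."""
    # With a capturing group, re.split returns
    # [preamble, label1, text1, label2, text2, ...].
    parts = LABEL.split(caption)
    return {
        label: text.strip(" ,.;")
        for label, text in zip(parts[1::2], parts[2::2])
    }


def annotate(caption):
    """Attach vocabulary keywords found in each subfigure's caption text."""
    result = {}
    for label, text in split_caption(caption).items():
        lowered = text.lower()
        result[label] = [
            kw
            for kw in KEYWORDS
            # Whole-word match, allowing a simple plural "s".
            if re.search(r"\b" + re.escape(kw.lower()) + r"s?\b", lowered)
        ]
    return result


if __name__ == "__main__":
    caption = "(a) TEM image of Au nanoparticles. (b) SEM image of ZnO nanowires."
    print(annotate(caption))
```

A fixed vocabulary like this only recovers terms it already knows, which is the limitation the abstract's topic modeling and word-similarity step is meant to address.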