Label-Free Data Mining of Scientific Literature by Unsupervised Syntactic Distance Analysis

JOURNAL OF PHYSICAL CHEMISTRY LETTERS (2023)

Abstract
Label-free data mining can efficiently feed large amounts of data from the vast scientific literature into artificial intelligence (AI) processing systems. Here, we demonstrate an unsupervised syntactic distance analysis (SDA) approach that is capable of mining chemical substances, functions, properties, and operations without annotation. This SDA approach was evaluated in several areas of research from the physical sciences and achieved performance in information mining comparable to that of supervised learning, as shown by its satisfactory scores of 0.62-0.72, 0.60-0.82, and 0.86-0.95 in precision, recall, and accuracy, respectively. We also showcase how our approach can assist robotic chemists programmed to perform research focused on double-perovskite colloidal nanocrystals, gold colloidal nanocrystals, oxygen evolution reaction catalysts, and enzyme-like catalysts by designing materials, formulations, and synthesis parameters based on data mined from 1.1 million literature references.