Transfer Learning Using Ensemble Neural Nets for Organic Solar Cell Screening

arXiv (2019)

Abstract
Organic solar cells are a promising technology for addressing the global clean energy crisis. However, generating candidate chemical compounds for solar cells is a time-consuming process that requires thousands of hours of laboratory analysis. The most important property of a solar cell is its power conversion efficiency, which depends on the highest occupied molecular orbital (HOMO) values of the donor molecules. Recently, machine learning techniques have proved very effective in building predictive models for the HOMO values of donor structures in organic photovoltaic cells (OPVs). Because experimental datasets are limited in size, current machine learning models are trained on calculations based on density functional theory (DFT). Molecular line notations such as SMILES and InChI are popular input representations for describing the molecular structure of donor molecules. The two notations encode different information: for example, SMILES defines bond types, while InChI defines protonation. In this work, we present an ensemble deep neural network architecture, called SINet, which harnesses both the SMILES and InChI molecular representations to predict HOMO values, and we leverage transfer learning from a large DFT-computed dataset, Harvard CEP, to build more robust predictive models for the relatively smaller HOPV datasets. The Harvard CEP dataset contains molecular structures and properties for 2.3 million candidate donor structures for OPVs, while HOPV contains DFT-computed values for 350 molecules and experimental values for 243 molecules. Our results demonstrate significant performance improvement from the use of transfer learning as well as from leveraging both molecular representations.
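To make the two input representations concrete, the following minimal Python sketch encodes one molecule in both line notations as fixed-length integer sequences, the kind of input a character-level network branch could consume. The abstract does not describe SINet's actual preprocessing, so the helper names `build_vocab` and `encode`, the padding scheme, and the sequence lengths are all assumptions made for illustration. Ethanol is used only because its SMILES and InChI strings are well known; it is not claimed to appear in Harvard CEP or HOPV.

```python
# Illustrative sketch only: helper names, padding id, and lengths are assumptions,
# not the preprocessing used by SINet.

def build_vocab(strings):
    """Map each character seen in `strings` to an integer id; 0 is reserved for padding."""
    chars = sorted(set("".join(strings)))
    return {ch: i + 1 for i, ch in enumerate(chars)}

def encode(text, vocab, max_len):
    """Turn a line notation into a fixed-length list of ids, right-padded with 0."""
    ids = [vocab[ch] for ch in text[:max_len]]
    return ids + [0] * (max_len - len(ids))

# Ethanol written in both notations.
smiles = "CCO"
inchi = "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"

# Each notation gets its own vocabulary because the character sets differ.
smiles_vocab = build_vocab([smiles])
inchi_vocab = build_vocab([inchi])

smiles_ids = encode(smiles, smiles_vocab, max_len=8)   # [1, 1, 2, 0, 0, 0, 0, 0]
inchi_ids = encode(inchi, inchi_vocab, max_len=40)
```

In a two-branch ensemble of the kind the abstract describes, each id sequence would feed its own branch, and the branch outputs would be combined to predict a single HOMO value. How SINet combines them is not stated here.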
Keywords
transfer learning, ensemble neural networks, organic solar cell screening, clean energy crisis, candidate chemical compounds, time-consuming process, power conversion efficiency, machine learning techniques, HOMO values, experimental datasets, current machine learning models, density functional theory, molecular line notations, SMILES, ensemble deep neural network architecture, InChI molecular representations, sizeable DFT-computed dataset (Harvard CEP), Harvard CEP dataset, predictive models, organic photovoltaic cells, molecular orbital values, HOPV datasets