GraSeq: Graph and Sequence Fusion Learning for Molecular Property Prediction
CIKM '20: The 29th ACM International Conference on Information and Knowledge Management, Virtual Event, Ireland, October 2020
Abstract
With the recent advancement of deep learning, molecular representation learning, which automates the discovery of feature representations of molecular structure, has attracted significant attention from both chemists and machine learning researchers. Deep learning can facilitate a variety of downstream applications, including bio-property prediction and chemical reaction prediction. Although current molecular representation learning algorithms based on SMILES strings or molecular graphs (via sequence modeling and graph neural networks, respectively) have achieved promising results, no existing work integrates the capabilities of both approaches in preserving molecular characteristics (e.g., atomic clusters, chemical bonds) for further improvement. In this paper, we propose GraSeq, a joint graph and sequence representation learning model for molecular property prediction. Specifically, GraSeq combines graph neural networks and recurrent neural networks in a complementary way to model the two types of molecular inputs, respectively. In addition, it is trained with a multitask loss over unsupervised reconstruction and various downstream tasks, using labeled datasets of limited size. In a variety of chemical property prediction tests, we demonstrate that our GraSeq model achieves better performance than state-of-the-art approaches.
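The training objective described in the abstract, a multitask loss over unsupervised reconstruction and a downstream task applied to a fused graph and sequence representation, can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the paper: fusion by concatenation, mean squared error as the reconstruction term, binary cross-entropy as the downstream term, the `recon_weight` value, and all function names. The toy vectors stand in for the outputs of a graph encoder and a sequence encoder.

```python
import math

def fuse(graph_vec, seq_vec):
    """Fuse two molecule-level embeddings by concatenation (an assumed fusion choice)."""
    return list(graph_vec) + list(seq_vec)

def binary_cross_entropy(prob, label):
    """Supervised loss for one binary property label (assumed downstream task)."""
    eps = 1e-12  # guards against log(0)
    return -(label * math.log(prob + eps) + (1 - label) * math.log(1 - prob + eps))

def reconstruction_loss(original, reconstructed):
    """Unsupervised term: mean squared error between an input and its reconstruction."""
    return sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)

def multitask_loss(prob, label, original, reconstructed, recon_weight=0.5):
    """Weighted sum of the supervised and unsupervised terms (the weight is hypothetical)."""
    supervised = binary_cross_entropy(prob, label)
    unsupervised = reconstruction_loss(original, reconstructed)
    return supervised + recon_weight * unsupervised

# Toy values standing in for encoder outputs and model predictions.
graph_vec = [0.2, -0.1]      # hypothetical graph-encoder embedding
seq_vec = [0.7, 0.4, 0.0]    # hypothetical sequence-encoder embedding
fused = fuse(graph_vec, seq_vec)

loss = multitask_loss(prob=0.8, label=1,
                      original=[1.0, 0.0], reconstructed=[0.9, 0.1])
```

In this sketch the reconstruction term needs no labels, which is why such a term can still contribute to training when labeled data is limited.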
Keywords
Molecular Representation Learning, Sequence Model, Graph Neural Network, Fusion Learning