Deep Manifold Transformation for Protein Representation Learning
CoRR(2024)
Abstract
Protein representation learning is critical to many tasks in biology, such
as drug design and protein structure or function prediction, and has
primarily benefited from protein language models and graph neural networks.
These models can capture intrinsic patterns from protein sequences and
structures through masking and task-related losses. However, the learned
protein representations are usually not well optimized, which degrades
performance when data are limited and makes adaptation to new tasks difficult. To
address this, we propose a new deep manifold transformation approach for
universal protein representation learning (DMTPRL). It employs manifold
learning strategies to improve the quality and adaptability of the learned
embeddings. Specifically, we apply a novel manifold learning loss during
training based on the graph inter-node similarity. Our proposed DMTPRL method
outperforms state-of-the-art baselines on diverse downstream tasks across
popular datasets. This validates our approach for learning universal and robust
protein representations. We promise to release the code after acceptance.
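
The abstract does not give the form of the manifold learning loss, so the following is only an illustrative sketch of the general idea: penalize the mismatch between pairwise node similarities computed from input features and those computed from learned embeddings. The Student-t style kernel, the cross-entropy form, and the names `similarity` and `manifold_loss` are assumptions made for illustration, not the DMTPRL formulation.

```python
import numpy as np

def pairwise_sq_dists(x):
    """Squared Euclidean distances between all rows of x, shape (n, n)."""
    sq = np.sum(x * x, axis=1)
    d = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)
    return np.maximum(d, 0.0)  # clip tiny negatives caused by round-off

def similarity(x, nu=1.0):
    """Assumed kernel: Student-t style map from distances to values in (0, 1]."""
    d = pairwise_sq_dists(x)
    return (1.0 + d / nu) ** (-(nu + 1.0) / 2.0)

def manifold_loss(features, embeddings, eps=1e-8):
    """Mean binary cross-entropy between the two similarity matrices.

    Only off-diagonal (inter-node) pairs contribute. This is a hypothetical
    stand-in for a similarity-preserving loss, not the authors' definition.
    """
    p = np.clip(similarity(features), eps, 1.0 - eps)
    q = np.clip(similarity(embeddings), eps, 1.0 - eps)
    ce = -(p * np.log(q) + (1.0 - p) * np.log(1.0 - q))
    mask = ~np.eye(len(features), dtype=bool)
    return float(ce[mask].mean())

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    nodes = rng.normal(size=(6, 4))       # toy per-residue input features
    unrelated = rng.normal(size=(6, 2))   # embeddings that ignore the input geometry
    print(manifold_loss(nodes, nodes), manifold_loss(nodes, unrelated))
```

In this sketch the loss is smallest when the embedding reproduces the input similarity structure, which is the property a training objective of this kind would push toward. In practice the computation would be written in an autodiff framework and added to the task loss.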
Keywords
Protein representation learning, sequence, structure, manifold learning