Ancestral protein sequence reconstruction using a tree-structured Ornstein-Uhlenbeck variational autoencoder

International Conference on Learning Representations (ICLR), 2022

Abstract
We introduce a deep generative model for representation learning of biological sequences that, unlike existing models, explicitly represents the evolutionary process. The model makes use of a tree-structured Ornstein-Uhlenbeck process, obtained from a given phylogenetic tree, as an informative prior for a variational autoencoder. We show the model performs well on the task of ancestral sequence reconstruction of single protein families. Our results and ablation studies indicate that the explicit representation of evolution using a suitable tree-structured prior has the potential to improve representation learning of biological sequences considerably. Finally, we briefly discuss extensions of the model to genomic-scale data sets and the case of a latent phylogenetic tree.
Keywords
biological sequences, variational autoencoders, latent representations, Ornstein-Uhlenbeck process, evolution
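
To make the idea of a tree-structured Ornstein-Uhlenbeck prior from the abstract more concrete, here is a minimal sketch, not the authors' implementation. It assumes a stationary, zero-mean OU process with independent latent dimensions, for which the covariance between two tree nodes is (sigma^2 / (2 alpha)) * exp(-alpha * d), where d is the path length between them along the tree. The toy tree, the parameter names `alpha` and `sigma`, the helper functions, and the final Gaussian-conditioning step for ancestral latents are all illustrative assumptions; the decoder that would map latent vectors to protein sequences is omitted.

```python
import numpy as np


def tree_distances(parent, branch_length):
    """Pairwise path lengths between all nodes of a rooted tree.

    parent[i] is the index of node i's parent (-1 for the root).
    Nodes must be ordered so that parent[i] < i.
    branch_length[i] is the length of the branch above node i.
    """
    n = len(parent)
    dist = np.zeros((n, n))
    for i in range(1, n):
        p = parent[i]
        # Any path from node i to an earlier node j passes through i's parent.
        dist[i, :i] = dist[p, :i] + branch_length[i]
        dist[:i, i] = dist[i, :i]
    return dist


def ou_tree_covariance(dist, alpha, sigma):
    """Covariance of a stationary OU process evaluated at the tree nodes."""
    return sigma**2 / (2.0 * alpha) * np.exp(-alpha * dist)


# Hypothetical toy tree: node 0 is the root, node 1 an internal ancestor,
# nodes 2, 3 and 4 are leaves (observed sequences).
parent = [-1, 0, 1, 1, 0]
branch_length = [0.0, 0.5, 0.2, 0.3, 1.0]
alpha, sigma = 1.0, 1.0  # assumed OU parameters
latent_dim = 2

dist = tree_distances(parent, branch_length)
cov = ou_tree_covariance(dist, alpha, sigma)

# Draw one latent vector per node from the prior; each latent dimension
# is an independent OU process on the same tree.
rng = np.random.default_rng(0)
z = rng.multivariate_normal(np.zeros(len(parent)), cov, size=latent_dim).T

# Standard Gaussian conditioning: expected ancestral latents given leaf latents.
leaves, ancestors = [2, 3, 4], [0, 1]
K_al = cov[np.ix_(ancestors, leaves)]
K_ll = cov[np.ix_(leaves, leaves)]
z_anc_mean = K_al @ np.linalg.solve(K_ll, z[leaves])
```

In a variational autoencoder, a covariance of this kind would take the place of the usual independent standard-normal prior over latent vectors, so that sequences close together on the tree are encouraged to have similar representations.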