Sensitive inference of alignment-safe intervals from biodiverse protein sequence clusters using EMERALD

bioRxiv (2023)

Abstract: Sequence alignments are the foundations of life science research, but most innovation so far focuses on optimal alignments, while information derived from suboptimal solutions is ignored. We argue that one optimal alignment per pairwise sequence comparison is a reasonable approximation when dealing with very similar sequences but is insufficient when exploring the biodiversity of the protein universe at tree-of-life scale. To overcome this limitation, we introduce pairwise alignment-safety to uncover the amino acid positions robustly shared across all suboptimal solutions. We implement EMERALD, a software library for alignment-safety inference, and apply it to 400k sequences from the SwissProt database.
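To make the idea of alignment-safety concrete, the following is a minimal sketch and not EMERALD's implementation. It assumes a toy scoring scheme (fixed match/mismatch scores and a linear gap penalty) in place of a substitution matrix with affine gaps. The function name `safe_pairs` and the parameters `delta` (score tolerance below the optimum) and `alpha` (fraction of alignments that must contain a pair) are hypothetical names chosen for illustration. The sketch keeps every alignment-graph edge that lies on at least one alignment scoring within `delta` of the optimum, counts all paths through that subgraph, and reports the aligned position pairs used by at least a fraction `alpha` of them.

```python
def safe_pairs(a, b, match=1, mismatch=-1, gap=-1, delta=0, alpha=1.0):
    """Return 0-based (i, j) pairs aligned in >= alpha of the kept alignments.

    Toy scoring (assumption): match/mismatch scores and a linear gap penalty.
    """
    n, m = len(a), len(b)
    neg = float("-inf")

    def sub(i, j):
        # Score for aligning a[i-1] with b[j-1] (1-based DP indices).
        return match if a[i - 1] == b[j - 1] else mismatch

    # fwd[i][j]: best score of aligning the prefixes a[:i] and b[:j].
    fwd = [[neg] * (m + 1) for _ in range(n + 1)]
    fwd[0][0] = 0
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0:
                fwd[i][j] = max(fwd[i][j], fwd[i - 1][j] + gap)
            if j > 0:
                fwd[i][j] = max(fwd[i][j], fwd[i][j - 1] + gap)
            if i > 0 and j > 0:
                fwd[i][j] = max(fwd[i][j], fwd[i - 1][j - 1] + sub(i, j))

    # bwd[i][j]: best score of aligning the suffixes a[i:] and b[j:].
    bwd = [[neg] * (m + 1) for _ in range(n + 1)]
    bwd[n][m] = 0
    for i in range(n, -1, -1):
        for j in range(m, -1, -1):
            if i < n:
                bwd[i][j] = max(bwd[i][j], bwd[i + 1][j] + gap)
            if j < m:
                bwd[i][j] = max(bwd[i][j], bwd[i][j + 1] + gap)
            if i < n and j < m:
                bwd[i][j] = max(bwd[i][j], bwd[i + 1][j + 1] + sub(i + 1, j + 1))

    opt = fwd[n][m]

    def keep(i0, j0, i1, j1, w):
        # An edge is kept if some alignment through it scores >= opt - delta.
        return fwd[i0][j0] + w + bwd[i1][j1] >= opt - delta

    # cf[i][j]: number of kept-edge paths from (0, 0) to (i, j).
    cf = [[0] * (m + 1) for _ in range(n + 1)]
    cf[0][0] = 1
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and keep(i - 1, j, i, j, gap):
                cf[i][j] += cf[i - 1][j]
            if j > 0 and keep(i, j - 1, i, j, gap):
                cf[i][j] += cf[i][j - 1]
            if i > 0 and j > 0 and keep(i - 1, j - 1, i, j, sub(i, j)):
                cf[i][j] += cf[i - 1][j - 1]

    # cb[i][j]: number of kept-edge paths from (i, j) to (n, m).
    cb = [[0] * (m + 1) for _ in range(n + 1)]
    cb[n][m] = 1
    for i in range(n, -1, -1):
        for j in range(m, -1, -1):
            if i < n and keep(i, j, i + 1, j, gap):
                cb[i][j] += cb[i + 1][j]
            if j < m and keep(i, j, i, j + 1, gap):
                cb[i][j] += cb[i][j + 1]
            if i < n and j < m and keep(i, j, i + 1, j + 1, sub(i + 1, j + 1)):
                cb[i][j] += cb[i + 1][j + 1]

    total = cf[n][m]
    result = []
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if keep(i - 1, j - 1, i, j, sub(i, j)):
                # Paths using the diagonal edge that aligns a[i-1] with b[j-1].
                through = cf[i - 1][j - 1] * cb[i][j]
                if through >= alpha * total:
                    result.append((i - 1, j - 1))
    return result
```

With `alpha=1.0` and `delta=0`, a pair is reported only if every optimal alignment contains it. Raising `delta` or lowering `alpha` widens the set of alignments considered or relaxes how many of them must agree.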
Keywords
Sequence alignment, Dynamic programming, Needleman-Wunsch algorithm, Protein folding, Suboptimal alignments