CProtMEDIAS: clustering of amino acid sequences encoded by gene families by MErging and DIgitizing Aligned Sequences

Briefings in Bioinformatics (2022)

Abstract
Protein phylogenetic analysis examines the evolutionary relationships among related protein sequences and can help researchers infer protein functions and developmental trajectories. In the big data era, existing protein phylogenetic methods, including distance matrix and character-based methods, face challenges in both running time and scope of application. Here, we developed CProtMEDIAS, an R package for protein phylogenetic analysis. In contrast to existing phylogenetic analysis methods, CProtMEDIAS uses dimensionality reduction algorithms to digitize multiple sequence alignments and quickly conduct phylogenetic analysis on a large number of amino acid sequences from similarly distant protein families and species. We used CProtMEDIAS to perform dimensionality reduction, clustering, pseudotime, specific residue and evolutionary trajectory analyses of the plant homeobox superfamily. We found that CProtMEDIAS delivers consistent clustering, fast running times and elegant presentation, and thus provides powerful new tools and methods for protein clustering and evolutionary analysis.
Keywords
phylogenetic analysis, amino acid sequence, sequence digitization, dimensionality reduction, developmental trajectory inference
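
The general idea of digitizing an alignment and then reducing its dimensionality, as described in the abstract, can be illustrated with a minimal sketch. This is not the CProtMEDIAS implementation (which is an R package whose internals are not described here); it assumes one-hot encoding as the digitization step and plain PCA as the dimensionality reduction step, and the function names and toy sequences are hypothetical.

```python
import numpy as np

# Assumption: 20 standard residues plus a gap symbol; anything else is treated as a gap.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY-"


def one_hot_alignment(aligned_seqs):
    """Digitize equal-length aligned sequences into an (n_seqs, length * 21) 0/1 matrix."""
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("all aligned sequences must have the same length")
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    n_symbols = len(AMINO_ACIDS)
    X = np.zeros((len(aligned_seqs), length * n_symbols))
    for row, seq in enumerate(aligned_seqs):
        for col, aa in enumerate(seq.upper()):
            k = index.get(aa, index["-"])
            X[row, col * n_symbols + k] = 1.0
    return X


def pca(X, n_components=2):
    """Project the rows of X onto the top principal components using an SVD."""
    Xc = X - X.mean(axis=0)  # center each column
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * S[:n_components]


# Hypothetical toy alignment: three similar sequences and two from a distant group.
toy_alignment = [
    "MKV-LA",
    "MKV-LG",
    "MRVALA",
    "TQWPYC",
    "TQWPYD",
]

X = one_hot_alignment(toy_alignment)
coords = pca(X, n_components=2)

for seq, (pc1, pc2) in zip(toy_alignment, coords):
    print(f"{seq}  PC1={pc1: .3f}  PC2={pc2: .3f}")
```

In the low-dimensional coordinates, sequences that share most aligned residues land close together, so a standard clustering method can be applied to `coords` instead of to a full pairwise distance matrix.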