ortho_seqs: A Python tool for sequence analysis and higher order sequence-phenotype mapping

bioRxiv (2022)

Abstract
Motivation: An important goal in sequence analysis is to understand how parts of DNA, RNA, or protein sequences interact with each other and to predict how these interactions produce given phenotypes. Mapping phenotypes onto the underlying sequence space at first- and higher order levels, so that the impact of each nucleotide or residue along a sequence can be quantified independently, is critical to understanding sequence-phenotype relationships.

Results: We developed a Python software tool, ortho_seqs, that quantifies higher order sequence-phenotype interactions based on our previously published method of applying multivariate tensor-based orthogonal polynomials to biological sequences. Using this method, nucleotide or amino acid sequence information is converted to vectors, which are then used to build and compute the first- and higher order tensor-based orthogonal polynomials and bases. We derived a more complete version of the mathematical method that includes projections that not only quantify the effects of given nucleotides at a particular site, but also identify the effects of nucleotide substitutions. We show proof of concept of this method, provide a use case with synthetic antibody sequences, and demonstrate the application of ortho_seqs to other sequence-phenotype datasets.

Availability: The tool is a packaged command-line utility, installable via PyPI or through GitHub at https://github.com/snafees/ortho_seqs. It is accompanied by an easy-to-use graphical user interface (GUI) along with extensive documentation at https://ortho-seqs.readthedocs.io.

Competing Interest Statement: The authors have declared no competing interest.
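The Results paragraph describes converting sequence information to vectors before any polynomials are built. The following is a minimal sketch of that first step only, and it is not the ortho_seqs implementation or its API. It assumes a four-letter DNA alphabet, one-hot vectors per site, and a plain covariance between the phenotype and each mean-centered indicator as a simplified stand-in for a first-order effect. All function names (`one_hot`, `encode`, `site_means`, `first_order_effects`) are hypothetical, and the orthogonalization of higher order terms against lower order ones is not shown.

```python
# Illustrative sketch only: NOT the ortho_seqs code or its API.
# Assumptions: DNA alphabet "ACGT", equal-length sequences, one phenotype per sequence.
from typing import Dict, List

ALPHABET = "ACGT"  # assumed alphabet; a protein alphabet would work the same way


def one_hot(base: str) -> List[float]:
    """Return an indicator vector with a 1.0 at the position of `base`."""
    return [1.0 if base == letter else 0.0 for letter in ALPHABET]


def encode(seqs: List[str]) -> List[List[List[float]]]:
    """Encode each sequence as a list of per-site one-hot vectors."""
    return [[one_hot(base) for base in seq] for seq in seqs]


def site_means(encoded: List[List[List[float]]]) -> List[List[float]]:
    """Mean one-hot vector at each site, i.e. the letter frequencies per site."""
    n_seqs = len(encoded)
    n_sites = len(encoded[0])
    return [
        [sum(seq[site][k] for seq in encoded) / n_seqs for k in range(len(ALPHABET))]
        for site in range(n_sites)
    ]


def first_order_effects(seqs: List[str], phenotypes: List[float]) -> List[Dict[str, float]]:
    """Covariance of the phenotype with each mean-centered indicator, per site."""
    encoded = encode(seqs)
    means = site_means(encoded)
    n_seqs = len(seqs)
    mean_phenotype = sum(phenotypes) / n_seqs
    effects = []
    for site in range(len(means)):
        site_effect = {}
        for k, letter in enumerate(ALPHABET):
            # Center both the indicator and the phenotype, then average the product.
            site_effect[letter] = sum(
                (enc[site][k] - means[site][k]) * (p - mean_phenotype)
                for enc, p in zip(encoded, phenotypes)
            ) / n_seqs
        effects.append(site_effect)
    return effects


# Toy data: the phenotype depends only on whether site 0 is "A".
effects = first_order_effects(["AC", "AG", "TC", "TG"], [1.0, 1.0, 0.0, 0.0])
print(effects[0]["A"], effects[0]["T"], effects[1]["C"])  # 0.25 -0.25 0.0
```

In this toy data set the phenotype is determined entirely by the first site, so the centered covariance is nonzero there and zero at the second site. This is the kind of per-site, per-letter attribution that the first-order terms are meant to capture.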
Keywords
ortho_seqs, sequence analysis, higher order sequence, Python tool