Curation of the Deep Green list of unannotated green lineage proteins to enable structural and functional characterization

bioRxiv (2022)

Abstract
An explosion of sequenced genomes and predicted proteomes, enabled by low-cost deep sequencing, has revolutionized biology. Unfortunately, protein functional annotation is more complex and has not kept pace with the sequencing revolution. We identified unannotated proteins in three model organisms representing distinct parts of the green lineage (Viridiplantae): Arabidopsis thaliana (dicot), Setaria viridis (monocot), and Chlamydomonas reinhardtii (chlorophyte alga). Using similarity searching, we found the subset of unannotated proteins that were conserved between these species and defined them as Deep Green proteins. Informatic, genomic, and structural predictions were leveraged to begin inferring functional information about Deep Green genes and proteins. The Deep Green set was enriched for proteins with predicted chloroplast targeting signals, which are predictive of photosynthetic or plastid functions. Strikingly, structural predictions using AlphaFold and comparisons to known structures show that a significant proportion of Deep Green proteins may possess novel protein tertiary structures. The Deep Green genes and proteins provide a starting resource of high-value targets for further investigation of potentially new protein structures and functions that are conserved in the green lineage.

Competing Interest Statement: The authors have declared no competing interest.
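The conservation step described in the abstract (keeping only unannotated proteins with similar counterparts in the other species) can be illustrated with a minimal sketch. The abstract does not name the search tool, the significance cutoff, or the exact conservation rule, so everything here is an assumption: hits are taken to be (query, subject, e-value) tuples from some pairwise similarity search, the `max_evalue` threshold is arbitrary, a protein is kept only if it matches an unannotated protein in every other species, and the function name and protein IDs are hypothetical.

```python
from collections import defaultdict


def conserved_unannotated(unannotated, hits, max_evalue=1e-5):
    """Return, per species, the unannotated proteins with a hit in every other species.

    unannotated: dict mapping species name -> set of unannotated protein IDs
    hits: iterable of (query_id, subject_id, evalue) from a similarity search
    max_evalue: assumed significance cutoff (not taken from the paper)
    """
    # Map each protein ID back to its species for quick lookup.
    species_of = {pid: sp for sp, ids in unannotated.items() for pid in ids}

    # For each protein, record which *other* species it has a significant hit in.
    partners = defaultdict(set)
    for query, subject, evalue in hits:
        if evalue > max_evalue:
            continue
        q_sp = species_of.get(query)
        s_sp = species_of.get(subject)
        # Skip hits involving annotated/unknown proteins or within one species.
        if q_sp is None or s_sp is None or q_sp == s_sp:
            continue
        partners[query].add(s_sp)
        partners[subject].add(q_sp)

    n_other = len(unannotated) - 1
    return {
        sp: {pid for pid in ids if len(partners[pid]) == n_other}
        for sp, ids in unannotated.items()
    }


# Hypothetical IDs and e-values, for illustration only.
example_unannotated = {
    "arabidopsis": {"At1", "At2"},
    "setaria": {"Sv1"},
    "chlamydomonas": {"Cr1", "Cr2"},
}
example_hits = [
    ("At1", "Sv1", 1e-30),
    ("At1", "Cr1", 1e-20),
    ("Sv1", "Cr1", 1e-25),
    ("At2", "Sv1", 1e-3),   # too weak under the assumed cutoff
    ("Cr2", "At2", 1e-40),  # conserved in only two of the three species
]
deep_green_like = conserved_unannotated(example_unannotated, example_hits)
```

In this toy input, only `At1`, `Sv1`, and `Cr1` pass, because each has a significant match in both of the other species. A stricter rule, such as requiring reciprocal best hits, would be an equally plausible reading of "conserved between these species".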
Keywords
unannotated green lineage proteins, Deep Green list