Quality control and annotation of variant peptides identified through proteogenomics

bioRxiv (Cold Spring Harbor Laboratory) (2023)

Abstract

Variant peptides resulting from translation of single nucleotide polymorphisms (SNPs) can lead to aberrant or altered protein functions and thus hold translational potential for disease diagnosis, therapeutics and personalized medicine. Variant peptides detected by proteogenomics are fraught with a high number of false positives. Class-specific FDR along with ad-hoc post-search filters have been employed to tackle this issue, but there is no uniform and comprehensive approach to assess variant quality. Existing protocols are mostly manual and tedious, and are not accessible across labs. We present a software tool, PgxSAVy, for the quality control of variant peptides. PgxSAVy provides a rigorous framework for quality control and annotation of variant peptides on the basis of (i) variant quality, (ii) isobaric masses, and (iii) disease annotation. PgxSAVy was able to segregate true and false variants with 98.43% accuracy on simulated data. We then used ∼2.8 million spectra (PXD004010 and PXD001468) and identified 12,705 variant PSMs, of which PgxSAVy evaluated 3028 (23.8%), 1409 (11.1%) and 8268 (65.1%) as confident, semi-confident and doubtful, respectively. PgxSAVy also annotates the variants based on their pathogenicity and provides support for assisted manual validation. In these datasets, it identified previously found variants as well as some novel variants not seen in the original studies. The confident variants highlighted the importance of mutations in glycolysis and gluconeogenesis pathways in Alzheimer’s disease. The analysis of proteins carrying variants can provide fine granularity in discovering important pathways. PgxSAVy will advance personalized medicine by providing a comprehensive framework for quality control and prioritization of proteogenomics variants.
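The class breakdown reported above can be checked with a few lines of arithmetic. The sketch below only recomputes the percentages from the counts given in the abstract; the dictionary labels and the helper name `class_shares` are illustrative assumptions and are not part of PgxSAVy.

```python
# Counts of variant PSMs per quality class, as reported in the abstract.
# The labels and the helper below are illustrative, not PgxSAVy's API.
counts = {"confident": 3028, "semi-confident": 1409, "doubtful": 8268}


def class_shares(class_counts):
    """Return each class's share of the total, in percent, rounded to 0.1."""
    total = sum(class_counts.values())
    return {label: round(100 * n / total, 1) for label, n in class_counts.items()}


total = sum(counts.values())
shares = class_shares(counts)

print(f"total variant PSMs: {total}")
for label, pct in shares.items():
    print(f"{label}: {counts[label]} PSMs ({pct}%)")
```

The three classes sum to the 12,705 variant PSMs quoted in the text, and the rounded shares match the reported 23.8%, 11.1% and 65.1%.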
Availability

PgxSAVy is freely available at https://github.com/anuragraj/PgxSAVy

Key Points

- Variant peptides in proteogenomics have high rates of false positives.
- Class-specific FDR is not sufficiently effective, and tedious manual filtering is not scalable.
- We developed PgxSAVy for automated quality control and disease annotation of variant peptides from proteogenomics search results.
- PgxSAVy was validated using simulation data and manually annotated variant PSMs.
- Independent application to large datasets on Alzheimer’s disease and HEK cell lines demonstrated that PgxSAVy discovered known and novel mutations with important biological roles.
Keywords: variant peptides, annotation