Data mining antibody sequences for database searching in bottom-up proteomics

Xuan-Tung Trinh, Rebecca Freitag, Konrad Krawczyk, Veit Schwammle

bioRxiv (2024)

Abstract
Mass spectrometry (MS)-based proteomics can identify and quantify thousands of proteins, but measuring human antibodies remains challenging because of their vast diversity. The most widely used bottom-up proteomics approaches rely on database searches that compare experimental values of peptides and their fragments with theoretical values derived from protein sequences in a database. While the human body can produce millions of distinct antibodies, current databases of human antibodies such as UniProtKB/Swiss-Prot are limited to only 1095 sequences (as of January 2024). This limitation may hinder the identification of new antibodies by mass spectrometry, so extending the search database is an important step towards discovering them. Recent genomic studies have compiled millions of human antibody sequences, which are publicly accessible through the Observed Antibody Space (OAS) database. However, these data have yet to be exploited to confirm the presence of these antibodies. In this study, we used this extensive collection of antibody sequences to conduct efficient database searches in publicly available proteomics data, with a focus on SARS-CoV-2. Thirty million antibody heavy-chain sequences from 146 SARS-CoV-2 patients in the OAS database were digested in silico to obtain 18 million unique peptides. These peptides were then used to create new databases for bottom-up proteomics. We used those databases to search for new antibody peptides in publicly available SARS-CoV-2 human plasma samples in the Proteomics Identification Database (PRIDE). This approach avoids false positives in antibody peptide identification, as confirmed by searching against negative controls (brain samples) and by employing different database sizes. We show that the identified sequences provide valuable information for distinguishing diseased from healthy individuals, and we expect that the newly discovered antibody peptides can be further employed to develop therapeutic antibodies.
The method should be broadly applicable to finding characteristic antibodies for other diseases.

Competing Interest Statement: The authors have declared no competing interest.
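
The in silico digestion step mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes trypsin-like cleavage (after K or R, but not before P), and the function names, the missed-cleavage setting, and the peptide length limits are hypothetical choices made for illustration only; the abstract does not state which protease or parameters were used.

```python
def cleavage_sites(sequence):
    """Return cut positions for an assumed trypsin-like rule:
    cleave after K or R unless the next residue is P."""
    sites = [0]
    for i in range(len(sequence) - 1):
        if sequence[i] in "KR" and sequence[i + 1] != "P":
            sites.append(i + 1)
    sites.append(len(sequence))
    return sites


def digest(sequence, missed_cleavages=0, min_len=7, max_len=30):
    """Digest one protein sequence into a set of peptides.

    missed_cleavages, min_len and max_len are illustrative defaults,
    not values taken from the study.
    """
    sites = cleavage_sites(sequence)
    peptides = set()
    for start in range(len(sites) - 1):
        # Join up to `missed_cleavages` extra adjacent fragments.
        last = min(start + 2 + missed_cleavages, len(sites))
        for end in range(start + 1, last):
            peptide = sequence[sites[start]:sites[end]]
            if min_len <= len(peptide) <= max_len:
                peptides.add(peptide)
    return peptides


def unique_peptides(sequences, **digest_kwargs):
    """Pool the peptides of many sequences and keep each one only once."""
    pooled = set()
    for sequence in sequences:
        pooled |= digest(sequence, **digest_kwargs)
    return pooled


if __name__ == "__main__":
    # Toy sequences, not real antibody data.
    toy = ["ACDEKGHRPLMKVVV", "ACDEKWWW"]
    for peptide in sorted(unique_peptides(toy, min_len=1)):
        print(peptide)
```

Pooling into a set is what reduces a large collection of highly similar sequences to a much smaller list of unique peptides, since antibodies that share regions yield many identical peptides.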