Learned Fingerprint Embedding for Large-Scale Peptide Mass Spectra Retrieval.

Yongshuai Wang, Xiaojun Cai, Defeng Li, Shiwei Sun, Cheng Chen, Xuefeng Cui

2023 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) (2023)

Abstract
Tandem mass spectrometry (MS/MS) is a widely used technique for protein identification, analysis of post-translational modifications, immunotherapy, and other applications. As the volume of MS/MS spectral data grows, new computational methods are needed to search these databases efficiently. This study introduces MS2VEC, a novel fingerprint embedding model designed to facilitate large-scale retrieval of peptide mass spectra. MS2VEC captures the relationships between distant peaks and incorporates position-aware fingerprint features from all peaks. To this end, dilated convolutions capture the long-range relationships, and a novel position-aware multi-head attention pooling mechanism abstracts the fingerprint features. The results demonstrate that MS2VEC achieves a top-1 retrieval accuracy of 0.810, outperforming existing methods by 5.1%. Interestingly, the precursor charge is not essential for the retrieval task, as the spectrum itself contains enough information to accurately predict the charge. Additionally, the results suggest that weight-balanced fragment ions and water losses are important contributors to fingerprint features.
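To illustrate why dilated convolutions suit relationships between distant peaks, the following is a minimal NumPy sketch, not the MS2VEC implementation. The function name `dilated_conv1d`, the 16-bin toy spectrum, and the kernel weights are all hypothetical choices made for illustration; the abstract does not specify the binning, kernel sizes, or dilation rates the model uses.

```python
import numpy as np

def dilated_conv1d(signal, kernel, dilation):
    """Valid-mode 1D dilated convolution (cross-correlation form, as in deep learning)."""
    signal = np.asarray(signal, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    # Number of input bins covered by one output value.
    span = (len(kernel) - 1) * dilation + 1
    n_out = len(signal) - span + 1
    if n_out <= 0:
        raise ValueError("signal is shorter than the dilated kernel span")
    out = np.zeros(n_out)
    # Each kernel tap reads the input at an offset of k * dilation bins.
    for k, w in enumerate(kernel):
        out += w * signal[k * dilation : k * dilation + n_out]
    return out

# Hypothetical binned spectrum with two peaks that are 8 bins apart.
spectrum = np.zeros(16)
spectrum[2] = 1.0
spectrum[10] = 0.5

# A 2-tap kernel with dilation 8 combines both peaks in a single output value.
paired = dilated_conv1d(spectrum, [1.0, 1.0], dilation=8)

# The same kernel without dilation only ever sees adjacent bins.
dense = dilated_conv1d(spectrum, [1.0, 1.0], dilation=1)
```

With dilation 8, the two-tap kernel responds jointly to both peaks at output index 2, whereas the undilated kernel never sees them together. Stacking layers with increasing dilation widens the receptive field in the same way without adding kernel weights.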
Keywords
tandem mass spectrometry, peptide mass spectra, deep learning, embedding, information retrieval