MAD HATTER correctly annotates 98% of small molecule tandem mass spectra searching in PubChem

bioRxiv (2022)

Abstract
Metabolites provide a direct functional signature of cellular state. Untargeted metabolomics usually relies on mass spectrometry, a technology capable of detecting thousands of compounds in a biological sample. Metabolite annotation is performed using tandem mass spectrometry. Spectral library search is far from comprehensive, and numerous compounds remain unannotated. So-called in silico methods allow us to overcome the restrictions of spectral libraries by searching in much larger molecular structure databases. Yet, after more than a decade of method development, in silico methods still do not reach the correct annotation rates that users would wish for. Here, we present a novel computational method called MAD HATTER for this task. MAD HATTER combines CSI:FingerID results with information from the searched structure database via a metascore. Compound information includes the melting point and the number of words in the compound description starting with the letter 'u'. We then show that MAD HATTER reaches a stunning 97.6% correct annotations when searching PubChem, one of the largest and most comprehensive molecular structure databases. Finally, we explain what evaluation glitches were necessary for MAD HATTER to reach this annotation level, what is wrong with similar metascores in general, and why metascores may screw up not only method evaluations but also the analysis of biological experiments.

Competing Interest Statement
MAH, ML and SB are founders of Bright Giant GmbH.
Keywords
database search, in silico methods, metabolite annotation, metascores, molecular structure, parody paper, tandem mass spectrometry