Meta-analysis of metagenomes via machine learning and assembly graphs reveals strain switches in Crohn’s disease

biorxiv(2022)

引用 2|浏览24
暂无评分
摘要
Microbial strains have closely related genomes but may have different phenotypes in the same environment. Shotgun metagenomic sequencing can capture the genomes of all strains present in a community but strain-resolved analysis from shotgun sequencing alone remains difficult. We developed an approach to identify and interrogate strain-level differences in groups of metagenomes. We use this approach to perform a meta-analysis of stool microbiomes from individuals with and without inflammatory bowel disease (IBD; Crohn’s disease, ulcerative colitis; n = 605), a disease for which there are not specific microbial biomarkers but some evidence that microbial strain variation may stratify by disease state. We first developed a machine learning classifier based on compressed representations of complete metagenomes (FracMinHash sketches) and identified genomes that correlate with IBD subtype. To rescue variation that may not have been present in the genomes, we then used assembly graph genome queries to recover strain variation for correlated genomes. Lastly, we developed a novel differential abundance framework that works directly on the assembly graph to uncover all sequence variants correlated with IBD. We refer to this approach as dominating set differential abundance analysis and have implemented it in the [spacegraphcats software package][1]. Using this approach, we identified five bacterial strains that are associated with Crohn’s disease. Our method captures variation within the entire sequencing data set, allowing for discovery of previously hidden disease associations. ### Competing Interest Statement The authors have declared no competing interest. [1]: https://github.com/spacegraphcats/spacegraphcats
更多
查看译文
关键词
metagenomes,crohns disease,assembly graphs,strain,meta-analysis
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要