Epidemiological associations with genomic variation in SARS-CoV-2

Research Square (2021)

Abstract

The 2019 novel coronavirus (SARS-CoV-2) is the etiological agent of the COVID-19 pandemic and evolves to evade both host immune systems and intervention strategies. To help diminish the short-term and long-term impacts of coronavirus (CoV), we investigated CoV differences at the nucleotide and protein levels and asked how CoV genomic variation is associated with epidemiological variation and geography. For this analysis, we divided the CoV genome into 29 constituent regions. Our results highlight the variation among CoV lineage variants and show that nonstructural protein 3 (nsp3) and Spike protein (S) have the highest variation and the greatest correlation with whole-genome variation, which makes these two proteins potential targets for treatments. S protein variation is highly correlated with variation in nsp3, nsp6, and the 3'-to-5' exonuclease. Country of origin and time since the start of the pandemic were the metadata that most strongly influenced these differences, while host sex and age explained the least of the viral genome variation. We also quantified the variation explained by regions of the CoV genome across different coronaviruses, including SARS-CoV-2, Middle East respiratory syndrome coronavirus (MERS-CoV), other severe acute respiratory syndrome coronaviruses (SARS-CoV, SARS-related), and bat-derived severe acute respiratory syndrome (SARS)-like coronaviruses (Bat-SL-CoV). We found that Spike protein and nsp3 explain most of the variation among these viruses; they are also among the genomic regions with the highest number of sites under natural selection. Our results provide a direction to prioritize genes associated with outcome predictors, including health, therapeutic, and vaccine outcomes, and to inform improved DNA tests for predicting disease status.
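The idea of correlating per-region variation with whole-genome variation can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not state the distance measure or the statistic used, so the Hamming-fraction distance, the Pearson correlation, the toy sequences, and the region names and coordinates (`region_A`, `region_B`) below are all assumptions chosen for illustration only.

```python
from itertools import combinations
from math import sqrt


def hamming_fraction(a: str, b: str) -> float:
    """Fraction of aligned positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def pairwise_distances(seqs, start, end):
    """Distances for every pair of sequences over alignment columns [start, end)."""
    return [
        hamming_fraction(s[start:end], t[start:end])
        for s, t in combinations(seqs, 2)
    ]


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (NaN if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / sqrt(var_x * var_y)


def region_genome_correlation(seqs, regions):
    """Correlate each region's pairwise distances with whole-alignment distances."""
    whole = pairwise_distances(seqs, 0, len(seqs[0]))
    return {
        name: pearson(pairwise_distances(seqs, start, end), whole)
        for name, (start, end) in regions.items()
    }


# Toy aligned sequences (hypothetical, not real viral data).
toy_seqs = [
    "AAAAAAGGGGGG",
    "AAAAATGGGGGG",
    "AATAATGGGGGG",
    "TATAATGGGGGA",
]

# Hypothetical region coordinates; the paper uses 29 regions of the CoV genome.
toy_regions = {"region_A": (0, 6), "region_B": (6, 12)}

result = region_genome_correlation(toy_seqs, toy_regions)
for name, r in result.items():
    print(f"{name}: r = {r:.3f}")
```

In this toy alignment the more variable region tracks whole-alignment distances more closely, mirroring the kind of statement the abstract makes about nsp3 and S. Pairwise distances are not independent observations, so a plain correlation like this gives no valid significance level; a permutation-based approach such as a Mantel test would be needed for that.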
Keywords
genomic variation, epidemiological associations, SARS-CoV