Geographical Distribution of Amino Acid Mutations in Human SARS-CoV-2 Orf1ab Poly-Proteins Compared to the Equivalent Reference Proteins from China

crossref(2021)

Abstract
Amino acid mutations among 28,345 poly-protein sequences corresponding to the human SARS-CoV-2 orf1AB gene, representing six geographical locations (Africa, Asia, Europe, North America, Oceania and South America), were identified by comparison with the equivalent reference poly-protein sequences derived from the first human SARS-CoV-2 genome sequence, Wuhan-Hu-1, reported from China. The mutations were analysed in three datasets: i) 27,956 poly-proteins comprising 7,096 amino acid residues, ii) 373 poly-proteins comprising 7,051-7,095 amino acid residues and iii) 16 poly-proteins comprising 7,097-7,099 amino acid residues. In all, 3,204 distinct mutation sites were observed among the poly-proteins comprising 7,096 amino acid residues, meaning that ~45% of the positions in the orf1AB poly-protein sequence have undergone mutation since the outbreak of the COVID-19 pandemic in December 2019. Fifteen proteins of the poly-protein sequence were associated with mutations, and the mutation propensities of the “leader protein”, nsp2, nsp3, nsp6, nsp7, nsp8 and endoRNAse proteins were higher (> 1) than those of the nsp4, nsp9, nsp10, 3C-like proteinase, RdRp, helicase, 3’-to-5’ exonuclease and 2’-O-ribose methyltransferase proteins. Relatively higher mutation percentages were observed for the RdRp (35.32%), nsp2 (26.42%), nsp3 (11.73%) and helicase (7.88%) proteins, whereas the mutation percentages for the remaining proteins ranged from 0.16% for nsp10 to 4.11% for the 3’-to-5’ exonuclease. Five mutations (T265I in nsp2, T1246I in nsp3, G3278S in 3C-like proteinase, L3606F in nsp6 and P4715L in RdRp) were common across all six geographical locations. The P4715L RdRp mutation was predominant in all geographical locations except Africa, where the G5215S mutation was predominant. The maximum number of distinct mutation sites was observed for the nsp3 protein.
In the 373 orf1AB poly-protein sequences comprising 7,051-7,095 amino acid residues, deletion mutations were observed in the “leader protein” at positions 82-86 (GHVMV) and 141-143 (KSF). Among the 16 orf1AB poly-proteins comprising 7,097-7,099 amino acid residues, insertion mutations were observed in the nsp2 (517K), nsp3 (938E, 1901Y), 2’-O-ribose methyltransferase (7046F) and nsp6 (3610F, 3611L) proteins. In this work, all mutations observed among the 28,345 orf1AB poly-proteins of human SARS-CoV-2 relative to the reference sequences are presented.
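
As an illustration only, the comparison described for the equal-length dataset (poly-proteins of 7,096 residues) could be sketched as below. This is a minimal sketch and not the authors' pipeline: the function names `substitutions` and `distinct_site_fraction` and the toy sequences are hypothetical, and the sketch assumes the query and reference have the same length so that no alignment is needed. Sequences with deletions or insertions would first require a proper alignment. Mutations are labelled in the same form as in the abstract (reference residue, 1-based position, observed residue, e.g. T265I).

```python
def substitutions(reference: str, query: str) -> list[str]:
    """List substitutions of an equal-length query relative to the reference.

    Assumption: both sequences have the same length, so residues can be
    compared position by position without alignment.
    """
    if len(reference) != len(query):
        raise ValueError("sequences differ in length; align them first")
    return [
        f"{ref_aa}{pos}{query_aa}"
        for pos, (ref_aa, query_aa) in enumerate(zip(reference, query), start=1)
        if ref_aa != query_aa
    ]


def distinct_site_fraction(reference: str, queries: list[str]) -> float:
    """Fraction of reference positions mutated in at least one query."""
    sites = set()
    for query in queries:
        for label in substitutions(reference, query):
            # Strip the leading and trailing residue letters to get the position.
            sites.add(int(label[1:-1]))
    return len(sites) / len(reference)


# Toy sequences, invented for illustration (not real orf1ab data).
toy_reference = "MESLVPGFNE"
toy_queries = ["MESLIPGFNE", "MESLVPGFNK", "MESLIPGFNE"]

for toy_query in toy_queries:
    print(substitutions(toy_reference, toy_query))

print(distinct_site_fraction(toy_reference, toy_queries))
```

The same ratio applied to the figures in the abstract, 3,204 distinct sites over 7,096 residues, gives the reported ~45%.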