Deep Knowledge Mining of Complete HIV Genome Sequences in Selected African Cohorts

Deep Learning in Biomedical and Health Informatics (2021)

Abstract
Intense research effort surrounds the discovery of genetic variation in HIV-1 and its effect on clinical management, which creates a need for real-time global disease surveillance. In this chapter, we offer a cooperative framework that enables the assembly of raw HIV genome sequences for efficient feature extraction and integration. We built the framework by transforming publicly available HIV-1 genome sequences from the NCBI database into numerical equivalents that enable feature and pattern mining and cognitive knowledge extraction. Thirty (30) genome sequences were collected for each of 10 African countries, making a total of 300 sequences with length ranging from 2,557,800 to 2,956,200 bp. The transformed genome sequences were then learned using a self-organizing map (SOM) to discover inherent sub-strain patterns. Using a natural language processing (NLP) technique, a cognitive knowledge algorithm was applied to decouple the SOM, resulting in a cognitive feature map of sub-strain clusters. These clusters were used to construct the target outputs for supervised learning. Finally, deep learning on the enriched genome dataset was performed using three activation functions: Gaussian, sigmoid, and rectified linear unit (ReLU). The SOM and cognitive mining results showed the existence of HIV sub-strains that differ from the reference genomes in the selected African countries, and point to the prospect of improved genome surveillance for efficient contact tracing of infectious diseases. The deep neural network (DNN) classification results showed the superiority of the ReLU activation function in learning the HIV genome features. The present discovery therefore motivates further investigation into inter- and intra-country transmission of HIV-1 sub-strains.
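The pipeline summarized in the abstract (numeric encoding of sequences, SOM clustering, and a choice of activation functions) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not state the chapter's encoding scheme, SOM configuration, or network architecture, so the integer nucleotide code, the grid size, the decay schedules, and the helper names `encode_sequence`, `train_som`, and `best_matching_unit` are hypothetical.

```python
import numpy as np

# Hypothetical integer code per nucleotide; the chapter's real encoding is not given.
NUCLEOTIDE_CODE = {"A": 1, "C": 2, "G": 3, "T": 4}


def encode_sequence(seq, length):
    """Map a nucleotide string to a fixed-length numeric vector scaled to [0, 1]."""
    codes = [NUCLEOTIDE_CODE.get(base, 0) for base in seq.upper()[:length]]
    codes += [0] * (length - len(codes))  # pad short sequences with 0 (unknown)
    return np.array(codes, dtype=float) / 4.0


def best_matching_unit(weights, x):
    """Return the (row, col) of the SOM node whose weight vector is closest to x."""
    dist = np.linalg.norm(weights - x, axis=-1)
    return np.unravel_index(np.argmin(dist), dist.shape)


def train_som(data, rows=3, cols=3, epochs=50, lr0=0.5, sigma0=1.0, seed=0):
    """Train a small rectangular SOM; returns weights of shape (rows, cols, n_features)."""
    rng = np.random.default_rng(seed)
    weights = rng.random((rows, cols, data.shape[1]))
    # Grid coordinates of every node, used for neighbourhood distances.
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)               # linearly decaying learning rate
        sigma = sigma0 * (1.0 - frac) + 1e-3  # shrinking neighbourhood radius
        for x in rng.permutation(data):       # visit samples in random order
            bmu = best_matching_unit(weights, x)
            dist2 = np.sum((grid - np.array(bmu)) ** 2, axis=-1)
            influence = np.exp(-dist2 / (2.0 * sigma ** 2))
            weights += lr * influence[..., None] * (x - weights)
    return weights


# The three activation functions named in the abstract (standard definitions).
def gaussian(x):
    return np.exp(-np.square(x))


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def relu(x):
    return np.maximum(0.0, x)
```

In this sketch, each encoded sequence would be assigned to the node returned by `best_matching_unit`, and those node assignments could then serve as class labels for a supervised classifier, which mirrors the abstract's description of using clusters to construct target outputs.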
Keywords
complete HIV genome sequences, selected African cohorts