PhageAI - Bacteriophage Life Cycle Recognition with Machine Learning and Natural Language Processing

bioRxiv (2020)

Abstract

Background: As antibiotic resistance becomes a major problem in the treatment of infections, bacteriophages (also known as phages) are emerging as an alternative. To be used in therapy, however, a phage must have a strictly lytic life cycle. With the growing popularity of Next Generation Sequencing (NGS) technology, this information can be obtained from the genome sequence. A number of tools are available to help determine the phage life cycle, but there is still no consensus approach to the problem, especially in the absence of well-defined open reading frames. A new tool is needed to overcome this limitation.

Results: We developed a novel tool, called PhageAI, that provides access to more than 10,000 publicly available bacteriophages and differentiates between their two major life cycle types: lytic and lysogenic. The tool includes a life cycle classifier that achieved 98.90% accuracy on a validation set and 97.18% average accuracy on a test set. We adopted nucleotide sequence embedding based on Word2Vec with the Skip-gram model, and a linear Support Vector Machine with 10-fold cross-validation for supervised classification. PhageAI is free of charge and available at https://phage.ai/ as a REST web service and as a Python package.

Conclusions: Machine learning and Natural Language Processing make it possible to extract information from bacteriophage nucleotide sequences for life cycle prediction tasks. The PhageAI tool classifies phages as either virulent or temperate with higher accuracy than any existing method, and provides an interactive 3D visualization to help interpret the model's classification results.
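The embedding step mentioned in the abstract treats a nucleotide sequence as text. The following is a minimal sketch of how that framing might look, not the authors' implementation: it assumes the genome is split into overlapping k-mers that act as "words", and it generates the (center, context) pairs that a Skip-gram model is trained on. The function names `kmer_tokens` and `skipgram_pairs`, the k-mer length, the stride, and the window size are all hypothetical choices made for illustration; the abstract does not specify them.

```python
from typing import Iterator, List, Tuple


def kmer_tokens(sequence: str, k: int = 6, stride: int = 1) -> List[str]:
    """Split a nucleotide sequence into overlapping k-mer 'words'.

    Assumption: k-mer tokenization is an illustrative choice; the abstract
    does not state how sequences are tokenized.
    """
    seq = sequence.upper()
    # A sequence shorter than k yields an empty token list.
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


def skipgram_pairs(tokens: List[str], window: int = 2) -> Iterator[Tuple[str, str]]:
    """Yield (center, context) pairs, the training examples of a Skip-gram model.

    Each token is paired with every neighbor at most `window` positions away.
    """
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                yield center, tokens[j]


# Toy example on a short, made-up sequence.
tokens = kmer_tokens("ATGCGTAC", k=3)
pairs = list(skipgram_pairs(tokens, window=1))
```

In a full pipeline, pairs like these would train the Word2Vec embedding, the k-mer vectors of each genome would be aggregated into a fixed-length feature vector, and a linear Support Vector Machine would be fitted on those vectors with 10-fold cross-validation. Those later steps are omitted here because they depend on external libraries and on details the abstract does not give.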
Keywords
PhageAI, bacteriophages, lifecycle, virulent, temperate, Machine Learning, Natural Language Processing, classifier, DNA embedding