AMP-EF: An Ensemble Framework of Extreme Gradient Boosting and Bidirectional Long Short-Term Memory Network for Identifying Antimicrobial Peptides

Match(2024)

Abstract
In recent years, bacterial resistance has become a serious problem owing to the overuse of antibiotics. Antimicrobial peptides (AMPs) have emerged as the best alternative to antibiotics because they rapidly target bacteria, fungi, viruses, and cancer cells and counteract the toxins these produce. In this study, a two-branch ensemble framework is proposed to identify AMPs. It integrates extreme gradient boosting (XGBoost) and a bidirectional long short-term memory network (Bi-LSTM) with an attention mechanism to form a stronger model. First, one-hot encoding and k-mer features are used to represent each sequence. Then, the feature vectors are fed into the two base classifiers separately to obtain two predicted values. Finally, the prediction result is obtained as a compromise between the two values. As a classical machine learning method, XGBoost is highly stable and adapts to datasets of different sizes. Bi-LSTM processes each peptide in both directions, from the N-terminus to the C-terminus and from the C-terminus to the N-terminus. Because this supplies context information from both sides of each residue, the model can make more accurate predictions. Our method achieves higher or highly comparable results across eight independent test datasets. The accuracy (ACC) values on XUAMP, YADAMP, DRAMP, CAMP, LAMP, APD3, dbAMP, and DBAASP are 77.9%, 98.5%, 72.5%, 99.8%, 83.0%, 92.4%, 87.5%, and 84.6%, respectively. This shows that the two-branch ensemble structure is feasible and generalizes well. The code and datasets are accessible at https://github.com/z11code/AMP-EF.
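The three steps named in the abstract (sequence encoding, two branch predictions, and a final compromise) can be illustrated with a minimal Python sketch. It is not taken from the AMP-EF repository. The function names `one_hot`, `kmer_frequencies`, and `combine`, the right-side zero padding, the choice of k = 2, and the reading of "compromise" as a simple average of the two branch probabilities with a 0.5 threshold are all assumptions made for illustration.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot(seq, max_len):
    """Encode a peptide as a max_len x 20 binary matrix.

    Assumption: sequences are zero-padded on the right and truncated
    at max_len; unknown residues become all-zero rows.
    """
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    rows = []
    for pos in range(max_len):
        row = [0] * len(AMINO_ACIDS)
        if pos < len(seq) and seq[pos] in index:
            row[index[seq[pos]]] = 1
        rows.append(row)
    return rows


def kmer_frequencies(seq, k=2):
    """Return the normalized count of every possible k-mer.

    The output has 20**k entries, ordered as itertools.product
    enumerates them over AMINO_ACIDS.
    """
    vocab = ["".join(p) for p in product(AMINO_ACIDS, repeat=k)]
    counts = dict.fromkeys(vocab, 0)
    windows = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    for window in windows:
        if window in counts:
            counts[window] += 1
    total = max(len(windows), 1)  # avoid division by zero for short peptides
    return [counts[v] / total for v in vocab]


def combine(p_xgb, p_bilstm, threshold=0.5):
    """Hypothetical 'compromise': average the two branch probabilities.

    Returns the averaged probability and the resulting AMP / non-AMP call.
    """
    p = (p_xgb + p_bilstm) / 2
    return p, p >= threshold
```

In such a setup, the k-mer frequency vector would typically feed a tree model such as XGBoost, and the one-hot matrix would feed the recurrent branch; which feature goes to which branch in AMP-EF is not stated in the abstract.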
Keywords
antimicrobial peptides, ensemble framework, extreme gradient boosting, long short-term memory