SPPPred: Sequence-based Protein-Peptide binding residue Prediction using genetic programming and ensemble learning

IEEE/ACM Transactions on Computational Biology and Bioinformatics(2022)

引用 1|浏览2
暂无评分
摘要
Peptide-binding proteins play significant roles in various applications such as gene expression, metabolism, signal transmission, DNA (Deoxyribose Nucleic Acid) repair, and replication. Investigating the binding residues in protein-peptide complexes, especially from their sequence only, is challenging experimentally and computationally. Although several computational approaches have been introduced to determine and predict these binding residues, there is still ample room to improve the prediction performance. In this work, we introduce a novel ensemble machine learning-based approach called SPPPred (Sequence-based Protein-Peptide binding residue Prediction) to predict protein-peptide binding residues. First, we extract relevant sequential information and employ genetic programming algorithm for feature construction to find more distinctive features. We then, in the next step, build an ensemble-based machine learning classifier to predict binding residues. The proposed method shows consistent and comparable performance on both ten-fold cross-validation and independent test set. Furthermore, SPPPred yields F-Measure (F-M), Accuracy(ACC), and Matthews' Correlation Coefficient (MCC) of 0.310, 0.949, and 0.230 on the independent test set, respectively, which outperforms other competing methods by approximately up to 9% on the independent test set. SPPPred is publicly available.https://github.com/GTaherzadeh/SPPPred.git.
更多
查看译文
关键词
genetic programming,ensemble learning,prediction,sequence-based,protein-peptide
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要