O-glycosylation site prediction for Homo sapiens by combining properties and sequence features with support vector machine.

JOURNAL OF BIOINFORMATICS AND COMPUTATIONAL BIOLOGY(2021)

引用 3|浏览0
暂无评分
摘要
O-glycosylation is a protein posttranslational modification important in regulating almost all cells. It is related to a large number of physiological and pathological phenomena. Recognizing O-glycosylation sites is the key to further investigating the molecular mechanism of protein posttranslational modification. This study aimed to collect a reliable dataset on Homo sapiens and develop an O-glycosylation predictor for Homo sapiens, named Captor, through multiple features. A random undersampling method and a synthetic minority oversampling technique were employed to deal with imbalanced data. In addition, the Kruskal-Wallis (K-W) test was adopted to optimize feature vectors and improve the performance of the model. A support vector machine, due to its optimal performance, was used to train and optimize the final prediction model after a comprehensive comparison of various classifiers in traditional machine learning methods and deep learning. On the independent test set, Captor outperformed the existing O-glycosylation tool, suggesting that Captor could provide more instructive guidance for further experimental research on O-glycosylation. The source code and datasets are available at https://github.com/YanZhu06/Captor/.
更多
查看译文
关键词
O-glycosylation,support vector machine,K-W test,PTM,bioinformatics
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要