The accurate prediction of protein family from amino acid sequence by measuring features of sequence fragments.

Journal of Computational Biology (2010)

Abstract
Rapid advances in proteomic analyses, coupled with the completion of multiple genomes, have increased the demand for determining protein functions. The first step is classifying proteins into families. A method was developed for predicting protein family from protein sequence alone using support vector machine (SVM) models. In these models, the amino acids were classified into three categories (apolar, polar, and charged). Fragments of one to five consecutive residues were annotated by amino acid category to define the features of each protein. SVM models were constructed from the features of a training set of proteins and then evaluated on an independent set of proteins. The approach was tested on 20 protein families from the iProClass database of the Protein Information Resource (PIR). Two-class SVM models achieved an average prediction accuracy of 0.9985, and multi-class SVM models achieved an accuracy of 0.9941. This study demonstrates that SVM-based methods can accurately recognize and predict the protein family to which a sequence belongs based solely on its primary amino acid sequence.
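To make the feature construction described in the abstract concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a particular assignment of the 20 amino acids to the three categories (the abstract does not list one), and it assumes that each feature is the normalized frequency of a category fragment of length one to five. The names `RESIDUE_CLASS`, `encode`, and `fragment_features` are hypothetical.

```python
from collections import Counter
from itertools import product

# ASSUMPTION: this grouping is illustrative; the abstract does not specify
# which residues are treated as apolar (A), polar (P), or charged (C).
RESIDUE_CLASS = {}
for residues, label in (("AVLIMFWPG", "A"), ("STNQCYH", "P"), ("DEKR", "C")):
    for residue in residues:
        RESIDUE_CLASS[residue] = label


def encode(sequence):
    """Map an amino acid sequence to a string over the alphabet {A, P, C}.

    Residues outside the 20 standard amino acids are skipped.
    """
    return "".join(
        RESIDUE_CLASS[r] for r in sequence.upper() if r in RESIDUE_CLASS
    )


def fragment_features(sequence, max_len=5):
    """Return the frequency of every category fragment of length 1..max_len.

    ASSUMPTION: features are normalized frequencies per fragment length.
    With three categories and max_len=5, the vector has
    3 + 9 + 27 + 81 + 243 = 363 entries.
    """
    encoded = encode(sequence)
    features = []
    for k in range(1, max_len + 1):
        # All overlapping windows of k consecutive residues.
        windows = [encoded[i:i + k] for i in range(len(encoded) - k + 1)]
        counts = Counter(windows)
        total = len(windows)
        # Fixed fragment order so that vectors are comparable across proteins.
        for fragment in product("APC", repeat=k):
            key = "".join(fragment)
            features.append(counts[key] / total if total else 0.0)
    return features


if __name__ == "__main__":
    vector = fragment_features("MKTAYIAKQR")
    print(len(vector))
```

Vectors of this form, computed for a labeled training set, could then be passed to any SVM implementation (for example, `sklearn.svm.SVC` in scikit-learn) to train two-class or multi-class models; the kernel and parameters used in the study are not given in the abstract.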
Keywords
prediction,protein family,protein feature,recognition,sequence