Improving conditional random field model for prediction of protein-RNA residue-base contacts

Quantitative Biology(2018)

引用 2|浏览17
暂无评分
摘要
Background For understanding biological cellular systems, it is important to analyze interactions between protein residues and RNA bases. A method based on conditional random fields (CRFs) was developed for predicting contacts between residues and bases, which receives multiple sequence alignments for given protein and RNA sequences, respectively, and learns the model with many parameters involved in relationships between neighboring residue-base pairs by maximizing the pseudo likelihood function. Methods In this paper, we proposed a novel CRF-based model with more complicated dependency relationships between random variables than the previous model, but which takes less parameters for the sake of avoidance of overfitting to training data. Results We performed cross-validation experiments for evaluating the proposed model, and took the average of AUC (area under receiver operating characteristic curve) scores. The result suggests that the proposed CRF-based model without using L 1 -norm regularization (lasso) outperforms the existing model with and without the lasso under several input observations to CRFs. Conclusions We proposed a novel stochastic model for predicting protein-RNA residue-base contacts, and improved the prediction accuracy in terms of the AUC score. It implies that more dependency relationships in a CRF could be controlled by less parameters.
更多
查看译文
关键词
protein-RNA interaction,residue-base contact,conditional random field
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要