A multi-task positive-unlabeled learning framework to predict secreted proteins in human body fluids

Complex & Intelligent Systems (2024)

Abstract
Body fluid biomarkers are valuable because they can be detected in a non-invasive or minimally invasive way. Discovering secreted proteins in human body fluids is an essential step toward identifying proteomic biomarkers for human diseases. Many computational methods have recently been proposed to predict secreted proteins, with some success. However, most of them rely on a manually constructed negative dataset, which is usually biased and therefore limits prediction performance. In this paper, we first propose a novel positive-unlabeled learning framework to predict secreted proteins in a single body fluid. Secreted protein discovery in a single body fluid is transformed into multiple binary classification tasks and solved via multi-task learning, and an effective convolutional neural network is employed to reduce overfitting. We then improve this framework to predict secreted proteins in multiple body fluids simultaneously. The improved framework adopts a globally shared network to further improve prediction performance for all body fluids. It was trained and evaluated on datasets of 17 body fluids; averaged over these 17 body fluids, it achieved an accuracy of 89.48%, an F1 score of 56.17%, and a PRAUC of 58.93%. The comparative results demonstrate that the improved framework performs much better than other state-of-the-art methods in secreted protein discovery.
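
The abstract does not say how the single-fluid problem is split into multiple binary classifications. One common positive-unlabeled strategy, shown here purely as an assumption and not as the authors' procedure, pairs the known positives with several random subsets of the unlabeled proteins, treating each subset as provisional negatives for one task. The function name `build_pu_tasks` and the protein identifiers are hypothetical placeholders.

```python
import random


def build_pu_tasks(positives, unlabeled, n_tasks, seed=0):
    """Build n_tasks binary datasets from positive and unlabeled samples.

    Assumption (not from the paper): each task reuses all positives and
    draws an equally sized random subset of the unlabeled pool, which is
    labeled 0 as provisional negatives for that task only.
    """
    rng = random.Random(seed)
    k = min(len(positives), len(unlabeled))
    tasks = []
    for _ in range(n_tasks):
        provisional_negatives = rng.sample(unlabeled, k)
        samples = [(p, 1) for p in positives]
        samples += [(u, 0) for u in provisional_negatives]
        rng.shuffle(samples)  # avoid ordering positives before negatives
        tasks.append(samples)
    return tasks


# Placeholder identifiers, not real proteins.
positives = ["P1", "P2", "P3"]
unlabeled = ["U%d" % i for i in range(1, 10)]
tasks = build_pu_tasks(positives, unlabeled, n_tasks=4)
```

Each resulting task could then be given its own output head on top of a shared feature extractor, which is the usual shape of a multi-task setup; the network itself is left out of this sketch.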
Keywords
Secreted protein discovery, Semi-supervised learning, Convolutional neural network, Multi-task learning