BioPrediction-RPI: Democratizing the Prediction of Interaction Between Non-Coding RNA and Protein with End-to-End Machine Learning

Bruno Rafael Florentino, Robson Parmezan Bonidia, Natan Henrique Sanches, Ulisses N. da Rocha, André C.P.L.F. de Carvalho

Computational and Structural Biotechnology Journal (2024)

Abstract
Machine Learning (ML) algorithms have been important tools for extracting useful knowledge from biological sequences, particularly in healthcare, agriculture, and the environment. However, the categorical and unstructured nature of these sequences usually requires additional feature engineering steps before an ML algorithm can be applied efficiently. Adding these steps to the ML algorithm creates a processing pipeline, known as end-to-end ML. Despite the excellent results obtained by applying end-to-end ML to biotechnology problems, the performance obtained depends on the user's expertise in the components of the pipeline. In this work, we propose an end-to-end ML-based framework called BioPrediction-RPI, which can identify implicit interactions between sequences, such as pairs of non-coding RNA and proteins, without the need for specialized expertise in end-to-end ML. This framework applies feature engineering to represent each sequence by structural and topological features. These features are divided into feature groups and used to train partial models, whose partial decisions are combined into a final decision that is accompanied by an interpretability report providing insights to the user. In our experiments, the developed framework was competitive with various expert-created models. We assessed BioPrediction-RPI on 12 datasets, where it performed as well as or better than all compared tools in 40% to 100% of cases, depending on the experiment. Finally, BioPrediction-RPI can fine-tune models on new data and perform at the same level as ML experts, democratizing end-to-end ML and increasing its accessibility to those working in the biological sciences.
Keywords
End-to-end ML, democratizing ML, RNA-Protein interaction, interaction prediction
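
The pipeline outlined in the abstract (feature groups, one partial model per group, partial decisions combined into a final decision with a per-group breakdown) can be illustrated with a toy sketch. Nothing below is taken from BioPrediction-RPI itself: the k-mer frequency features stand in for the paper's structural and topological features, the nearest-centroid classifier stands in for its partial models, simple averaging stands in for its combination step, and every function name, class name, and sequence is hypothetical.

```python
import math
from collections import Counter
from itertools import product

RNA = "ACGU"
AA = "ACDEFGHIKLMNPQRSTVWY"


def kmer_freqs(seq, alphabet, k):
    """Normalized k-mer frequencies over a fixed alphabet (assumed feature type)."""
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = max(sum(counts[m] for m in kmers), 1)
    return [counts[m] / total for m in kmers]


def feature_groups(rna, protein):
    """Split the descriptors of one RNA-protein pair into named feature groups."""
    protein_part = kmer_freqs(protein, AA, 1)
    return {
        "composition": kmer_freqs(rna, RNA, 1) + protein_part,
        "dinucleotide": kmer_freqs(rna, RNA, 2) + protein_part,
    }


class CentroidModel:
    """Toy partial model: scores a pair by its distance to the class centroids."""

    def fit(self, rows, labels):
        def centroid(selected):
            return [sum(col) / len(selected) for col in zip(*selected)]

        self.pos = centroid([r for r, y in zip(rows, labels) if y == 1])
        self.neg = centroid([r for r, y in zip(rows, labels) if y == 0])
        return self

    def predict_proba(self, row):
        d_pos = math.dist(row, self.pos)
        d_neg = math.dist(row, self.neg)
        if d_pos + d_neg == 0:
            return 0.5
        # Closer to the positive centroid means a score nearer to 1.
        return d_neg / (d_pos + d_neg)


def train(pairs, labels):
    """Train one partial model per feature group."""
    grouped = [feature_groups(rna, protein) for rna, protein in pairs]
    return {
        name: CentroidModel().fit([g[name] for g in grouped], labels)
        for name in grouped[0]
    }


def predict(models, rna, protein):
    """Return the averaged final score and the per-group partial scores."""
    groups = feature_groups(rna, protein)
    partial = {name: m.predict_proba(groups[name]) for name, m in models.items()}
    final = sum(partial.values()) / len(partial)
    return final, partial


# Made-up training pairs: label 1 = interacting, label 0 = non-interacting.
pairs = [
    ("GGCCGGCC", "KRKRKK"),
    ("GCGCGGCG", "RRKKRK"),
    ("AAUUAUAU", "DEDEED"),
    ("UAUAAUUA", "EEDDED"),
]
labels = [1, 1, 0, 0]
models = train(pairs, labels)

final, partial = predict(models, "GGCGCCGG", "KKRRKR")
print("final score:", final)
print("per-group scores:", partial)
```

The per-group scores returned by `predict` play the role of a very small interpretability report: they show which feature group pushed the final score up or down. A real pipeline would also need feature selection, model tuning, and proper evaluation, none of which is sketched here.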