Human Multi-omics Data Pre-processing for Predictive Purposes Using Machine Learning: A Case Study in Childhood Obesity

Bioinformatics and Biomedical Engineering (2022)

Abstract
Machine learning applications that use omics data in the medical field are numerous and promising, notably for building long-term predictive models of highly prevalent diseases. Nevertheless, to take advantage of the strengths of omics data and machine learning tools, we must first pre-process the data adequately and take certain considerations into account when constructing the models. The present paper is an example of how to face the main challenges encountered when constructing machine learning predictive models with multi-omics human data. Topics covered include a description of the main particularities of each omics data layer, the most appropriate pre-processing approaches for each source, and a collection of good practices and tips for applying machine learning to this kind of data for predictive purposes. Using real data examples (blood samples), we illustrate how some of the key issues in this kind of research are addressed (technical noise, biological heterogeneity, class imbalance, high dimensionality, and the presence of missing values, among others). Additionally, we lay the groundwork for future work by incorporating some proposals to improve the models, justifying their need on the basis of the insights encountered.
Keywords
Multi-omics, Data pre-processing, Machine learning, eXplainable Artificial Intelligence