Just Add Data: Automated Predictive Modeling and BioSignature Discovery

bioRxiv (2020)

Abstract
Fully automated machine learning, statistical modeling, and artificial intelligence for predictive modeling are becoming a reality, giving rise to the field of Automated Machine Learning (AutoML). AutoML systems promise to democratize data analysis for non-experts, drastically increase productivity, improve replicability of the statistical analysis, facilitate the interpretation of results, and shield against common methodological pitfalls. We present the basic ideas and principles of Just Add Data Bio (JADBIO), an AutoML technology applicable to the low-sample, high-dimensional omics data that arise in translational medicine and bioinformatics applications. In addition to predictive and diagnostic models ready for clinical use, JADBIO also returns the corresponding biosignatures, i.e., minimal-size subsets of biomarkers that are jointly predictive of the outcome of interest. A use case on thymic epithelial tumors is presented, along with an extensive evaluation on 374 public biological datasets. Results show that the long-standing challenges of overfitting and performance overestimation, which arise when complex non-linear machine learning pipelines are applied to high-dimensional, low-sample data, can be overcome.
Keywords
biosignature discovery, predictive modeling, data