Extending Tree-Based Automated Machine Learning to Biomedical Image and Text Data Using Custom Feature Extractors

Proceedings of the 2023 Genetic and Evolutionary Computation Conference Companion (GECCO 2023 Companion), 2023

Abstract
Automated machine learning (AutoML) has enabled many innovations in biomedical data science; however, most AutoML approaches do not support image or text data. To address this, we implemented four feature extractors in the Tree-based Pipeline Optimization Tool (TPOT) to create TPOT with Feature Extraction (TPOT-FE), an AutoML system that uses genetic programming (GP) to search for well-performing pipelines for a classification or regression task. These feature extractors enable TPOT-FE to build pipelines that can analyze non-tabular data, including text and images, which are increasingly common biomedical big data modalities and can be rich in information. We evaluate this approach on six image datasets and four text datasets, including three biomedical datasets, and show that TPOT-FE consistently constructs and optimizes classification pipelines on all of the datasets.
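The abstract does not name the four feature extractors, so the following is only a minimal sketch of the general idea: a feature extractor turns each text document or image into a fixed-length row of numbers, which a tree-based pipeline can then treat as ordinary tabular input. The class names BagOfWordsExtractor and IntensityHistogramExtractor, their parameters, and the toy data are all hypothetical and are not taken from TPOT-FE; the fit/transform interface is assumed here because it is the common convention for pipeline steps in Python.

```python
from collections import Counter


class BagOfWordsExtractor:
    """Hypothetical text extractor: one count column per vocabulary term."""

    def fit(self, documents):
        # Sort the vocabulary so the column order is deterministic.
        tokens = {tok for doc in documents for tok in doc.lower().split()}
        self.vocabulary_ = sorted(tokens)
        return self

    def transform(self, documents):
        # One row per document; terms not seen during fit are ignored.
        rows = []
        for doc in documents:
            counts = Counter(doc.lower().split())
            rows.append([counts.get(term, 0) for term in self.vocabulary_])
        return rows


class IntensityHistogramExtractor:
    """Hypothetical image extractor: histogram of integer pixel intensities."""

    def __init__(self, n_bins=4, max_value=255):
        self.n_bins = n_bins
        self.max_value = max_value

    def fit(self, images):
        # Stateless: nothing is learned from the training images.
        return self

    def transform(self, images):
        rows = []
        for image in images:
            hist = [0] * self.n_bins
            for pixel_row in image:
                for pixel in pixel_row:
                    # Equal-width bins over 0..max_value, clamped to the last bin.
                    idx = min(pixel * self.n_bins // (self.max_value + 1),
                              self.n_bins - 1)
                    hist[idx] += 1
            rows.append(hist)
        return rows


# Toy inputs (invented for illustration only).
texts = ["tumor detected in scan", "no tumor in scan"]
text_features = BagOfWordsExtractor().fit(texts).transform(texts)

images = [[[0, 64], [128, 255]], [[0, 0], [255, 255]]]
image_features = IntensityHistogramExtractor(n_bins=4).fit(images).transform(images)
```

Either output is a plain list of equal-length numeric rows, so it could be passed to any tabular classifier or regressor. In a GP-based system, the choice of extractor and its settings (for example n_bins) would presumably be part of what the search optimizes, although the abstract does not describe how TPOT-FE handles this.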
Keywords
Feature Extraction, Genetic Programming, Automated Machine Learning, Python