Analysis of Meta-Learning Approaches for TCGA Pan-cancer Datasets

2020 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2020

Abstract
Cancer is a heterogeneous disease, and classifying cancer subtypes has become a necessity in cancer research because it facilitates the subsequent clinical management of patients and provides decision support for clinicians. With the advance of machine learning over the last decade, many researchers have employed it to tackle the cancer classification problem. However, traditional machine learning algorithms require a large amount of annotated data for model training, and collecting such data is time-consuming, expensive, and often unrealistic in practice. Meta-learning has been proposed to address this data scarcity. It uses prior knowledge learned from related tasks to generalize to new tasks with limited supervised experience, and it has been applied in many fields where annotated data are scarce, such as few-shot image classification and drug discovery. Data scarcity is common in cancer research and diagnosis studies, yet only a few previous studies classify cancers from limited annotated data. We therefore explore the model-agnostic meta-learning (MAML) algorithm for the scenario in which only limited annotated data are available. In this work, our objective is to comprehensively compare MAML with few-shot learning methods (matching network and prototypical network) and traditional machine learning methods (random forest and K-nearest neighbor). Experimental results on The Cancer Genome Atlas (TCGA) cancer patient data demonstrate the effectiveness and superiority of MAML over the other methods, including its ability to outperform them while using 4.5-fold fewer features.
Keywords
Meta-learning, cancer genomics, cancer proteomics, pan-cancer analysis