Does Dataset Lottery Ticket Hypothesis Exist?

ICLR 2023(2023)

引用 0|浏览81
暂无评分
摘要
Tuning hyperparameters and exploring the suitable training schemes for the self-supervised models are usually expensive and resource-consuming, especially on large-scale datasets like ImageNet-1K. Critically, this means only a few establishments (e.g., Google, Meta, etc.) have the ability to afford the heavy experiments on this task, which seriously hinders more engagement and better development of this area. An ideal situation is that there exists a subset from the full large-scale dataset, the subset can correctly reflect the performance distinction when performing different training frameworks, hyper-parameters, etc. This new training manner will substantially decrease resource requirements and improve the computational performance of ablations without compromising accuracy on the full dataset. We formulate this interesting problem as the dataset lottery ticket hypothesis and the target subsets as the winning tickets. In this work, we analyze this problem through finding out partial empirical data on the class dimension that has a consistent {\em Empirical Risk Trend} as the full observed dataset. We also examine multiple solutions, including (i) a uniform selection scheme that has been widely used in literature; (ii) subsets by involving prior knowledge, for instance, using the sorted per-class performance of the strong supervised model to identify the desired subset, WordNet Tree on hierarchical semantic classes, etc., for generating the target winning tickets. We verify this hypothesis on the self-supervised learning task across a variety of recent mainstream methods, such as MAE, DINO, MoCo-V1/V2, etc., with different backbones like ResNet and Vision Transformers. The supervised classification task is also examined as an extension. We conduct extensive experiments for training more than 2K self-supervised models on the large-scale ImageNet-1K and its subsets by 1.5M GPU hours, to scrupulously deliver our discoveries and demonstrate our conclusions. According to our experimental results, the winning tickets (subsets) that we find behave consistently to the original dataset, which generally can benefit many experimental studies and ablations, saving 10x of training time and resources for the hyperparameter tuning and other ablation studies.
更多
查看译文
关键词
Dataset Lottery Ticket Hypothesis,Self-supervised Learning
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要