Can We Scale Transformers to Predict Parameters of Diverse ImageNet Models?

arXiv (2023)

Abstract
Pretraining a neural network on a large dataset is becoming a cornerstone of machine learning, yet it remains within reach of only a few communities with ample resources. We aim at the ambitious goal of democratizing pretraining. Towards that goal, we train and release a single neural network that can predict high-quality ImageNet parameters of other neural networks. By using the predicted parameters as initialization, we are able to boost training of diverse ImageNet models available in PyTorch. When transferred to other datasets, models initialized with predicted parameters also converge faster and reach competitive final performance.
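The workflow described in the abstract is to take a standard PyTorch model, replace its random initialization with parameters predicted by the released network, and then train as usual. The sketch below illustrates that idea under assumptions: `predict_parameters` and its behavior are hypothetical placeholders for the authors' released prediction network (not their actual API); only the PyTorch/torchvision calls are real.

```python
# Minimal sketch: initialize a torchvision model with externally predicted
# parameters, then fine-tune as usual. `predict_parameters` is a hypothetical
# stand-in for the released parameter-prediction network.
import torch
import torchvision


def predict_parameters(model: torch.nn.Module) -> dict:
    """Hypothetical placeholder: the real prediction network would consume the
    model's architecture and return a state_dict of predicted ImageNet weights.
    Here we just return the model's current parameters so the sketch runs."""
    return {k: v.clone() for k, v in model.state_dict().items()}


model = torchvision.models.resnet50(weights=None)  # any of the diverse PyTorch models
predicted = predict_parameters(model)              # predicted ImageNet parameters
model.load_state_dict(predicted)                   # use the prediction as initialization

# Fine-tune from the predicted initialization; the paper reports faster
# convergence compared to training from random initialization.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
criterion = torch.nn.CrossEntropyLoss()
model.train()
x = torch.randn(2, 3, 224, 224)          # dummy batch for illustration
y = torch.randint(0, 1000, (2,))
loss = criterion(model(x), y)
loss.backward()
optimizer.step()
```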
Keywords
diverse ImageNet