Towards Foundation Models for Scientific Machine Learning: Characterizing Scaling and Transfer Behavior
NeurIPS (2023)
Abstract
Pre-trained machine learning (ML) models have shown great performance for a
wide range of applications, in particular in natural language processing (NLP)
and computer vision (CV). Here, we study how pre-training could be used for
scientific machine learning (SciML) applications, specifically in the context
of transfer learning. We study the transfer behavior of these models as (i) the
pre-trained model size is scaled, (ii) the downstream training dataset size is
scaled, (iii) the physics parameters are systematically pushed out of
distribution, and (iv) how a single model pre-trained on a mixture of different
physics problems can be adapted to various downstream applications. We find
that, when fine-tuned appropriately, transfer learning can help reach desired
accuracy levels with orders of magnitude fewer downstream examples (across
different tasks that can even be out-of-distribution) than training from
scratch, with consistent behavior across a wide range of downstream examples.
We also find that fine-tuning these models yields more performance gains as
model size increases, compared to training from scratch on new downstream
tasks. These results hold for a broad range of PDE learning tasks. All in all,
our results demonstrate the potential of the "pre-train and fine-tune" paradigm
for SciML problems and point to a path towards building SciML foundation
models. We open-source our code for reproducibility.
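To make the paradigm concrete, below is a minimal sketch of the fine-tuning step the abstract describes: a surrogate model is initialized from pre-trained weights and then adapted on a small downstream dataset. The architecture, checkpoint name, and training loop are illustrative PyTorch assumptions, not the paper's actual implementation; the authors' released code should be consulted for that.

import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

class SurrogateModel(nn.Module):
    """Placeholder PDE surrogate mapping an input field to a solution field."""
    def __init__(self, channels: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, channels, 3, padding=1), nn.GELU(),
            nn.Conv2d(channels, channels, 3, padding=1), nn.GELU(),
            nn.Conv2d(channels, 1, 3, padding=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

def fine_tune(model: nn.Module, loader: DataLoader,
              epochs: int = 10, lr: float = 1e-4) -> nn.Module:
    """Fine-tune all weights of a pre-trained surrogate on a small
    downstream dataset (few examples, possibly out-of-distribution)."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    model.train()
    for _ in range(epochs):
        for inputs, targets in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(inputs), targets)
            loss.backward()
            optimizer.step()
    return model

if __name__ == "__main__":
    model = SurrogateModel()
    # In practice, start from weights pre-trained on a mixture of physics
    # problems rather than a random init, e.g.:
    # model.load_state_dict(torch.load("pretrained_mixture.pt"))  # hypothetical checkpoint
    # Tiny synthetic stand-in for a handful of downstream (input, solution) pairs.
    x = torch.randn(16, 1, 32, 32)
    y = torch.randn(16, 1, 32, 32)
    loader = DataLoader(TensorDataset(x, y), batch_size=4)
    fine_tune(model, loader, epochs=2)

The design choice the abstract emphasizes is the initialization: starting from weights pre-trained on a mixture of physics problems, rather than training from scratch, is what allows reaching a target accuracy with orders of magnitude fewer downstream examples.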
Keywords
scientific machine learning