Hazards from Increasingly Accessible Fine-Tuning of Downloadable Foundation Models
CoRR(2023)
摘要
Public release of the weights of pretrained foundation models, otherwise
known as downloadable access \citep{solaiman_gradient_2023}, enables
fine-tuning without the prohibitive expense of pretraining. Our work argues
that increasingly accessible fine-tuning of downloadable models may increase
hazards. First, we highlight research to improve the accessibility of
fine-tuning. We split our discussion into research that A) reduces the
computational cost of fine-tuning and B) improves the ability to share that
cost across more actors. Second, we argue that increasingly accessible
fine-tuning methods may increase hazard through facilitating malicious use and
making oversight of models with potentially dangerous capabilities more
difficult. Third, we discuss potential mitigatory measures, as well as benefits
of more accessible fine-tuning. Given substantial remaining uncertainty about
hazards, we conclude by emphasizing the urgent need for the development of
mitigations.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要