Emerging Property of Masked Token for Effective Pre-training
arXiv (2024)
Abstract
Driven by the success of Masked Language Modeling (MLM), the realm of
self-supervised learning for computer vision has been invigorated by the
central role of Masked Image Modeling (MIM) in driving recent breakthroughs.
Notwithstanding the achievements of MIM across various downstream tasks, its
overall efficiency is occasionally hampered by the lengthy duration of the
pre-training phase. This paper presents the optimization of masked tokens as a
means of addressing this issue. We first delve into the inherent properties
that a masked token ought to possess. Among these properties, we principally
dedicate ourselves to articulating and emphasizing the `data singularity'
attribute inherent in masked tokens. Through
a comprehensive analysis of the heterogeneity between masked tokens and visible
tokens within pre-trained models, we propose a novel approach termed masked
token optimization (MTO), specifically designed to improve model efficiency
through weight recalibration and the enhancement of the key property of masked
tokens. The proposed method serves as an adaptable solution that seamlessly
integrates into any MIM approach that leverages masked tokens. As a result, MTO
achieves a considerable improvement in pre-training efficiency, resulting in an
approximately 50% reduction in the pre-training epochs required to attain the
converged performance of the recent approaches.
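The abstract assumes familiarity with how MIM methods employ masked tokens: visible patch embeddings are kept, while masked positions are replaced by a shared learnable [MASK] embedding before encoding. The NumPy sketch below illustrates that generic substitution step only; it is not the paper's MTO method, and all names, shapes, and the mask ratio are illustrative assumptions.

```python
import numpy as np

# Generic illustration of masked-token substitution in MIM pre-training
# (not the paper's MTO implementation; shapes and names are hypothetical).

rng = np.random.default_rng(0)

num_patches, dim = 16, 8
patch_embeddings = rng.standard_normal((num_patches, dim))  # tokenized image patches
mask_token = rng.standard_normal(dim)  # shared learnable [MASK] embedding (frozen here)

mask_ratio = 0.75  # typical MIM methods mask a large fraction of patches
num_masked = int(num_patches * mask_ratio)
masked_idx = rng.choice(num_patches, size=num_masked, replace=False)

# Replace masked positions with the shared mask token before the encoder;
# visible positions keep their original patch embeddings.
tokens = patch_embeddings.copy()
tokens[masked_idx] = mask_token

print(tokens.shape)  # (16, 8)
print(num_masked)    # 12
```

In a real training loop the mask token would be a trainable parameter updated by backpropagation; MTO's contribution, per the abstract, is recalibrating weights and enhancing the properties of this token rather than changing the substitution itself.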