Low-Resource Vision Challenges for Foundation Models
CoRR(2024)
摘要
Low-resource settings are well-established in natural language processing,
where many languages lack sufficient data for machine learning at scale.
However, low-resource problems are under-explored in computer vision. In this
paper, we strive to address this gap and explore the challenges of low-resource
image tasks with vision foundation models. Thus, we first collect a benchmark
of genuinely low-resource image data, covering historic maps, circuit diagrams,
and mechanical drawings. These low-resource settings all share the three
challenges of data scarcity, fine-grained differences, and the distribution
shift from natural images to the specialized domain of interest. While existing
foundation models have shown impressive generalizability, we find they cannot
transfer well to our low-resource tasks. To begin to tackle the challenges of
low-resource vision, we introduce one simple baseline per challenge.
Specifically, we propose to i) enlarge the data space by generative models, ii)
adopt the best sub-kernels to encode local regions for fine-grained difference
discovery and iii) learn attention for specialized domains. Experiments on the
three low-resource data sources in our benchmark demonstrate our proposals
already provide a better baseline than common transfer learning, data
augmentation, and fine-grained methods. This highlights the unique
characteristics and challenges of low-resource vision for foundation models
that warrant further investigation. Project website:
https://xiaobai1217.github.io/Low-Resource-Vision/.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要