Inferring Latent Class Statistics from Text for Robust Visual Few-Shot Learning.
CoRR (2023)
Abstract
Foundation models such as CLIP have proven effective for few-shot learning,
but their cross-domain robustness remains limited, especially when only a few
labeled examples are available. Recent works add text as an extra modality to enhance the
performance of these models. Most of these approaches treat text as an
auxiliary modality without fully exploring its potential to elucidate the
underlying distribution of each class's visual features. In this paper, we present a
novel approach that leverages text-derived statistics to predict the mean and
covariance of the visual feature distribution for each class. This predictive
framework enriches the latent space, yielding more robust and generalizable
few-shot learning models. We demonstrate the efficacy of incorporating both
mean and covariance statistics in improving few-shot classification performance
across various datasets. Our results show that text can be used to predict the
mean and covariance of each class's visual feature distribution, offering
promising improvements in few-shot learning scenarios.
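
The abstract does not specify how the class statistics are predicted or used, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a linear (ridge-regression) map from text embeddings to a class mean and a diagonal log-variance, fitted on base classes, and a Gaussian log-likelihood classifier for novel classes. All names (`fit_ridge`, `predict_class_stats`, `gaussian_log_likelihood`) and the synthetic data are hypothetical; in practice the embeddings would come from a model such as CLIP.

```python
import numpy as np

def fit_ridge(X, Y, lam=1e-2):
    """Closed-form ridge regression: W minimizing ||XW - Y||^2 + lam * ||W||^2."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

def predict_class_stats(text_emb, W_mean, W_logvar):
    """Map text embeddings (c, d_text) to a predicted mean and diagonal variance (c, d_vis)."""
    mean = text_emb @ W_mean
    var = np.exp(text_emb @ W_logvar)  # exponentiate so the variance is always positive
    return mean, var

def gaussian_log_likelihood(x, mean, var):
    """Log-density of each query x (n, d) under each class Gaussian (c, d); returns (n, c)."""
    d = x.shape[1]
    diff = x[:, None, :] - mean[None, :, :]
    mahalanobis = np.sum(diff ** 2 / var[None, :, :], axis=-1)
    log_det = np.sum(np.log(var), axis=-1)[None, :]
    return -0.5 * (mahalanobis + log_det + d * np.log(2 * np.pi))

def demo():
    # Synthetic stand-ins for text and visual embeddings (assumption, not real CLIP features).
    rng = np.random.default_rng(0)
    d_text, d_vis, n_base, n_novel = 8, 16, 50, 5
    A = rng.normal(size=(d_text, d_vis))        # hidden text-to-mean relation
    B = 0.1 * rng.normal(size=(d_text, d_vis))  # hidden text-to-log-variance relation
    T_base = rng.normal(size=(n_base, d_text))
    T_novel = rng.normal(size=(n_novel, d_text))

    # Base-class statistics; in practice these would be estimated from base-class images.
    W_mean = fit_ridge(T_base, T_base @ A)
    W_logvar = fit_ridge(T_base, T_base @ B)

    # Predict novel-class statistics from text alone, then classify sampled queries.
    mu_hat, var_hat = predict_class_stats(T_novel, W_mean, W_logvar)
    labels = rng.integers(0, n_novel, size=200)
    true_mu, true_var = T_novel @ A, np.exp(T_novel @ B)
    x = true_mu[labels] + np.sqrt(true_var[labels]) * rng.normal(size=(200, d_vis))
    pred = gaussian_log_likelihood(x, mu_hat, var_hat).argmax(axis=1)
    return float((pred == labels).mean())

if __name__ == "__main__":
    print("accuracy on synthetic novel classes:", demo())
```

A diagonal covariance is used here only to keep the sketch short; a full covariance would replace the elementwise division with a Mahalanobis term using the inverse covariance matrix.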