Unknown Prompt, the only Lacuna: Unveiling CLIP's Potential for Open Domain Generalization
arXiv (2024)
Abstract
We study Open Domain Generalization (ODG), which is marked by domain and category
shifts between the labeled source domains used for training and the unlabeled
target domains seen at test time. Existing solutions to ODG are limited by the
constrained generalization of traditional CNN backbones and by errors in detecting
target open samples in the absence of prior knowledge. Addressing these pitfalls, we
introduce ODG-CLIP, harnessing the semantic prowess of the vision-language
model, CLIP. Our framework brings forth three primary innovations: Firstly,
distinct from prevailing paradigms, we conceptualize ODG as a multi-class
classification challenge encompassing both known and novel categories. Central
to our approach is a dedicated prompt for detecting unknown-class samples; to
train it, we use a readily available Stable Diffusion model to generate proxy
images for the open class.
Secondly, aiming for domain-tailored classification (prompt) weights while
ensuring a balance of precision and simplicity, we devise a novel visual
style-centric prompt learning mechanism. Finally, we infuse images with
class-discriminative knowledge derived from the prompt space to augment the
fidelity of CLIP's visual embeddings. We introduce a novel objective to
safeguard the continuity of this infused semantic information across domains,
especially for the shared classes. Through rigorous testing on diverse
datasets, covering closed and open-set DG contexts, ODG-CLIP demonstrates clear
supremacy, consistently outpacing peers with performance boosts between 8
Code will be available at https://github.com/mainaksingha01/ODG-CLIP.
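
The first idea above, treating ODG as multi-class classification over the known classes plus one extra unknown class, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy 3-dimensional embeddings, and the temperature value are all assumptions made for illustration, and in practice the embeddings would come from CLIP's image and text encoders with a learned unknown-class prompt.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def classify_open_set(image_emb, prompt_embs, temperature=0.01):
    """Softmax over K known-class prompts plus one unknown-class prompt.

    By convention in this sketch (an assumption), the LAST entry of
    prompt_embs is the unknown-class prompt embedding.
    Returns (predicted index, list of probabilities).
    """
    logits = [cosine(image_emb, p) / temperature for p in prompt_embs]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(logit - peak) for logit in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    pred = max(range(len(probs)), key=probs.__getitem__)
    return pred, probs


# Hypothetical toy embeddings: two known classes, then the unknown-class prompt.
PROMPTS = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
UNKNOWN = len(PROMPTS) - 1

if __name__ == "__main__":
    pred, probs = classify_open_set([0.1, 0.2, 0.9], PROMPTS)
    print("unknown" if pred == UNKNOWN else f"known class {pred}")
```

Under this formulation, open-sample detection needs no separate confidence threshold: a sample is flagged as novel whenever the unknown-class slot wins the argmax.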