More Context, Less Distraction: Zero-shot Visual Classification by Inferring and Conditioning on Contextual Attributes
ICLR 2024
Keywords
vision-language model, CLIP, zero-shot classification, human perception, contextual attributes, spurious feature