SEGIC: Unleashing the Emergent Correspondence for In-Context Segmentation
arXiv (2023)
Abstract
In-context segmentation aims to segment novel images using a few labeled
example images, termed "in-context examples", by exploiting content similarities
between the examples and the target. The resulting models generalize
seamlessly to novel segmentation tasks, significantly reducing labeling and
training costs compared with conventional pipelines. However, in-context
segmentation is more challenging than classic segmentation because it requires
the model to learn segmentation rules conditioned on only a few samples. Unlike previous work with
ad-hoc or non-end-to-end designs, we propose SEGIC, an end-to-end
segment-in-context framework built upon a single vision foundation model (VFM).
In particular, SEGIC leverages the emergent correspondence within the VFM to
capture dense relationships between target images and in-context samples.
Information from the in-context samples is then extracted into three types of
instructions, i.e., geometric, visual, and meta instructions, which serve as
explicit conditions for the final mask prediction. SEGIC is a straightforward
yet effective approach that yields state-of-the-art performance on one-shot
segmentation benchmarks. Notably, SEGIC can be easily generalized to diverse
tasks, including video object segmentation and open-vocabulary segmentation.
Code will be available at https://github.com/MengLcool/SEGIC.
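
The idea of using dense feature correspondence to carry a label from an in-context example to a target can be illustrated with a toy sketch. This is not the SEGIC implementation: the function names (`propagate_mask`, `masked_average`), the `temperature` parameter, and the use of flat lists of feature vectors are all assumptions made for illustration. The first function is loosely analogous to what the abstract calls a geometric instruction (a mask prior transferred through correspondence), and the second to a visual instruction (a feature summary of the labeled region); the paper's actual formulation may differ.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def propagate_mask(target_feats, example_feats, example_mask, temperature=0.1):
    """Transfer an example mask to target locations via feature similarity.

    For each target feature, compute cosine similarity to every example
    feature, turn the similarities into softmax weights, and return the
    weighted average of the example mask labels (a soft mask in [0, 1]).
    """
    example_unit = [l2_normalize(f) for f in example_feats]
    soft_mask = []
    for feat in target_feats:
        query = l2_normalize(feat)
        logits = [
            sum(a * b for a, b in zip(query, e)) / temperature
            for e in example_unit
        ]
        peak = max(logits)  # subtract the max for numerical stability
        weights = [math.exp(s - peak) for s in logits]
        total = sum(weights)
        soft_mask.append(
            sum(w * m for w, m in zip(weights, example_mask)) / total
        )
    return soft_mask


def masked_average(example_feats, example_mask):
    """Average the example features over the labeled (mask = 1) region."""
    total = sum(example_mask)
    if total == 0:
        raise ValueError("example mask is empty")
    dim = len(example_feats[0])
    return [
        sum(f[d] * m for f, m in zip(example_feats, example_mask)) / total
        for d in range(dim)
    ]


# Hypothetical 2-D features for two example locations and two target locations.
example_feats = [[1.0, 0.0], [0.0, 1.0]]
example_mask = [1, 0]  # only the first example location belongs to the object
target_feats = [[0.9, 0.1], [0.1, 0.9]]

soft_mask = propagate_mask(target_feats, example_feats, example_mask)
prototype = masked_average(example_feats, example_mask)
```

In this toy setup, the target location whose feature resembles the labeled example location receives a soft mask value close to 1, and the other receives a value close to 0. In a real pipeline the features would come from a frozen vision backbone, and the resulting cues would condition a mask decoder rather than serve as the final prediction.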