RESMatch: Referring Expression Segmentation in a Semi-Supervised Manner
CoRR (2024)
Abstract
Referring expression segmentation (RES), a task that involves localizing
specific instance-level objects based on free-form linguistic descriptions, has
emerged as a crucial frontier in human-AI interaction. It demands an intricate
understanding of both visual and textual contexts and often requires extensive
training data. This paper introduces RESMatch, the first semi-supervised
learning (SSL) approach for RES, aimed at reducing reliance on exhaustive data
annotation. Extensive validation on multiple RES datasets demonstrates that
RESMatch significantly outperforms baseline approaches, establishing a new
state-of-the-art. Although existing SSL techniques are effective in image
segmentation, we find that they fall short in RES. To address challenges such as
comprehending free-form linguistic descriptions and the variability in object
attributes, RESMatch introduces three adaptations: revised strong perturbation,
text augmentation, and adjustments for pseudo-label quality and strong-weak
supervision. This pioneering work lays
the groundwork for future research in semi-supervised learning for referring
expression segmentation.