AttrSeg: Open-Vocabulary Semantic Segmentation via Attribute Decomposition-Aggregation
arXiv (Cornell University) (2023)
Abstract
Open-vocabulary semantic segmentation is a challenging task that requires
segmenting novel object categories at inference time. Recent studies have
explored vision-language pre-training to handle this task, but they rely on
assumptions that rarely hold in practice and degrade when the textual category
names are of low quality. For example, this paradigm assumes that new textual
categories will be provided accurately and completely, and that they appear in
the lexicon used during pre-training. These assumptions break down when names
are brief or incomplete and therefore ambiguous, when new words are absent from
the pre-training lexicon, and when categories are difficult for users to
describe. To address these
issues, this work proposes a novel attribute decomposition-aggregation
framework, AttrSeg, inspired by human cognition in understanding new concepts.
Specifically, in the decomposition stage, we decouple class names into diverse
attribute descriptions to complement semantic contexts from multiple
perspectives. Two attribute construction strategies are designed: using large
language models for common categories, and manual labeling for
human-invented categories. In the aggregation stage, we group diverse
attributes into an integrated global description, to form a discriminative
classifier that distinguishes the target object from others. A hierarchical
aggregation architecture is further proposed to achieve multi-level
aggregation, leveraging a carefully designed clustering module. The final
results are obtained by computing the similarity between the aggregated
attributes and the image embeddings. To evaluate the effectiveness of the
framework, we annotate three types of datasets with attribute descriptions and
conduct extensive experiments and ablation studies. The results show the
superior performance of attribute decomposition-aggregation.
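
The aggregation and similarity steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`aggregate_attributes`, `segment`) are hypothetical, the embeddings are assumed to come from some pre-trained vision-language encoder that is not shown here, and plain mean pooling stands in for the learned hierarchical aggregation with its clustering module.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-8):
    """Scale vectors to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def aggregate_attributes(attr_emb):
    """Collapse (num_attributes, dim) attribute embeddings into one (dim,) vector.

    ASSUMPTION: mean pooling is a simple stand-in for the paper's learned
    hierarchical aggregation; it is used here only for illustration.
    """
    return l2_normalize(attr_emb.mean(axis=0))

def segment(pixel_emb, class_attr_embs):
    """Assign each pixel to the class whose aggregated attributes it matches best.

    pixel_emb:       (H, W, dim) per-pixel image embeddings (hypothetical input)
    class_attr_embs: list of (n_i, dim) arrays, one per candidate class
    returns:         (H, W) array of class indices
    """
    # One aggregated description (classifier vector) per class: (C, dim)
    classifiers = np.stack([aggregate_attributes(a) for a in class_attr_embs])
    pixels = l2_normalize(pixel_emb)
    # Cosine similarity between every pixel and every class: (H, W, C)
    similarity = pixels @ classifiers.T
    return similarity.argmax(axis=-1)
```

Calling `segment(pixel_emb, [attrs_class_a, attrs_class_b])` returns a label map with the same spatial size as `pixel_emb`. Replacing `aggregate_attributes` with a learned module is where a hierarchical design such as the one proposed in the paper would plug in.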
Keywords
segmentation, attribute, open-vocabulary, decomposition-aggregation