A multi-label image classification method combining multi-stage image semantic information and label relevance

International Journal of Machine Learning and Cybernetics (2024)

Abstract
Multi-label image classification (MLIC) is a fundamental and highly challenging task in computer vision. Most existing methods focus only on inter-label association or on how image semantics are extracted, ignoring the relevance of labels at multiple semantic levels. To this end, we propose a new approach for multi-label image classification. Our method consists of a class activation mapping (CAM) module for multi-level semantic extraction from images and a graph convolutional network (GCN) module for label relevance construction. The CAM module follows the order in which human vision perceives objects and segments the global image into multiple local images containing target objects. The segmented images are then fused into the global stream to obtain global-to-local semantic information. The GCN module combines the label word embedding matrix with the label co-occurrence matrix and maps them to a classifier matrix that encodes label dependencies. Finally, the image features and the classifier are combined to obtain the final classification result. Extensive experiments on two benchmark datasets, VOC2007 and MS-COCO, show that our approach achieves better results than state-of-the-art methods on several common evaluation metrics.
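The GCN module and the final combination step can be illustrated with a minimal NumPy sketch. The abstract does not give the normalization scheme, the number of GCN layers, or the activation function, so everything below is an assumption: the co-occurrence matrix is taken as conditional label probabilities with self-loops, the graph is symmetrically normalized, two GCN layers with ReLU are used, and all names (`cooccurrence_matrix`, `gcn_classifiers`, `predict_scores`) and the toy dimensions are hypothetical. The weights are random rather than trained, so the scores only demonstrate the data flow.

```python
import numpy as np

def cooccurrence_matrix(Y):
    """Estimate P[i, j] = P(label j | label i) from a binary label matrix Y
    of shape (num_images, num_labels). Self-loops are set to 1 (assumption)."""
    Y = np.asarray(Y, dtype=float)
    counts = Y.T @ Y                         # counts[i, j]: images with both i and j
    occurrences = np.diag(counts).copy()     # occurrences[i]: images with label i
    P = counts / np.maximum(occurrences[:, None], 1.0)
    np.fill_diagonal(P, 1.0)
    return P

def normalize_adjacency(A):
    """Symmetric normalization D^{-1/2} A D^{-1/2} (a common GCN choice,
    assumed here rather than taken from the paper)."""
    deg = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    return d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]

def gcn_classifiers(E, A_hat, W1, W2):
    """Two GCN layers that map label word embeddings E (num_labels, embed_dim)
    to one classifier vector per label (num_labels, feature_dim)."""
    H = np.maximum(A_hat @ E @ W1, 0.0)      # ReLU activation (assumption)
    return A_hat @ H @ W2

def predict_scores(image_feature, classifiers):
    """Combine an image feature vector with the label classifiers:
    one score per label via a dot product."""
    return classifiers @ image_feature

# Toy data: 4 images, 3 labels.
Y = np.array([[1, 1, 0],
              [1, 0, 0],
              [0, 1, 1],
              [1, 1, 0]])
P = cooccurrence_matrix(Y)
A_hat = normalize_adjacency(P)

rng = np.random.default_rng(0)
E = rng.normal(size=(3, 5))        # stand-in for label word embeddings
W1 = rng.normal(size=(5, 4))       # untrained layer weights
W2 = rng.normal(size=(4, 6))
classifiers = gcn_classifiers(E, A_hat, W1, W2)

image_feature = rng.normal(size=6) # stand-in for the fused global-to-local feature
scores = predict_scores(image_feature, classifiers)
```

In this sketch `image_feature` stands in for the output of the CAM-based global-to-local stream, which is not modeled here; in a trained system the scores would be passed through a sigmoid and optimized with a multi-label loss.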
Keywords
Multi-label classification, Label relevance, Graph convolutional network, Class activation mapping