A multi-label image classification method combining multi-stage image semantic information and label relevance

International Journal of Machine Learning and Cybernetics (2024)

Multi-label image classification (MLIC) is a fundamental and highly challenging task in computer vision. Most existing methods focus only on inter-label associations or on how image semantics are extracted, and ignore the relevance of labels at multiple semantic levels. To address this, we propose a new approach to multi-label image classification. Our method consists of a class activation mapping (CAM) module for multi-level semantic extraction from images and a graph convolutional network (GCN) module for constructing label relevance. The CAM module follows the order in which humans visually perceive objects: it segments the global image into multiple local images that contain the target objects. The segmented images are then fused into the global stream to obtain global-to-local semantic information. The GCN module combines the label word embedding matrix with the label co-occurrence matrix and maps them to a matrix that encodes label dependencies. Finally, the image features and the classifier are combined to obtain the final classification result. Extensive experiments on two benchmark datasets, VOC2007 and MS-COCO, show that our approach achieves better results than state-of-the-art methods on several commonly used evaluation metrics.
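The GCN step described above can be illustrated with a minimal NumPy sketch. The abstract does not give the exact formulation, so this follows a common pattern for GCN-based label classifiers and should not be read as the authors' implementation. The binarization threshold, the two-layer depth, the LeakyReLU slope, all dimensions, the toy co-occurrence counts, and the random weights are assumptions made for illustration; `image_feature` is a hypothetical stand-in for the fused global-to-local feature produced by the CAM module.

```python
import numpy as np


def normalize_adjacency(cooccurrence, label_counts, threshold=0.4):
    """Build a degree-normalized label graph from raw co-occurrence counts."""
    # Estimate P(label j present | label i present) from the counts.
    cond_prob = cooccurrence / label_counts[:, None]
    # Binarize to suppress rare co-occurrences (threshold is an assumption).
    adj = (cond_prob >= threshold).astype(np.float64)
    np.fill_diagonal(adj, 1.0)  # self-loops keep each label's own embedding
    degree = adj.sum(axis=1)    # always >= 1 because of the self-loops
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    return adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(adj_norm, features, weight, activate=True):
    """One graph convolution: propagate over the label graph, then project."""
    out = adj_norm @ features @ weight
    # LeakyReLU with slope 0.2 (an assumed choice of nonlinearity).
    return np.maximum(out, 0.2 * out) if activate else out


def label_scores(image_feature, word_embeddings, adj_norm, w1, w2):
    """Map label embeddings to per-label classifiers and score one image."""
    hidden = gcn_layer(adj_norm, word_embeddings, w1)
    classifiers = gcn_layer(adj_norm, hidden, w2, activate=False)  # (C, D)
    logits = classifiers @ image_feature                           # (C,)
    return 1.0 / (1.0 + np.exp(-logits))  # independent sigmoid per label


rng = np.random.default_rng(0)
num_labels, embed_dim, hidden_dim, feat_dim = 4, 8, 16, 32

# Toy statistics: labels 0/1 and labels 2/3 tend to appear together.
cooccurrence = np.array([[0, 30, 2, 1],
                         [30, 0, 3, 0],
                         [2, 3, 0, 25],
                         [1, 0, 25, 0]], dtype=np.float64)
label_counts = np.array([50, 40, 45, 30], dtype=np.float64)

adj_norm = normalize_adjacency(cooccurrence, label_counts)
word_embeddings = rng.normal(size=(num_labels, embed_dim))
w1 = rng.normal(scale=0.1, size=(embed_dim, hidden_dim))
w2 = rng.normal(scale=0.1, size=(hidden_dim, feat_dim))
image_feature = rng.normal(size=feat_dim)  # placeholder for the CNN feature

probs = label_scores(image_feature, word_embeddings, adj_norm, w1, w2)
print(probs.shape)
```

Because the classifiers are generated from the label graph rather than learned independently, labels that frequently co-occur share information through the adjacency matrix before they are applied to the image feature.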
Keywords: Multi-label classification, Label relevance, Graph convolutional network, Class activation mapping