Accurate Fine-Grained Object Recognition with Structure-Driven Relation Graph Networks

International Journal of Computer Vision (2024)

Abstract
Fine-grained object recognition (FGOR) aims to learn discriminative features that can identify the subtle distinctions between visually similar objects. However, less effort has been devoted to overcoming the impact of objects' personalized differences, e.g., varying posture or perspective. We argue that these personalized differences can weaken the network's perception of discriminative features, causing it to discard some discriminative clues and degrading FGOR performance accordingly. This motivates us to explore intrinsic structure knowledge, i.e., the fixed spatial correlation between object parts, and to apply this knowledge to associate diverse semantic parts and recover the discriminative details lost to personalized differences. In this paper, we propose an end-to-end Structure-driven Relation Graph Network (SRGN) for fine-grained object recognition, which aims to explore and exploit object structure information, without any additional annotations, to associate diverse semantic parts, making the network sensitive to discriminative details affected by personalized differences. Specifically, the core of SRGN is a Structure-aware Axial Graph (SAG) module, which first infers the structure embedding by establishing the correlation between position information and visual features along the axial direction, and then applies this embedding as aggregation weights to emphasize each discriminative representation through a weighted reassembly of all features relevant to it. Additionally, SAG can be readily extended to a multi-graph schema that leverages the complementary advantages of different structure embeddings between the position information and visual content, further improving SAG. In this way, SRGN demonstrates remarkable robustness in scenarios characterized by extreme distribution perturbations, ultimately leading to superior performance.
Extensive experiments and explainable visualizations validate the efficacy of the proposed approach on widely-used fine-grained benchmarks.
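The aggregation idea described for the SAG module (correlate position information with visual features along one axis, then use the result as weights to reassemble features) can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the dot-product correlation, the softmax normalization, the scaling factor, and the residual sum over the two axes are all assumptions chosen only to make the idea concrete.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def axial_aggregate(feat, pos, axis):
    """Reassemble features along one spatial axis (hypothetical helper).

    feat: (H, W, C) visual features.
    pos:  (L, C) position embedding for the chosen axis (L = H or W).
    axis: 0 to aggregate along height, 1 along width.
    """
    # Arrange the data so that every "line" along the chosen axis is (L, C).
    x = feat if axis == 1 else np.swapaxes(feat, 0, 1)  # (other, L, C)
    channels = x.shape[-1]

    # Assumed structure embedding: correlation between the visual feature
    # at position i and the position embedding of position j.
    scores = x @ pos.T / np.sqrt(channels)  # (other, L, L)
    weights = softmax(scores, axis=-1)      # each row sums to 1

    # Weighted reassembly of all features on the same line.
    out = weights @ x                       # (other, L, C)
    return out if axis == 1 else np.swapaxes(out, 0, 1)


def sag_sketch(feat, pos_h, pos_w):
    """Combine both axial passes with a residual sum (an assumption)."""
    return (feat
            + axial_aggregate(feat, pos_h, axis=0)
            + axial_aggregate(feat, pos_w, axis=1))
```

With an all-zero position embedding the weights become uniform, so each output location is simply the mean of the features on its row or column. A learned embedding would instead emphasize the positions that are structurally related to each part.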
Key words
Fine-grained object recognition, Structure-driven relation graph networks, Graph convolution network, Structure knowledge