Vision HGNN: An Image is More than a Graph of Nodes.

ICCV(2023)

引用 2|浏览33
暂无评分
摘要
The realm of graph-based modeling has proven its adaptability across diverse real-world data types. However, its applicability to general computer vision tasks had been limited until the introduction of the Vision Graph Neural Network (ViG). ViG divides input images into patches, conceptualized as nodes, constructing a graph through connections to nearest neighbors. Nonetheless, this method of graph construction confines itself to simple pairwise relationships, leading to surplus edges and unwarranted memory and computation expenses. In this paper, we enhance ViG by transcending conventional "pairwise" linkages and harnessing the power of the hypergraph to encapsulate image information. Our objective is to encompass more intricate inter-patch associations. In both training and inference phases, we adeptly establish and update the hypergraph structure using the Fuzzy C-Means method, ensuring minimal computational burden. This augmentation yields the Vision HyperGraph Neural Network (ViHGNN). The model’s efficacy is empirically substantiated through its state-of-the-art performance on both image classification and object detection tasks, courtesy of the hypergraph structure learning module that uncovers higher-order relationships. Our code is available at: https://github.com/VITA-Group/ViHGNN.
更多
查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要