SkaNet: Split Kernel Attention Network

ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2023, PT V(2023)

Abstract
Recently, convolutional neural networks (CNNs) and vision transformers (ViTs) have shown impressive results in light-weight models for edge devices. However, the dominant CNN and ViT architectures rely heavily on a structured grid or sequence representation of images, which can make them inflexible when handling complex or irregular objects. In this paper, we propose SkaNet, a high-performance hybrid architecture that synergistically integrates the benefits of CNNs and ViTs and further enhances them with graph representation learning. Specifically, SkaNet introduces a novel linear attention, split kernel attention (SKA), which exploits graph convolution to capture global semantic information and flexibly recognize irregular objects, adaptively splits input tensors into multiple channel groups, and fuses these modules into linear attention to aggregate contextual information efficiently. Extensive experiments demonstrate that SkaNet outperforms popular light-weight CNN- and ViT-based models on common vision tasks and datasets. For classification on ImageNet-1k, SkaNet-S, with 5.5M parameters, achieves a top-1 accuracy of 79.5%, surpassing MobileViT-S by an absolute gain of 1.1%. SkaNet-S also exhibits superior performance in semantic segmentation on PASCAL VOC 2012 and object detection on COCO 2017. Our source code is available on GitHub at: https://github.com/charryglomc/skanet.
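To illustrate two of the ingredients the abstract names — splitting input channels into groups and applying a linear (kernelized) attention per group — here is a minimal NumPy sketch. Note this is an assumption-laden toy, not the authors' implementation: the function names (`linear_attention`, `split_kernel_attention`), the ReLU-based feature map, the random projection weights, and the even channel split are all hypothetical, and the graph-convolution component of SKA described in the abstract is omitted entirely.

```python
import numpy as np

def linear_attention(x, wq, wk, wv, eps=1e-6):
    """Kernelized linear attention: phi(Q) @ (phi(K)^T V), O(N) in token count.
    A simple non-negative ReLU feature map is assumed here (hypothetical choice)."""
    q = np.maximum(x @ wq, 0.0) + eps   # (n, d) non-negative features
    k = np.maximum(x @ wk, 0.0) + eps   # (n, d)
    v = x @ wv                          # (n, d)
    kv = k.T @ v                        # (d, d): aggregate context once, not per query
    z = q @ k.sum(axis=0)               # (n,): per-token normalizer
    return (q @ kv) / z[:, None]

def split_kernel_attention(x, groups=2, seed=0):
    """Split the channel dimension into groups and run linear attention per group,
    then concatenate — a sketch of the channel-group idea, with random weights."""
    rng = np.random.default_rng(seed)
    outs = []
    for g in np.split(x, groups, axis=1):   # even split; real SKA is adaptive
        d = g.shape[1]
        wq, wk, wv = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))
        outs.append(linear_attention(g, wq, wk, wv))
    return np.concatenate(outs, axis=1)     # (n, c): same shape as the input
```

Because the `(d, d)` key-value summary `kv` is computed once and reused for every query, the cost grows linearly with the number of tokens, which is the usual motivation for linear attention in light-weight models.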
Keywords
Convolutional Neural Network, Attention, Graph Representation Learning, Light-weight