GhostFormer: Efficiently amalgamated CNN-transformer architecture for object detection

Pattern Recognition (2024)

Abstract
Lightweight network models have gradually become an important research direction in object detection. Network lightweighting can be pursued through a variety of methods, such as quantization, knowledge distillation, and neural architecture search. However, these methods either fail to break through the performance bottleneck of the model itself or require massive training costs. To address these problems, a new object detection model called GhostFormer, built on a hybrid CNN-Transformer feature extraction network, is proposed from the perspective of lightweight network structure design. GhostFormer exploits the complementary strengths of the CNN's local modeling and the Transformer's global modeling, effectively reducing the complexity of the convolutional model while overcoming the Transformer's lack of inductive bias; as a result, it transfers better to downstream tasks. Experiments show that the model costs less than half the computation of YOLOv7 on the Pascal VOC dataset with only about a 3% loss in mAP@0.5, and achieves a 9.7% mAP@0.5:0.95 improvement on the MS COCO dataset compared with GhostNet.
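The name GhostFormer points to the Ghost-module idea from GhostNet that the abstract builds on: compute only a few "intrinsic" feature maps with an expensive transform, then derive additional "ghost" maps from them with a very cheap operation. The sketch below illustrates that cost-saving pattern in NumPy under simplifying assumptions — the function name, the flat `(channels, n)` feature layout, and the per-channel scaling standing in for the depthwise convolution are all hypothetical, not the paper's actual implementation.

```python
import numpy as np

def ghost_features(x, primary_weight, cheap_weight):
    """Simplified Ghost-module pattern (hypothetical sketch):
    a few intrinsic maps from an expensive transform, plus cheap
    'ghost' maps derived from them, concatenated together.

    x:              (channels_in, n) input features
    primary_weight: (channels_primary, channels_in) expensive projection
    cheap_weight:   (channels_primary,) cheap per-channel scaling,
                    standing in for GhostNet's depthwise convolution
    """
    primary = primary_weight @ x                     # expensive transform
    ghost = cheap_weight[:, None] * primary          # cheap transform
    return np.concatenate([primary, ghost], axis=0)  # intrinsic + ghost maps

# Toy usage: 4 input channels, 3 intrinsic maps -> 6 output maps,
# at roughly half the cost of computing all 6 maps the expensive way.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 10))
out = ghost_features(x, rng.standard_normal((3, 4)), rng.standard_normal(3))
print(out.shape)  # (6, 10)
```

The design choice being illustrated is the one the abstract claims: output width is kept while the expensive computation is applied to only a fraction of the channels, which is where the FLOP reduction relative to a plain convolutional backbone comes from.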
Keywords
Object detection, Lightweight network design, Feature extraction, CNN-Transformer