Bottleneck Transformers for Visual Recognition
CVPR, pp. 16519-16529, 2021.
We present BoTNet, a conceptually simple yet powerful backbone architecture that incorporates self-attention for multiple computer vision tasks including image classification, object detection and instance segmentation. By just replacing the spatial convolutions with global self-attention in the final three bottleneck blocks of a ResNet...
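
The central idea in the abstract, swapping the spatial convolution of a ResNet bottleneck block for global self-attention, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes a single attention head, leaves out normalization layers and any other details the truncated abstract does not describe, and treats each 1x1 convolution as a per-pixel matrix multiply. The function names, weight names, and channel widths are hypothetical, chosen only for illustration.

```python
import numpy as np


def global_self_attention(x, wq, wk, wv):
    """Single-head self-attention over all positions of an (H, W, C) feature map."""
    h, w, c = x.shape
    tokens = x.reshape(h * w, c)  # every spatial position becomes one token
    q, k, v = tokens @ wq, tokens @ wk, tokens @ wv
    logits = q @ k.T / np.sqrt(q.shape[-1])  # (H*W, H*W) all-pairs similarity
    logits -= logits.max(axis=-1, keepdims=True)  # subtract row max for stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over all positions
    return (weights @ v).reshape(h, w, -1)


def attention_bottleneck(x, w_reduce, wq, wk, wv, w_expand):
    """Bottleneck block with the spatial convolution swapped for self-attention.

    Assumption of this sketch: normalization layers are omitted for brevity.
    """
    hidden = np.maximum(x @ w_reduce, 0.0)  # 1x1 conv (per-pixel matmul) + ReLU
    hidden = np.maximum(global_self_attention(hidden, wq, wk, wv), 0.0)
    return np.maximum(x + hidden @ w_expand, 0.0)  # 1x1 conv, residual add, ReLU


rng = np.random.default_rng(0)
C, D = 16, 4  # hypothetical outer and bottleneck channel widths
x = rng.standard_normal((8, 8, C))
shapes = [(C, D), (D, D), (D, D), (D, D), (D, C)]
params = [rng.standard_normal(s) * 0.1 for s in shapes]
y = attention_bottleneck(x, *params)
print(y.shape)  # (8, 8, 16)
```

Because the attention matrix has one row and one column per spatial position, its size grows quadratically with the feature-map area, so in a sketch like this the operation is cheapest on small, low-resolution feature maps.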