Transformer-based Convolution-free Visual Place Recognition

Anna Urban, Bogdan Kwolek

2022 17th International Conference on Control, Automation, Robotics and Vision (ICARCV) (2022)

Abstract
Over the last decade, convolutional neural networks (CNNs) have been a core element of the remarkable advances in machine learning, computer vision, and robotics. Vision transformers have recently demonstrated great success in various computer vision tasks, motivating a sharply increased interest in deploying them in real-world vision applications. However, the number of successful applications of transformers in robotics has so far been limited. This work presents an approach to visual place recognition using a vision transformer (ViT). A ViT has been trained from scratch, and two pretrained ViTs, in base and large versions, have been finetuned on a target dataset. The features extracted by the transformers have then been used for place recognition with a k-NN classifier. Finally, contrastive learning has been performed to embed the features and improve recognition performance. The algorithm has been evaluated on a dataset for indoor place recognition comprising images with 6-DOF viewpoint variations. Experimental results demonstrate that finetuned transformers achieve a considerable gain in recognition accuracy compared with CNNs.
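The k-NN recognition step described in the abstract can be illustrated with a minimal sketch. It assumes the features have already been extracted and are available as plain vectors. The cosine similarity measure, the majority vote, the function names, the toy 2-D vectors, and the place labels are all assumptions made for illustration; the abstract does not specify the distance metric or the value of k.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def knn_predict(query, gallery_features, gallery_labels, k=3):
    """Return the majority place label among the k most similar gallery vectors."""
    scored = sorted(
        zip(gallery_features, gallery_labels),
        key=lambda pair: cosine_similarity(query, pair[0]),
        reverse=True,  # most similar first
    )
    top_labels = [label for _, label in scored[:k]]
    return Counter(top_labels).most_common(1)[0][0]


# Hypothetical stand-ins for transformer features (real ones would be
# high-dimensional embeddings); labels are made-up place names.
gallery_features = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
gallery_labels = ["corridor", "corridor", "kitchen", "kitchen"]

predicted = knn_predict([0.8, 0.3], gallery_features, gallery_labels, k=3)
print(predicted)
```

In this sketch the query is assigned the label that dominates its nearest neighbours in feature space, so recognition quality depends on how well the embedding separates places. That is presumably why the abstract follows the k-NN step with contrastive learning on the features.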
Keywords
place recognition, transformer-based, convolution-free