Fast Image Matching Based on Channel Attention and Feature Slicing

Gai Shaoyan, Huang Yanyan, Da Feipeng

Acta Optica Sinica (2023)

Abstract
Objective: Image matching is the process of finding the spatial alignment between identical or similar target objects in multiple images. It is a research hotspot in optical measurement and is widely used in image mosaicking, optical measurement of products, video stabilization, iterative reconstruction, and other fields. Current work on accelerating image matching follows two main directions: faster feature point detection and faster feature matching. The most representative method for fast feature point detection is the Oriented FAST and Rotated BRIEF (ORB) operator, which detects feature points by comparing pixel differences between points; the operation is simple and processing is fast. Some algorithms also speed up image matching by filtering feature points, while others reduce matching time by accelerating feature matching itself. Ye et al. designed a compact discriminative binary descriptor (CDbin) that obtains binary feature descriptors with a smaller number of training parameters. Existing image matching algorithms tend to focus on matching accuracy and, to some extent, neglect the resulting decrease in matching speed. To address this problem, an algorithm is proposed that outperforms most other binary matching algorithms in both matching performance and matching time.

Methods: The floating-point feature descriptors output by the feature description algorithm are converted into binary feature descriptors, which reduces the computational complexity of feature matching and therefore the matching time. This article draws on several classical methods. Gu et al. modified the deep convolutional neural network AlexNet and mapped the output descriptor element values to -1 or 1. Yang et al. used multi-bit binary descriptors to describe image patches, reducing the information loss caused by directly converting real-valued floating-point descriptors into binary descriptors. Soleimani et al. proposed a cyclic shift binary descriptor, which reduces the number of parameters used to compute descriptors and thus improves matching speed. This article is inspired by the feature representation deep neural network SFLHC, which binarizes each element separately using Sigmoid functions and piecewise threshold functions, and it combines this with the idea of channel attention to improve the binary feature description network. A channel attention and feature slicing description network (CAFSD) is designed and combined with the fast feature point detection algorithm ORB. On this basis, a fast image matching algorithm based on the channel attention mechanism and feature slicing is proposed, which significantly improves image matching speed while also improving the accuracy of the binary description. In addition, a quantization loss function, a uniform distribution loss function, and a correlation loss function are added to the triplet loss function to form a composite loss function for network training, further reducing the error introduced when converting floating-point descriptors to binary descriptors.

Results and Discussions: CAFSD, the core feature description network of the algorithm, is trained and tested on the UBC Phototour dataset, which includes three subsets: Liberty, NotreDame, and Yosemite. Usually, one subset is used to train the network, the other two are used to test it, and the average is taken as the final result. For testing the entire image matching algorithm (Fig. 1), 20 circuit board data samples are used. Commonly used evaluation indicators include FPR95, matching accuracy, matching score, and matching time. FPR95 is the indicator used on the UBC Phototour dataset to measure the quality of feature description algorithms. The other indicators are used for the overall testing of image matching algorithms, where matching time is the time taken from feature detection to the end of feature matching. The results are shown in Table 1. The loss function pairs L_T and L_Q, L_T and L_E, and L_T and L_C are each used in CAFSD training, with testing on the UBC Phototour dataset. The optimal coefficients for L_Q, L_E, and L_C are 1, 0.5, and 0.5, respectively. CAFSD is then combined with the ORB detection algorithm, and the results show that both the speed and the accuracy of image matching improve clearly.

Conclusions: This article proposes a fast image matching algorithm based on channel attention and feature slicing, with CAFSD at its core. The compact discriminative binary descriptor obtains binary feature descriptors with a smaller number of training parameters, but existing image matching algorithms tend to focus on matching accuracy and, to some extent, neglect the decrease in matching speed. CAFSD is proposed to address this problem. It is inspired by the SFLHC deep neural network, which binarizes each element using Sigmoid functions and piecewise threshold functions, and it combines the idea of channel attention to improve the binary feature description network. CAFSD is combined with the ORB detection algorithm, and the resulting fast image matching algorithm based on channel attention and feature slicing is presented and validated.
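The speed advantage of binary descriptors mentioned in the abstract comes from replacing floating-point distance computation with bitwise XOR and bit counting. The following is a minimal sketch of that general idea, not the CAFSD network or the authors' matching code: it assumes descriptors are binarized by a simple threshold at zero (the abstract describes a learned Sigmoid and piecewise threshold scheme instead), and the function names `binarize`, `hamming_matrix`, and `match_nearest` are hypothetical.

```python
import numpy as np

# Lookup table: number of set bits in each possible byte value.
_POPCOUNT = np.array([bin(i).count("1") for i in range(256)], dtype=np.uint8)


def binarize(desc: np.ndarray) -> np.ndarray:
    """Threshold real-valued descriptors at 0 and pack each row into bytes.

    Assumption for illustration only: a plain sign threshold, not the
    learned binarization described in the abstract.
    """
    bits = (np.asarray(desc) > 0).astype(np.uint8)  # shape (n, d), values 0/1
    return np.packbits(bits, axis=1)                # shape (n, ceil(d / 8))


def hamming_matrix(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Pairwise Hamming distances between two sets of packed descriptors."""
    xor = a[:, None, :] ^ b[None, :, :]             # differing bits per byte
    return _POPCOUNT[xor].sum(axis=2, dtype=np.int64)


def match_nearest(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """For each descriptor in a, index of its nearest neighbour in b."""
    return hamming_matrix(a, b).argmin(axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    floats = rng.standard_normal((5, 128))          # five 128-D descriptors
    packed = binarize(floats)                       # five 16-byte descriptors
    print(packed.shape, match_nearest(packed, packed))
```

A 128-dimensional floating-point descriptor becomes 16 bytes under this packing, and each comparison reduces to 16 XOR operations plus table lookups, which is the kind of saving the abstract refers to when it says binarization reduces matching complexity.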
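FPR95, the descriptor quality indicator named in the abstract, is commonly defined as the false positive rate at the distance threshold where 95% of truly matching pairs are accepted; lower is better. The sketch below follows that common definition and is not taken from the paper's evaluation code. The function name `fpr95` and the input layout (one distance and one match label per descriptor pair) are assumptions.

```python
import numpy as np


def fpr95(distances, labels) -> float:
    """False positive rate at 95% recall for descriptor pair distances.

    distances: one descriptor distance per pair (smaller means more similar).
    labels:    truthy where the pair is a true match, falsy otherwise.
    Assumes at least one matching and one non-matching pair are present.
    """
    distances = np.asarray(distances, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos = np.sort(distances[labels])      # distances of true matching pairs
    neg = distances[~labels]              # distances of non-matching pairs
    # Smallest threshold that accepts at least 95% of the matching pairs.
    k = int(np.ceil(0.95 * pos.size)) - 1
    threshold = pos[k]
    # Fraction of non-matching pairs wrongly accepted at that threshold.
    return float(np.mean(neg <= threshold))


if __name__ == "__main__":
    d = list(range(1, 21)) + [5, 19, 25, 30]   # 20 matches, 4 non-matches
    y = [1] * 20 + [0] * 4
    print(fpr95(d, y))
```

For binary descriptors the distances passed in would be Hamming distances; for floating-point descriptors they would typically be Euclidean distances.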
Keywords
image processing, image matching, attention, feature description