Exploring Sparse Visual Odometry Acceleration With High-Level Synthesis.

IEEE Access (2023)

Abstract
Visual Odometry (VO) systems are widely used to determine the position and orientation of a robot or camera in an unknown environment. They are deployed on resource-constrained platforms, such as drones and virtual reality or augmented reality headsets. VO systems harnessing modern Systems-on-Chip (SoCs) with integrated Field Programmable Gate Arrays (FPGAs) have the potential to improve overall performance. This paper explores the FPGA acceleration of sparse semi-direct VO kernels using High-Level Synthesis (HLS). The selected sparse Semi-direct VO (SVO) system was developed, from its conception, to execute efficiently on low-power processors. We show that both computational and data transfer overheads between the processing cores and the accelerators on the reconfigurable fabric need to be optimized to obtain better end-to-end performance. The additional data movement incurred when using an FPGA accelerator is due to the sparse computational nature of the kernels together with their random memory access patterns. This paper shows that state-of-the-art HLS tools are not yet able to perform the required optimizations automatically. These tools usually target successful application kernels with dense computational patterns and regular memory access. In this paper we propose three potentially general methods to reduce the data transfer between the processing cores and the customised hardware kernels on the FPGA: (a) approximation based on domain-specific knowledge, (b) lossless image compression, and (c) the use of on-the-fly computation. We present a case study of the use of these methods on SVO, a state-of-the-art sparse VO system with a semi-direct front-end. We demonstrate that our proposed methods can reduce data transfer overhead to achieve better end-to-end performance and that they can be applied not only when using standard Xilinx tools, but also with other state-of-the-art HLS tools, such as HeteroFlow.
Compared to the baseline performance of the original SVO software on Arm processors, our proposed methods enable the Xilinx SDSoC and HeteroFlow designs to achieve a speedup of 2.4x and 2.14x, respectively, without noticeable accuracy loss. The Xilinx SDSoC and HeteroFlow designs also achieve a 1.85x and 1.89x improvement in energy efficiency, respectively, on a Xilinx Zynq UltraScale+ SoC with Arm A53 cores and integrated FPGA. Compared to the SVO software baseline running on the Intel Xeon system, our proposed methods enable the Xilinx SDSoC and HeteroFlow designs to achieve 8.2x and 8.3x improvement in energy efficiency, respectively.
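The abstract does not describe the compression scheme the authors use, so the following is only a generic sketch of idea (b): shrinking an image losslessly before it crosses the processor-to-accelerator boundary. It assumes an 8-bit grayscale image stored as rows of bytes, a horizontal delta filter followed by `zlib` from the Python standard library, and a synthetic gradient image as input. The helper names (`delta_encode`, `compress_image`, and so on) are hypothetical and are not taken from SVO, SDSoC, or HeteroFlow.

```python
import zlib


def delta_encode(row: bytes) -> bytes:
    """Replace each pixel with its difference from the left neighbour (mod 256)."""
    prev = 0
    out = bytearray()
    for pixel in row:
        out.append((pixel - prev) & 0xFF)
        prev = pixel
    return bytes(out)


def delta_decode(row: bytes) -> bytes:
    """Invert delta_encode by accumulating the differences (mod 256)."""
    prev = 0
    out = bytearray()
    for diff in row:
        prev = (prev + diff) & 0xFF
        out.append(prev)
    return bytes(out)


def compress_image(rows: list) -> bytes:
    """Delta-filter every row, then entropy-code the result with zlib."""
    filtered = b"".join(delta_encode(r) for r in rows)
    return zlib.compress(filtered, 9)


def decompress_image(blob: bytes, width: int) -> list:
    """Undo compress_image: inflate, split into rows, undo the delta filter."""
    filtered = zlib.decompress(blob)
    return [
        delta_decode(filtered[i:i + width])
        for i in range(0, len(filtered), width)
    ]


# Synthetic 64x64 gradient standing in for a smooth camera image (assumption).
WIDTH, HEIGHT = 64, 64
image = [bytes((x + y) % 256 for x in range(WIDTH)) for y in range(HEIGHT)]

blob = compress_image(image)
restored = decompress_image(blob, WIDTH)
raw_size = WIDTH * HEIGHT

print(f"raw bytes: {raw_size}, compressed bytes: {len(blob)}")
print("lossless round trip:", restored == image)
```

The round trip reproduces every pixel exactly, which is what makes the scheme lossless; the saving in bytes transferred depends heavily on image content, and a hardware implementation would also have to account for the cost of decompression on the FPGA side.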
Keywords
FPGA,high-level synthesis,performance optimization,pose estimation,visual odometry,Zynq,SLAM