A 40-nm 91-mW, 90-fps Learning-Based Full HD Super-Resolution Accelerator

IEEE Journal of Solid-State Circuits (2023)

Abstract
Super-resolution is used in many applications to provide a better visual experience. To meet high-throughput and low-power needs, several dedicated super-resolution accelerators have been proposed. Neural-network (NN)-based super-resolution accelerators achieve impressive restoration performance, but their high computational complexity does not allow a high throughput for video streaming. This work presents a super-resolution accelerator that implements the rapid and accurate image super-resolution (RAISR) algorithm for reconstructing super-resolution images. The proposed memory scheduling scheme increases the utilization of the low-resolution (LR) upscaler by 50%. Kernel compression reduces the overall on-chip memory by 72%. A patch reuse scheme achieves a 91% reduction in external memory access times compared to the direct-mapped design. The architecture is flexible enough to reconstruct full HD images with a variety of upscaling factors ($2\times$, $3\times$, $4\times$). Fabricated in a 40-nm CMOS technology, the proposed super-resolution accelerator integrates 3.11-M gates in a core area of 3.33 mm$^2$. The chip delivers a throughput of 90 frames/s (fps) for all supported upscaling factors and dissipates 91 mW at 200 MHz. Compared with state-of-the-art designs, this work achieves a 5.4-to-$28.4\times$ higher normalized throughput with 5.1-to-$36\times$ lower normalized energy dissipation.
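As a rough sanity check on the headline figures, the sketch below derives the output pixel rate, pixels per clock cycle, and energy per frame and per pixel from the numbers quoted in the abstract. It assumes that "full HD" means a 1920x1080 output frame, which the abstract does not state explicitly. The function name `derived_metrics` is purely illustrative, and the results are raw values, not the paper's normalized throughput or energy metrics.

```python
# Back-of-the-envelope figures derived from the abstract's headline numbers.
# Assumption (not stated in the abstract): "full HD" is a 1920x1080 output frame.
WIDTH, HEIGHT = 1920, 1080   # assumed output resolution in pixels
FPS = 90                     # frames per second, from the abstract
CLOCK_HZ = 200e6             # 200 MHz clock, from the abstract
POWER_W = 91e-3              # 91 mW power dissipation, from the abstract


def derived_metrics(width=WIDTH, height=HEIGHT, fps=FPS,
                    clock_hz=CLOCK_HZ, power_w=POWER_W):
    """Return raw (un-normalized) throughput and energy figures."""
    pixels_per_second = width * height * fps
    return {
        "pixels_per_second": pixels_per_second,
        # Average number of output pixels produced per clock cycle.
        "pixels_per_cycle": pixels_per_second / clock_hz,
        # Energy spent per output frame, in millijoules.
        "energy_per_frame_mj": power_w / fps * 1e3,
        # Energy spent per output pixel, in nanojoules.
        "energy_per_pixel_nj": power_w / pixels_per_second * 1e9,
    }


if __name__ == "__main__":
    for name, value in derived_metrics().items():
        print(f"{name}: {value:.4g}")
```

Under that assumption the chip sustains just under one output pixel per cycle, at roughly 1 mJ per frame.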
Keywords
CMOS integrated circuits, energy-efficient architecture, hardware accelerator, machine learning, super-resolution