ReApprox-PIM: Reconfigurable Approximate Look-Up-Table (LUT)-Based Processing-in-Memory (PIM) Machine Learning Accelerator

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (2024)

Abstract
Convolutional neural networks (CNNs) have achieved significant success in various applications. Numerous hardware accelerators have been introduced to accelerate CNN execution with improved energy efficiency compared to traditional software implementations. Despite this success, deploying traditional hardware accelerators for bulky CNNs on current and emerging smart devices is impeded by limited resources, including memory, power, area, and computational capability. Recent works have introduced processing-in-memory (PIM), a non-von-Neumann architecture that is a promising approach to tackling the problem of data movement between logic and memory blocks. However, as observed in the literature, existing PIM architectures cannot accommodate all computational operations due to limited programmability and flexibility. Furthermore, the capabilities of PIM are constrained by the limited available on-chip memory. To enable faster computation and address the limited on-chip memory, this work introduces a novel reconfigurable approximate-computing-based PIM, termed ReApprox-PIM. The proposed ReApprox-PIM addresses these two challenges as follows: (i) it utilizes a programmable look-up-table (LUT)-based processing architecture that can support different approximate computing techniques via programmability, and (ii) it achieves resource-efficient, fast CNN computation by implementing highly optimized approximate computing techniques. This results in a smaller computing footprint, greater operational parallelism, and reduced computational latency and power consumption compared to prior PIMs that rely on exact computation for CNN inference acceleration, at a minimal sacrifice in accuracy. We have evaluated the proposed ReApprox-PIM on various CNN architectures for inference, including standard LeNet, AlexNet, and ResNet-18, -34, and -50.
Our experimental results show that the ReApprox-PIM achieves a speedup of 1.63× with 1.66× lower area for the processing components compared to existing PIM architectures. Furthermore, the proposed ReApprox-PIM achieves 2.5× higher energy efficiency and 1.3× higher throughput compared to state-of-the-art LUT-based PIM architectures.
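The abstract does not detail the authors' LUT scheme, but the general idea behind LUT-based approximate multiplication can be illustrated with a minimal sketch: truncate each operand to its most significant bits, look up the product of the truncated values in a small precomputed table, and shift the result back. The `bits`, `width`, and function names below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def build_lut(bits=4):
    # Precompute products of every pair of `bits`-bit operands.
    # For bits=4 this is only a 16x16 table (256 entries), which is
    # what makes storing it inside a memory array attractive for PIM.
    n = 1 << bits
    a = np.arange(n)
    return np.outer(a, a)  # lut[x, y] = x * y

def approx_mul(x, y, lut, bits=4, width=8):
    # Approximate a `width`-bit multiply: keep only the top `bits` bits
    # of each operand, look up their product, and rescale by shifting.
    # Accuracy is traded for a much smaller table and faster lookup.
    shift = width - bits
    return int(lut[x >> shift, y >> shift]) << (2 * shift)

lut = build_lut()
exact = 200 * 100             # 20000
approx = approx_mul(200, 100, lut)  # (12 * 6) << 8 = 18432, ~8% error
```

A CNN multiply-accumulate loop would replace each exact product with such a lookup; reprogramming the table contents (e.g., with a differently rounded or error-compensated product table) is what gives a programmable-LUT architecture its ability to swap approximation schemes without changing the datapath.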
Keywords
Processing-in-Memory, Approximate Computing, Convolutional Neural Network, Look-up-Table