GRAPHIC: Gather and Process Harmoniously in the Cache With High Parallelism and Flexibility

IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTING(2024)

Cited 0|Views22
No score
Abstract
In-memory computing (IMC) has been proposed to overcome the von Neumann bottleneck in data-intensive applications. However, existing IMC solutions could not achieve both high parallelism and high flexibility, which limits their application in more general scenarios: As a highly parallel IMC design, the functionality of a MAC crossbar is limited to the matrix-vector multiplication; Another IMC method of logic-in-memory (LiM) is more flexible in supporting different logic functions, but has low parallelism. To improve the LiM parallelism, we are inspired by investigating how the single-instruction, multiple-data (SIMD) instruction set in conventional CPU could potentially help to expand the number of LiM operands in one cycle. The biggest challenge is the inefficiency in handling non-continuous data in parallel due to the SIMD limitation of (i) continuous address, (ii) limited cache bandwidth, and (iii) large full-resolution parallel computing overheads. This article presents GRAPHIC, the first reported in-memory SIMD architecture that solves the parallelism and irregular data access challenges in applying SIMD to LiM. GRAPHIC exploits content-addressable memory (CAM) and row-wise-accessible SRAM. By providing the in-situ, full-parallelism, and low-overhead operations of address search, cache read-compute-and-update, GRAPHIC accomplishes high-efficiency gather and aggregation with high parallelism, high energy efficiency, low latency, and low area overheads. Experiments in both continuous data access and irregular data pattern applications show an average speedup of 5x over iso-area AVX-like LiM, and 3-5x over the emerging CAM-based accelerators of CAPE and GaaS-X in advanced techniques.
More
Translated text
Key words
Associative memory,single instruction multiple data,FAST SRAM,in-memory computing,logic in memory
AI Read Science
Must-Reading Tree
Example
Generate MRT to find the research sequence of this paper
Chat Paper
Summary is being generated by the instructions you defined