Ara: A 1 GHz+ Scalable and Energy-Efficient RISC-V Vector Processor with Multi-Precision Floating Point Support in 22 nm FD-SOI.

arXiv: Hardware Architecture (2019)

Abstract
In this paper, we present Ara, a 64-bit vector processor based on the version 0.5 draft of RISC-V's vector extension, implemented in GlobalFoundries 22FDX FD-SOI technology. Ara's microarchitecture is scalable, as it is composed of a set of identical lanes, each containing part of the processor's vector register file and functional units. It achieves up to 97% FPU utilization when running a 256 × 256 double-precision matrix multiplication on sixteen lanes. Ara runs at 1.2 GHz in the typical corner (TT/0.80 V/25 °C), achieving a performance of up to 34 DP-GFLOPS. In terms of energy efficiency, Ara achieves up to 67 DP-GFLOPS/W under the same conditions, which is 56% higher than similar vector processors found in the literature. An analysis of several vectorizable linear algebra computation kernels for a range of different matrix and vector sizes gives insight into performance limitations and bottlenecks for vector processors and outlines directions to maintain high energy efficiency even for small matrix sizes where the vector architecture achieves suboptimal utilization of the available FPUs.
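
To put the abstract's headline figures in perspective, the following is a minimal back-of-the-envelope sketch in Python. It uses only the numbers quoted above (34 DP-GFLOPS, 67 DP-GFLOPS/W, the 56% margin, and the 256 × 256 matrix size). The function names are hypothetical, and the derived values are rough estimates: they assume that the "up to" performance and efficiency figures refer to the same operating point, and that a dense n × n matrix multiplication costs 2n³ floating-point operations. Neither assumption is confirmed by the abstract.

```python
def matmul_flops(n: int) -> int:
    """FLOPs for a dense n x n matrix multiplication (assumed 2 * n^3:
    one multiply and one add per inner-loop step)."""
    return 2 * n ** 3


def implied_power_watts(gflops: float, gflops_per_watt: float) -> float:
    """Power implied by a throughput and an efficiency figure,
    assuming both describe the same operating point."""
    return gflops / gflops_per_watt


def implied_baseline_efficiency(efficiency: float, relative_gain: float) -> float:
    """Efficiency of the comparison design if `efficiency` is
    `relative_gain` (e.g. 0.56 for 56%) higher than it."""
    return efficiency / (1.0 + relative_gain)


# Figures quoted in the abstract.
PEAK_GFLOPS = 34.0   # DP-GFLOPS
EFFICIENCY = 67.0    # DP-GFLOPS/W
GAIN = 0.56          # "56% higher"

flops = matmul_flops(256)
runtime_s = flops / (PEAK_GFLOPS * 1e9)   # lower bound at the quoted peak rate
power_w = implied_power_watts(PEAK_GFLOPS, EFFICIENCY)
baseline = implied_baseline_efficiency(EFFICIENCY, GAIN)

print(f"FLOPs for 256 x 256 matmul: {flops}")
print(f"Runtime at quoted peak:     {runtime_s * 1e3:.3f} ms")
print(f"Implied power:              {power_w:.3f} W")
print(f"Implied baseline:           {baseline:.1f} DP-GFLOPS/W")
```

Under these assumptions, the quoted figures imply a power draw of roughly half a watt and a comparison point of roughly 43 DP-GFLOPS/W. The runtime is only a lower bound, since it treats the quoted peak rate as sustained for the whole kernel.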
Keywords
Vector processors, Parallel processing, Instruction sets, Registers, Multicore processing, Open source software