Bandwidth Efficient Homomorphic Encrypted Matrix Vector Multiplication Accelerator on FPGA

2022 International Conference on Field-Programmable Technology (ICFPT) (2022)

Abstract
Homomorphic Encryption (HE) is a promising answer to growing privacy concerns in Machine Learning (ML) because it enables computation directly on encrypted data. However, it imposes significant overhead on the compute system and remains impractically slow. Prior works have proposed efficient FPGA implementations of basic HE primitives such as the number theoretic transform (NTT) and key switching. Composing these primitives to realize higher-level ML computation is still a challenge because of the large data transfer overhead. In this work, we propose an efficient FPGA implementation of HE Matrix Vector Multiplication $(\mathbf{M}\times \mathbf{V})$, a key kernel in HE-based Machine Learning applications. By analyzing the data reuse characteristics and the encryption overhead of HE $\mathbf{M}\times \mathbf{V}$, we show that simply applying the design principles of unencrypted $\mathbf{M}\times \mathbf{V}$ accelerators to HE $\mathbf{M}\times \mathbf{V}$ can lead to a significant amount of DRAM data transfers. We tackle the computation and data transfer challenges by proposing a bandwidth-efficient dataflow that is specifically optimized for HE $\mathbf{M}\times \mathbf{V}$. We identify highly reused data entities in HE $\mathbf{M}\times \mathbf{V}$ and use the on-chip SRAM efficiently to reduce DRAM data transfers. To speed up the computation of HE $\mathbf{M}\times \mathbf{V}$, we exploit three types of parallelism: partial sum parallelism, residual polynomial parallelism, and coefficient parallelism. Leveraging these innovations, we demonstrate the first FPGA accelerator for HE matrix vector multiplication. Evaluation on 7 HE $\mathbf{M}\times \mathbf{V}$ benchmarks shows that our FPGA accelerator is up to $3.8\times$ (GeoMean $2.8\times$) faster than a 64-thread CPU implementation.
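The abstract does not state which algorithm the accelerator uses for HE $\mathbf{M}\times \mathbf{V}$. As a rough illustration only, the sketch below shows the diagonal formulation that is commonly used for matrix-vector products over packed ciphertexts, written on plain integers with no encryption at all. The function names (`rotate`, `diagonal`, `matvec_diagonal`) are hypothetical, and treating each per-diagonal product as one "partial sum" is an assumption about what the abstract's partial sum parallelism could refer to, not a description of the paper's design.

```python
# Plaintext sketch of the diagonal method for a square matrix-vector product.
# Assumption: this is a generic illustration, not the paper's dataflow.
# In a real HE setting, `rotate` would be a ciphertext slot rotation and the
# element-wise product would be a plaintext-ciphertext multiplication.

def rotate(vec, k):
    """Cyclic left rotation by k slots (stand-in for an HE slot rotation)."""
    k %= len(vec)
    return vec[k:] + vec[:k]


def diagonal(matrix, d):
    """Return the d-th generalized diagonal: entries matrix[i][(i + d) % n]."""
    n = len(matrix)
    return [matrix[i][(i + d) % n] for i in range(n)]


def matvec_diagonal(matrix, vec):
    """Compute matrix @ vec as a sum of n rotate-and-multiply partial sums."""
    n = len(vec)
    acc = [0] * n
    for d in range(n):
        # Each iteration is independent of the others until the final
        # accumulation, so the partial sums could be computed in parallel.
        partial = [a * b for a, b in zip(diagonal(matrix, d), rotate(vec, d))]
        acc = [x + y for x, y in zip(acc, partial)]
    return acc


def matvec_reference(matrix, vec):
    """Ordinary row-by-row product, used only to check the sketch."""
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]


if __name__ == "__main__":
    m = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    v = [1, 0, 2]
    print(matvec_diagonal(m, v) == matvec_reference(m, v))
```

In this formulation the same input vector is rotated and reused once per diagonal, which gives a feel for why data reuse and on-chip buffering matter once each "element" is a large encrypted polynomial rather than a single integer.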
Keywords
FPGA acceleration, homomorphic encryption, matrix vector multiplication, parallel computing