Fast Neon-Based Multiplication For Lattice-Based Nist Post-Quantum Cryptography Finalists

POST-QUANTUM CRYPTOGRAPHY, PQCRYPTO 2021(2021)

引用 4|浏览2
暂无评分
摘要
This paper focuses on high-speed NEON-based constant-time implementations of multiplication of large polynomials in the NIST PQC KEM Finalists: NTRU, Saber, and CRYSTALS-Kyber. We use the Number Theoretic Transform (NTT)-based multiplication in Kyber, the Toom-Cook algorithm in NTRU, and both types of multiplication in Saber. Following these algorithms and using Apple M1, we improve the decapsulation performance of the NTRU, Kyber, and Saber-based KEMs at the security level 3 by the factors of 8.4, 3.0, and 1.6, respectively, compared to the reference implementations. On Cortex-A72, we achieve the speed-ups by factors varying between 5.7 and 7.5x for the Forward/Inverse NTT in Kyber, and between 6.0 and 7.8x for ToomCook in NTRU, over the best existing implementations in pure C. For Saber, when using NEON instructions on Cortex-A72, the implementation based on NTT outperforms the implementation based on the ToomCook algorithm by 14% in the case of the MatrixVectorMul function but is slower by 21% in the case of the InnerProduct function. Taking into account that in Saber, keys are not available in the NTT domain, the overall performance of the NTT-based version is very close to the performance of the Toom-Cook version. The differences for the entire decapsulation at the three major security levels (1, 3, and 5) are -4, -2, and +2%, respectively. Our benchmarking results demonstrate that our NEON-based implementations run on an Apple M1 ARM processor are comparable to those obtained using the best AVX2-based implementations run on an AMD EPYC 7742 processor. Our work is the first NEON-based ARMv8 implementation of each of the three NIST PQC KEM finalists.
更多
查看译文
关键词
ARMv8, NEON, Karatsuba, Toom-Cook, Number theoretic transform, NTRU, Saber, Kyber, Latice, Post-quantum cryptography
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要