Biblio
Many lattice-based cryptosystems are based on the security of the Ring learning with errors (Ring-LWE) problem. The most critical and computationally intensive operation of these Ring-LWE based cryptosystems is polynomial multiplication. In this paper, we exploit the number theoretic transform to build a high-speed polynomial multiplier for the Ring-LWE based public key cryptosystems. We present a versatile pipelined polynomial multiplication architecture to calculate the product of two \$n\$-degree polynomials in about ((nlg n)/4 + n/2) clock cycles. In addition, we introduce several optimization techniques to reduce the required ROM storage. The experimental results on a Spartan-6 FPGA show that the proposed hardware architecture can achieve a speedup of on average 2.25 than the state of the art of high-speed design. Meanwhile, our design is able to save up to 47.06% memory blocks.