Visible to the public Biblio

Filters: Keyword is polynomial multiplication  [Clear All Filters]
2022-02-07
Qin, Zhenhui, Tong, Rui, Wu, Xingjun, Bai, Guoqiang, Wu, Liji, Su, Linlin.  2021.  A Compact Full Hardware Implementation of PQC Algorithm NTRU. 2021 International Conference on Communications, Information System and Computer Engineering (CISCE). :792–797.
With the emergence and development of quantum computers, the traditional public-key cryptography (PKC) is facing the risk of being cracked. In order to resist quantum attacks and ensure long-term communication security, NIST launched a global collection of Post Quantum Cryptography (PQC) standards in 2016, and it is currently in the third round of selection. There are three Lattice-based PKC algorithms that stand out, and NTRU is one of them. In this article, we proposed the first complete and compact full hardware implementation of NTRU algorithm submitted in the third round. By using one structure to complete the design of the three types of complex polynomial multiplications in the algorithm, we achieved better performance while reducing area costs.
2021-03-15
Khuchit, U., Wu, L., Zhang, X., Yin, Y., Batsukh, A., Mongolyn, B., Chinbat, M..  2020.  Hardware Design of Polynomial Multiplication for Byte-Level Ring-LWE Based Cryptosystem. 2020 IEEE 14th International Conference on Anti-counterfeiting, Security, and Identification (ASID). :86–89.
An ideal lattice is defined over a ring learning with errors (Ring-LWE) problem. Polynomial multiplication over the ring is the most computational and time-consuming block in lattice-based cryptography. This paper presents the first hardware design of the polynomial multiplication for LAC, one of the Round-2 candidates of the NIST PQC Standardization Process, which has byte-level modulus p=251. The proposed architecture supports polynomial multiplications for different degree n (n=512/1024/2048). For designing the scheme, we used the Vivado HLS compiler, a high-level synthesis based hardware design methodology, which is able to optimize software algorithms into actual hardware products. The design of the scheme takes 274/280/291 FFs and 204/217/208 LUTs on the Xilinx Artix-7 family FPGA, requested by NIST PQC competition for hardware implementation. Multiplication core uses only 1/1/2 pieces of 18Kb BRAMs, 1/1/1 DSPs, and 90/94/95 slices on the board. Our timing result achieved in an alternative degree n with 5.052/4.3985/5.133ns.
2017-07-24
Du, Chaohui, Bai, Guoqiang, Wu, Xingjun.  2016.  High-Speed Polynomial Multiplier Architecture for Ring-LWE Based Public Key Cryptosystems. Proceedings of the 26th Edition on Great Lakes Symposium on VLSI. :9–14.

Many lattice-based cryptosystems are based on the security of the Ring learning with errors (Ring-LWE) problem. The most critical and computationally intensive operation of these Ring-LWE based cryptosystems is polynomial multiplication. In this paper, we exploit the number theoretic transform to build a high-speed polynomial multiplier for the Ring-LWE based public key cryptosystems. We present a versatile pipelined polynomial multiplication architecture to calculate the product of two \$n\$-degree polynomials in about ((nlg n)/4 + n/2) clock cycles. In addition, we introduce several optimization techniques to reduce the required ROM storage. The experimental results on a Spartan-6 FPGA show that the proposed hardware architecture can achieve a speedup of on average 2.25 than the state of the art of high-speed design. Meanwhile, our design is able to save up to 47.06% memory blocks.