N. Zhang and Franz Franchetti (Proc. International Symposium on Code Generation and Optimization (CGO), 2025)
Code Generation for Cryptographic Kernels using Multi-word Modular Arithmetic on GPU

Fully homomorphic encryption (FHE) and zero-knowledge proofs (ZKPs) are emerging as solutions for data security in distributed environments. However, the widespread adoption of these encryption techniques is hindered by their significant computational overhead, primarily resulting from core cryptographic operations that involve large integer arithmetic. This paper presents a formalization of multi-word modular arithmetic (MoMA), which breaks down large-bitwidth integer arithmetic into operations on machine words. We further develop a rewrite system that implements MoMA through recursive rewriting of data types, designed for compatibility with compiler infrastructures and code generators. We evaluate MoMA by generating cryptographic kernels, including basic linear algebra subprogram (BLAS) operations and the number theoretic transform (NTT), targeting various GPUs. Our MoMA-based BLAS operations outperform state-of-the-art multi-precision libraries by orders of magnitude, and MoMA-based NTTs achieve near-ASIC performance on commodity GPUs.
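To illustrate what "breaking large integer arithmetic into operations on machine words" can look like, here is a minimal sketch of double-word modular addition. It is not the paper's rewrite system or its generated code. It assumes a 64-bit machine word and represents each 128-bit value as a `(hi, lo)` pair of words. All names (`add_word`, `sub_word`, `dw_addmod`, `to_dw`, `from_dw`) are hypothetical. Python integers stand in for hardware words, with explicit masking to mimic wraparound.

```python
W = 64                  # assumed machine word width in bits
MASK = (1 << W) - 1     # keeps a value within one word


def to_dw(n):
    """Split a non-negative integer below 2**128 into (hi, lo) words."""
    return (n >> W) & MASK, n & MASK


def from_dw(x):
    """Join a (hi, lo) pair back into one integer."""
    return (x[0] << W) | x[1]


def add_word(a, b, carry_in=0):
    """Single-word add with carry; returns (sum_word, carry_out)."""
    s = a + b + carry_in
    return s & MASK, s >> W


def sub_word(a, b, borrow_in=0):
    """Single-word subtract with borrow; returns (diff_word, borrow_out)."""
    d = a - b - borrow_in
    return d & MASK, 1 if d < 0 else 0


def dw_add(x, y):
    """Double-word add built from two single-word adds."""
    lo, c = add_word(x[1], y[1])
    hi, c = add_word(x[0], y[0], c)
    return c, (hi, lo)


def dw_sub(x, y):
    """Double-word subtract built from two single-word subtracts."""
    lo, b = sub_word(x[1], y[1])
    hi, b = sub_word(x[0], y[0], b)
    return b, (hi, lo)


def dw_geq(x, y):
    """Compare two double-word values word by word (x >= y)."""
    return x[0] > y[0] or (x[0] == y[0] and x[1] >= y[1])


def dw_addmod(x, y, q):
    """(x + y) mod q, assuming x < q and y < q, with q below 2**128."""
    carry, s = dw_add(x, y)
    # The true sum is below 2q, so at most one subtraction of q is needed.
    if carry or dw_geq(s, q):
        _, s = dw_sub(s, q)
    return s
```

The same splitting could in principle be applied again to each half to reach wider operands, which is the intuition behind recursing on data types. This sketch shows only one level and only addition.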

Keywords:
Code generator, Number theoretic transforms, Cryptography, Multi-word arithmetic, Rewrite system, Modular, BLAS