CR · AR · Apr 6

GPU Acceleration of TFHE-Based High-Precision Nonlinear Layers for Encrypted LLM Inference

arXiv:2604.04783
AI Analysis

This work addresses privacy concerns in cloud-based LLM services by making encrypted inference more practical. The contribution is incremental: it builds on existing TFHE methods and adds GPU optimizations.

The paper tackles efficient, high-precision evaluation of nonlinear functions for encrypted LLM inference under Fully Homomorphic Encryption (FHE), specifically addressing limitations of TFHE-based methods. It proposes TIGER, a GPU-accelerated framework that achieves speedups of 7.17×, 16.68×, and 17.05× over a CPU baseline for the GELU, Softmax, and LayerNorm layers, respectively.

Deploying large language models (LLMs) as cloud services raises privacy concerns because inference may leak sensitive data. Fully Homomorphic Encryption (FHE) allows computation on encrypted data, but current FHE methods struggle with efficient and precise nonlinear function evaluation. Specifically, CKKS-based approaches require high-degree polynomial approximations, which become costly as the target precision increases. Alternatively, TFHE's Programmable Bootstrapping (PBS) outperforms CKKS by offering exact lookup-table evaluation. However, it lacks high-precision implementations of LLM nonlinear layers and underutilizes GPU resources. We propose TIGER, the first GPU-accelerated framework for high-precision TFHE-based nonlinear LLM layer evaluation. TIGER offers: (1) a GPU-optimized WoP-PBS method combined with numerical algorithms to surpass native lookup-table precision limits on nonlinear functions; (2) high-precision and efficient implementations of key nonlinear layers, enabling practical encrypted inference; (3) a batch-driven design exploiting inter-input parallelism to boost GPU efficiency. TIGER achieves 7.17×, 16.68×, and 17.05× speedups over a CPU baseline for GELU, Softmax, and LayerNorm, respectively.
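
The "native lookup-table precision limits" mentioned in the abstract can be illustrated with a plaintext-only sketch. It involves no encryption and is not TIGER's method. It assumes that a lookup table is indexed by a fixed number of message bits, and it measures how the worst-case GELU error shrinks as that bit width grows. The input range [-8, 8] and all function names are hypothetical choices made for illustration.

```python
import math


def gelu(x: float) -> float:
    """Exact GELU, x * Phi(x), computed with the error function."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def build_lut(bits: int, lo: float, hi: float) -> list[float]:
    """Tabulate GELU at 2**bits evenly spaced points on [lo, hi]."""
    n = 1 << bits
    step = (hi - lo) / (n - 1)
    return [gelu(lo + i * step) for i in range(n)]


def lut_eval(table: list[float], x: float, lo: float, hi: float) -> float:
    """Quantize x to the nearest table index and return the stored value."""
    n = len(table)
    step = (hi - lo) / (n - 1)
    idx = round((x - lo) / step)
    idx = max(0, min(n - 1, idx))  # clamp inputs that fall outside [lo, hi]
    return table[idx]


def max_lut_error(bits: int, lo: float = -8.0, hi: float = 8.0,
                  samples: int = 10001) -> float:
    """Worst absolute error of the table against exact GELU on a dense grid."""
    table = build_lut(bits, lo, hi)
    worst = 0.0
    for k in range(samples):
        x = lo + (hi - lo) * k / (samples - 1)
        worst = max(worst, abs(lut_eval(table, x, lo, hi) - gelu(x)))
    return worst


if __name__ == "__main__":
    # Illustrative bit widths only; these are not parameters from the paper.
    for bits in (4, 8, 12):
        print(bits, max_lut_error(bits))
```

Each lookup is exact at the tabulated points, so the error comes entirely from quantizing the input. That is why a small message space caps the achievable precision regardless of how accurately the table entries themselves are stored.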

