A Survey on Private Transformer Inference
The survey tackles privacy risks faced by users of transformer-based AI services, but its contribution is incremental, since it surveys existing methods.
This paper reviews Private Transformer Inference (PTI), which addresses privacy concerns in Machine Learning as a Service (MLaaS) by using cryptographic techniques such as Secure Multi-Party Computation (MPC) and Homomorphic Encryption (HE) to enable secure model inference without exposing data or models.
Transformer models have revolutionized AI, enabling applications like content generation and sentiment analysis. However, their use in Machine Learning as a Service (MLaaS) raises significant privacy concerns, as centralized servers process sensitive user data. Private Transformer Inference (PTI) addresses these issues using cryptographic techniques such as Secure Multi-Party Computation (MPC) and Homomorphic Encryption (HE), enabling secure model inference without exposing inputs or models. This paper reviews recent advancements in PTI, analyzing state-of-the-art solutions, their challenges, and potential improvements. We also propose evaluation guidelines to assess resource efficiency and privacy guarantees, aiming to bridge the gap between high-performance inference and data privacy.
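To make the MPC idea above concrete, here is a minimal sketch of two-party additive secret sharing applied to a single linear layer. It is an illustration under stated assumptions, not a protocol taken from the surveyed works: the ring size `MOD`, the helper names `share`, `reconstruct`, and `linear`, and the toy values are all hypothetical, and the weights are treated as known to both computing parties for simplicity. Practical PTI systems also protect the model weights and need dedicated protocols for non-linear operations such as softmax and GELU.

```python
import secrets

MOD = 2**32  # ring size; an illustrative choice, not prescribed by the survey


def share(x):
    """Split each integer in x into two additive shares modulo MOD."""
    s0 = [secrets.randbelow(MOD) for _ in x]       # uniformly random mask
    s1 = [(xi - r) % MOD for xi, r in zip(x, s0)]  # x = s0 + s1 (mod MOD)
    return s0, s1


def reconstruct(s0, s1):
    """Recombine two additive shares into the underlying values."""
    return [(a + b) % MOD for a, b in zip(s0, s1)]


def linear(weights, vec):
    """Matrix-vector product modulo MOD; each party runs this on its own share."""
    return [sum(w * v for w, v in zip(row, vec)) % MOD for row in weights]


# Hypothetical toy input and weights (assumed known to both parties here).
x = [3, 1, 4]
W = [[1, 2, 3], [0, 1, 0]]

x0, x1 = share(x)                      # client splits its private input
y0, y1 = linear(W, x0), linear(W, x1)  # each party computes locally on its share
y = reconstruct(y0, y1)                # only the recombined result reveals W @ x
```

Because a linear map distributes over addition, each party can work on its share without communication, and either share alone is a uniformly random vector that reveals nothing about `x`.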