LGNov 7, 2024

Hardware and Software Platform Inference

DeepMind
arXiv:2411.05197v26 citationsh-index: 26Has CodeICML
Originality Incremental advance
AI Analysis

This addresses a verification problem for businesses buying LLM inference services to prevent being overcharged for inferior hardware, though it is incremental as it builds on existing hardware fingerprinting concepts.

The paper tackles the problem of verifying the hardware and software platform used for large language model inference in black-box settings, where providers might misrepresent their services. The result is a method called HSPI that achieves up to 100% accuracy in white-box settings and up to 3x better than random in black-box settings for identifying GPU types and software stacks.

It is now a common business practice to buy access to large language model (LLM) inference rather than self-host, because of significant upfront hardware infrastructure and energy costs. However, as a buyer, there is no mechanism to verify the authenticity of the advertised service including the serving hardware platform, e.g. that it is actually being served using an NVIDIA H100. Furthermore, there are reports suggesting that model providers may deliver models that differ slightly from the advertised ones, often to make them run on less expensive hardware. That way, a client pays premium for a capable model access on more expensive hardware, yet ends up being served by a (potentially less capable) cheaper model on cheaper hardware. In this paper we introduce hardware and software platform inference (HSPI) -- a method for identifying the underlying GPU architecture and software stack of a (black-box) machine learning model solely based on its input-output behavior. Our method leverages the inherent differences of various GPU architectures and compilers to distinguish between different GPU types and software stacks. By analyzing the numerical patterns in the model's outputs, we propose a classification framework capable of accurately identifying the GPU used for model inference as well as the underlying software configuration. Our findings demonstrate the feasibility of inferring GPU type from black-box models. We evaluate HSPI against models served on different real hardware and find that in a white-box setting we can distinguish between different GPUs with between $83.9\%$ and $100\%$ accuracy. Even in a black-box setting we achieve results that are up to 3x higher than random guess accuracy. Our code is available at https://github.com/ChengZhang-98/HSPI.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes