LGAIMay 2

Quantization Undoes Alignment: Bias Emergence in Compressed LLMs Across Models and Precision Levels

arXiv:2605.1520857.4
Predicted impact top 44% in LG · last 90 daysOriginality Incremental advance
AI Analysis

For practitioners deploying compressed LLMs, this reveals that standard quality metrics are insufficient to detect bias emergence, necessitating explicit fairness testing before deployment.

Quantization of LLMs causes new stereotypical biases to emerge in 6-21% of previously unbiased items at 3-bit precision, with a dose-response pattern, while standard metrics like perplexity show minimal change. Aggregate metrics systematically miss fairness-critical degradation.

Large Language Models are routinely compressed via post-training quantization to reduce inference costs and memory footprint for cloud and edge deployment, yet the impact of this compression on model quality remains poorly understood. Existing studies typically compare only two conditions (full-precision vs. a single quantized variant), rely on aggregate bias metrics, and evaluate a single model family, making it impossible to distinguish gradual degradation from threshold-dependent safety failures. We conduct a controlled empirical study of three instruction-tuned models (Qwen2.5-7B, Mistral-7B, Phi-3.5-mini) at five precision levels (BF16 through 3-bit) on 12,148 BBQ bias benchmark items across 5 random seeds, totaling 911,100 inference records. Our results reveal that 3-bit quantization causes 6-21% of previously unbiased items to develop new stereotypical behaviors, following a clear dose-response pattern confirmed via logistic regression, while models' willingness to select "unknown" answers declines by 17.4%. Crucially, these item-level changes are invisible to standard quality metrics: perplexity increases by less than 0.5% at 8-bit and under 3% at 4-bit across all three models, yet 2.5-5.6% of items already develop new biases at 4-bit. These findings demonstrate that aggregate evaluation metrics systematically miss fairness-critical degradation, underscoring the need for quality-aware compression protocols that explicitly test for bias emergence before deployment.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes