Decoupled Residual Quantization for Robust Semantic IDs in Recommendation
For practitioners building recommendation systems with semantic IDs, this work provides diagnostic tools and a new method, though the findings are incremental and based on a case study.
The paper introduces a quantitative framework to diagnose tokenizer failures in semantic ID generation for recommendation, and proposes Decoupled Residual Quantization (DRQ) to improve robustness. Experiments on an industrial dataset show that semantic ID quality involves multiple objectives, but results are limited to a single proprietary dataset.
Semantic IDs represent items as shared discrete token sequences and have become a practical tool for recommendation and retrieval. Yet it remains difficult to tell why a tokenizer fails: poor quality may come from codebook underutilization, unstable decision boundaries, or geometric distortion of the embedding space. This paper develops a quantitative framework for diagnosing these failures through expected codeword overlap and effective codebook capacity. The former measures expected codeword confusion under retrieval-time perturbation, while the latter converts that confusion into an effective number of usable, well-separated codes. The framework links semantic boundary confusion to both code usage imbalance and Euclidean geometric constraints. As a proof of concept, we present Decoupled Residual Quantization (DRQ), which separates continuous geometry reconstruction from discrete distribution matching. Experiments on a large-scale industrial dataset show that semantic ID quality is multi-objective: symbolic robustness, reconstruction fidelity, and behavior-aware soft matching each stress different aspects of a tokenizer. These downstream observations are based on one proprietary industrial dataset, so they should be read as a case study rather than a universal benchmark claim.
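The abstract does not give formulas for expected codeword overlap or effective codebook capacity, so the following is only an illustrative sketch built on stand-ins, not the paper's definitions and not DRQ itself. It assumes plain greedy residual quantization as the tokenizer, isotropic Gaussian noise as the retrieval-time perturbation, a Monte Carlo first-level token flip rate as a proxy for codeword confusion, and usage perplexity as a proxy for the number of codes in effective use. All function names (`residual_quantize`, `flip_rate`, `usage_perplexity`) and the `sigma` parameter are hypothetical.

```python
import numpy as np


def residual_quantize(x, codebooks):
    """Greedy residual quantization (assumed baseline tokenizer).

    x: (n, d) item embeddings; codebooks: list of (K, d) arrays, one per level.
    Returns integer token ids of shape (n, num_levels).
    """
    residual = x.copy()
    ids = []
    for cb in codebooks:
        # Squared Euclidean distance from each residual to each codeword: (n, K).
        d2 = ((residual[:, None, :] - cb[None, :, :]) ** 2).sum(axis=-1)
        idx = d2.argmin(axis=1)
        ids.append(idx)
        # Subtract the chosen codeword so the next level encodes what is left.
        residual = residual - cb[idx]
    return np.stack(ids, axis=1)


def flip_rate(x, codebooks, sigma, rng, n_trials=20):
    """Monte Carlo proxy for codeword confusion under perturbation.

    Adds Gaussian noise with standard deviation `sigma` (an assumed noise
    model) and reports how often the first-level token changes.
    """
    base = residual_quantize(x, codebooks)[:, 0]
    total = 0.0
    for _ in range(n_trials):
        noisy = x + sigma * rng.standard_normal(x.shape)
        total += (residual_quantize(noisy, codebooks)[:, 0] != base).mean()
    return total / n_trials


def usage_perplexity(ids, num_codes):
    """exp(entropy) of code usage: a proxy for how many codes are really used."""
    counts = np.bincount(ids, minlength=num_codes).astype(float)
    p = counts / counts.sum()
    nz = p[p > 0]  # skip unused codes to avoid log(0)
    return float(np.exp(-(nz * np.log(nz)).sum()))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((500, 16))
    codebooks = [rng.standard_normal((32, 16)) for _ in range(3)]
    ids = residual_quantize(x, codebooks)
    print("first-level usage perplexity:", usage_perplexity(ids[:, 0], 32))
    print("flip rate at sigma=0.1:", flip_rate(x, codebooks, 0.1, rng))
```

A perplexity well below the codebook size points to underutilization, while a high flip rate at small `sigma` points to items sitting near decision boundaries. The two proxies can move independently, which matches the abstract's point that tokenizer failures have more than one source.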