IR CLFeb 24

HiSAC: Hierarchical Sparse Activation Compression for Ultra-long Sequence Modeling in Recommenders

Kun Yuan, Junyu Bi, Daixuan Cheng, Changfa Wu, Shuwen Xiao, Binbin Cao, Jian Wu, Yuning Jiang

arXiv:2602.21009v14.21 citationsh-index: 5

Originality Incremental advance

AI Analysis

This addresses production constraints in large-scale recommender systems by improving efficiency and accuracy, though it is incremental as it builds on existing interest center methods.

The paper tackles the challenge of modeling ultra-long user behavior sequences in recommender systems by proposing HiSAC, which uses hierarchical sparse activation to compress sequences and reduce latency, achieving a 1.65% CTR uplift in online A/B tests on Taobao.

Modern recommender systems leverage ultra-long user behavior sequences to capture dynamic preferences, but end-to-end modeling is infeasible in production due to latency and memory constraints. While summarizing history via interest centers offers a practical alternative, existing methods struggle to (1) identify user-specific centers at appropriate granularity and (2) accurately assign behaviors, leading to quantization errors and loss of long-tail preferences. To alleviate these issues, we propose Hierarchical Sparse Activation Compression (HiSAC), an efficient framework for personalized sequence modeling. HiSAC encodes interactions into multi-level semantic IDs and constructs a global hierarchical codebook. A hierarchical voting mechanism sparsely activates personalized interest-agents as fine-grained preference centers. Guided by these agents, Soft-Routing Attention aggregates historical signals in semantic space, weighting by similarity to minimize quantization error and retain long-tail behaviors. Deployed on Taobao's "Guess What You Like" homepage, HiSAC achieves significant compression and cost reduction, with online A/B tests showing a consistent 1.65% CTR uplift -- demonstrating its scalability and real-world effectiveness.

View on arXiv PDF

Similar