5 Papers

CLFeb 11Code
How Do Decoder-Only LLMs Perceive Users? Rethinking Attention Masking for User Representation Learning

Jiahao Yuan, Yike Xu, Jinyong Wen et al.

Decoder-only large language models are increasingly used as behavioral encoders for user representation learning, yet the impact of attention masking on the quality of user embeddings remains underexplored. In this work, we conduct a systematic study of causal, hybrid, and bidirectional attention masks within a unified contrastive learning framework trained on large-scale real-world Alipay data that integrates long-horizon heterogeneous user behaviors. To improve training dynamics when transitioning from causal to bidirectional attention, we propose Gradient-Guided Soft Masking, a gradient-based pre-warmup applied before a linear scheduler that gradually opens future attention during optimization. Evaluated on 9 industrial user cognition benchmarks covering prediction, preference, and marketing sensitivity tasks, our approach consistently yields more stable training and higher-quality bidirectional representations compared with causal, hybrid, and scheduler-only baselines, while remaining compatible with decoder pretraining. Overall, our findings highlight the importance of masking design and training transition in adapting decoder-only LLMs for effective user representation learning. Our code is available at https://github.com/JhCircle/Deepfind-GGSM.

SYApr 15
Exploiting Scheduling Flexibility via State-Based Scheduling When Guaranteeing Worst-Case Services

Yike Xu, Mark S. Andersland

Even when providing long-run, worst-case guarantees to competing flows of unit-sized tasks, a slot-timed, constant-capacity server's scheduler may retain significant, short-run, scheduling flexibility. Existing worst-case scheduling frameworks offer only limited opportunities to characterize and exploit this flexibility. We introduce a state-based framework that overcomes these limitations. Each flow's guarantee is modeled as a worst-case service that can be updated as tasks arrive and are served. Taking all flows' worst-case services as a collective state, a state-based scheduler ensures, from slot to slot, transitions between schedulable states. This constrains its scheduling flexibility to a polytope consisting of all feasible schedules that preserve schedulability. We fully characterize this polytope, enabling scheduling flexibility to be fully exploited. But, as our framework is general, full exploitation is computationally complex. To reduce complexity, we show: that when feasible schedules exist, at least one can be efficiently identified by simply maximizing the server's capacity slack; that a special class of worst-case services, min-plus services, can be efficiently specified and updated using the min-plus algebra; and that efficiency can be further improved by restricting attention to a min-plus service subclass, dual-curve services. This last specialization turns out to be a dynamic extension of service curves that approaches near practical viability while maintaining all features essential to our framework.

CLFeb 16Code
Query as Anchor: Scenario-Adaptive User Representation via Large Language Model

Jiahao Yuan, Yike Xu, Jinyong Wen et al.

Industrial-scale user representation learning requires balancing robust universality with acute task-sensitivity. However, existing paradigms primarily yield static, task-agnostic embeddings that struggle to reconcile the divergent requirements of downstream scenarios within unified vector spaces. Furthermore, heterogeneous multi-source data introduces inherent noise and modality conflicts, degrading representation. We propose Query-as-Anchor, a framework shifting user modeling from static encoding to dynamic, query-aware synthesis. To empower Large Language Models (LLMs) with deep user understanding, we first construct UserU, an industrial-scale pre-training dataset that aligns multi-modal behavioral sequences with user understanding semantics, and our Q-Anchor Embedding architecture integrates hierarchical coarse-to-fine encoders into dual-tower LLMs via joint contrastive-autoregressive optimization for query-aware user representation. To bridge the gap between general pre-training and specialized business logic, we further introduce Cluster-based Soft Prompt Tuning to enforce discriminative latent structures, effectively aligning model attention with scenario-specific modalities. For deployment, anchoring queries at sequence termini enables KV-cache-accelerated inference with negligible incremental latency. Evaluations on 10 Alipay industrial benchmarks show consistent SOTA performance, strong scalability, and efficient deployment. Large-scale online A/B testing in Alipay's production system across two real-world scenarios further validates its practical effectiveness. Our code is prepared for public release and will be available at: https://github.com/JhCircle/Q-Anchor.

LGDec 17, 2024
Transferable and Forecastable User Targeting Foundation Model

Bin Dou, Baokun Wang, Yun Zhu et al.

User targeting, the process of selecting targeted users from a pool of candidates for non-expert marketers, has garnered substantial attention with the advancements in digital marketing. However, existing user targeting methods encounter two significant challenges: (i) Poor cross-domain and cross-scenario transferability and generalization, and (ii) Insufficient forecastability in real-world applications. These limitations hinder their applicability across diverse industrial scenarios. In this work, we propose FOUND, an industrial-grade, transferable, and forecastable user targeting foundation model. To enhance cross-domain transferability, our framework integrates heterogeneous multi-scenario user data, aligning them with one-sentence targeting demand inputs through contrastive pre-training. For improved forecastability, the text description of each user is derived based on anticipated future behaviors, while user representations are constructed from historical information. Experimental results demonstrate that our approach significantly outperforms existing baselines in cross-domain, real-world user targeting scenarios, showcasing the superior capabilities of FOUND. Moreover, our method has been successfully deployed on the Alipay platform and is widely utilized across various scenarios.

LGOct 13, 2025
Instruction-aware User Embedding via Synergistic Language and Representation Modeling

Ziyi Gao, Yike Xu, Jiahao Yuan et al.

User representation modeling has become increasingly crucial for personalized applications, yet existing approaches struggle with generalizability across domains and sensitivity to noisy behavioral signals. We present InstructUE, an instruction-aware user embedding foundation model that leverages large language models (LLMs) to generate general and instruction-aware user representations. InstructUE introduces a multi-encoder architecture with a lightweight adapter that efficiently processes heterogeneous data from six different sources while preserving their structural characteristics. Additionally, it proposes a novel contrastive-autoregressive training framework that bridges language and representation spaces through a curated UserQA dataset. The contrastive-autoregressive training framework simultaneously leverages autoregressive learning to capture domain knowledge in language space and contrastive learning to align user-text embeddings in representation space, thereby enhancing the instruction-awareness and noise-robustness of user embeddings. Through extensive experiments on real-world applications, we demonstrate that InstructUE significantly outperforms existing methods across multiple domains including user prediction, marketing, and recommendation scenarios. Our results show that instruction-aware user modeling can effectively achieve instruction-guided denoising of user information in specific scenarios, paving the way for more generalizable and robust user representation learning.