Ziyun Xu

4papers

44citations

Novelty45%

AI Score45

Ranked #65,685 of 205,806 authors (top 32%)#3,907 in AI (top 27%)

4 Papers

CLApr 1, 2022

Making Pre-trained Language Models End-to-end Few-shot Learners with Contrastive Prompt Tuning

Ziyun Xu, Chengyu Wang, Minghui Qiu et al. · cmu

Pre-trained Language Models (PLMs) have achieved remarkable performance for various language understanding tasks in IR systems, which require the fine-tuning process based on labeled training data. For low-resource scenarios, prompt-based learning for PLMs exploits prompts as task guidance and turns downstream tasks into masked language problems for effective few-shot fine-tuning. In most existing approaches, the high performance of prompt-based learning heavily relies on handcrafted prompts and verbalizers, which may limit the application of such approaches in real-world scenarios. To solve this issue, we present CP-Tuning, the first end-to-end Contrastive Prompt Tuning framework for fine-tuning PLMs without any manual engineering of task-specific prompts and verbalizers. It is integrated with the task-invariant continuous prompt encoding technique with fully trainable prompt parameters. We further propose the pair-wise cost-sensitive contrastive learning procedure to optimize the model in order to achieve verbalizer-free class mapping and enhance the task-invariance of prompts. It explicitly learns to distinguish different classes and makes the decision boundary smoother by assigning different costs to easy and hard cases. Experiments over a variety of language understanding tasks used in IR systems and different PLMs show that CP-Tuning outperforms state-of-the-art methods.

15.8AIMay 24

Decoding ML Decision: An Agentic Reasoning Framework for Large-Scale Ranking System

Longfei Yun, Yihan Wu, Haoran Liu et al.

Modern large-scale ranking systems operate within a sophisticated landscape of competing objectives, operational constraints, and evolving product requirements. Progress in this domain is increasingly bottlenecked by the engineering context constraint: the arduous process of translating ambiguous product intent into reasonable, executable, verifiable hypotheses, rather than by modeling techniques alone. We present GEARS (Generative Engine for Agentic Ranking Systems), a framework that reframes ranking optimization as an autonomous discovery process within a programmable experimentation environment. Rather than treating optimization as static model selection, GEARS leverages Specialized Agent Skills to encapsulate ranking expert knowledge into reusable reasoning capabilities, enabling operators to steer systems via high-level intent vibe personalization. Furthermore, to ensure production reliability, the framework incorporates validation hooks to enforce statistical robustness and filter out brittle policies that overfit short-term signals. Experimental validation across diverse product surfaces demonstrates that GEARS consistently identifies superior, near-Pareto-efficient policies by synergizing algorithmic signals with deep ranking context while maintaining rigorous deployment stability.

10.8IRMay 10

A General Framework for Multimodal LLM-Based Multimedia Understanding in Large-Scale Recommendation Systems

Yiming Zhu, Xu Liu, Ziyun Xu et al.

Conventional recommendation systems frequently fail to fully exploit the high-dimensional semantic signals inherent in multimedia content, thereby limiting the fidelity of user preference modeling. While Multimodal Large Language Models (MM-LLMs) offer robust mechanisms for interpreting such complex data, their integration into latency-constrained, industrial-scale architectures remains a significant challenge. To address this, we propose a generalized framework for MM-LLM-driven multimedia understanding. Our methodology employs a tripartite architecture encompassing content interpretation, representation extraction, and systematic pipeline integration, instantiated via a LLaMA2-based model that generates descriptive captions subsequently ingested as tokenized categorical features. Empirical evaluation demonstrates the efficacy of this approach, yielding a $0.35\%$ increase in offline AUC and a $0.02\%$ improvement in online metrics at scale, substantiating the practical viability of leveraging MM-LLMs to enhance large-scale recommendation performance.

7.0PLApr 24

From Monolithic to Compositional: A Compositional Operational Semantics for Crystality

Ziyun Xu, Hao Wang, Meng Sun

Parallel execution has become a key approach to improving blockchain scalability, but the lack of formal semantics for smart contract languages in such settings makes rigorous reasoning difficult. Crystality is a smart contract language designed for parallel EVMs, supporting scoped state and asynchronous relay across execution engines. This paper introduces a compositional operational semantics for Crystality. Unlike the original monolithic semantics, the new semantics decomposes the system into engine components and a global component, making the structure of parallel execution explicit. The compositional formulation enables simple proofs of key structural properties, including locality, global isolation, and strong commutativity of independent local steps. Furthermore, we prove that the compositional semantics is semantically equivalent to the original one via a transaction-level bisimulation theorem based on encoding and decoding functions between configurations, and two code-level bisimulation theorems for local and global execution.