Yufei Tao

h-index65

4papers

87citations

Novelty46%

AI Score44

Ranked #50,296 of 194,257 authors (top 26%)#10,009 in CL (top 33%)

4 Papers

5.1LGMay 5

Distributed Learning with Adversarial Gradient Perturbations

Nawapon Sangsiri, Yufei Tao

Privacy concerns in distributed learning often lead clients to return intentionally altered gradient information. We consider the problem of learning convex and $L$-smooth functions under adversarial gradient perturbation, where a client's gradient reply to a server query can deviate arbitrarily from the true gradient subject to a distance bound. Our study focuses on two fundamental questions: (i) what is the smallest achievable sub-optimality gap (i.e., excess error in optimization) under such responses, and (ii) how many queries are sufficient to guarantee a given sub-optimality gap? We establish tight feasibility thresholds on the sub-optimality gap and provide algorithms that achieve these thresholds with provable query complexity guarantees.

8.3CLNov 10, 2025

CAPO: Confidence Aware Preference Optimization Learning for Multilingual Preferences

Rhitabrat Pokharel, Yufei Tao, Ameeta Agrawal

Preference optimization is a critical post-training technique used to align large language models (LLMs) with human preferences, typically by fine-tuning on ranked response pairs. While methods like Direct Preference Optimization (DPO) have proven effective in English, they often fail to generalize robustly to multilingual settings. We propose a simple yet effective alternative, Confidence-Aware Preference Optimization (CAPO), which replaces DPO's fixed treatment of preference pairs with a dynamic loss scaling mechanism based on a relative reward. By modulating the learning signal according to the confidence in each preference pair, CAPO enhances robustness to noisy or low-margin comparisons, typically encountered in multilingual text. Empirically, CAPO outperforms existing preference optimization baselines by at least 16% in reward accuracy, and improves alignment by widening the gap between preferred and dispreferred responses across languages.

24.6CLMar 26, 2024

ChatGPT Role-play Dataset: Analysis of User Motives and Model Naturalness

Yufei Tao, Ameeta Agrawal, Judit Dombi et al.

Recent advances in interactive large language models like ChatGPT have revolutionized various domains; however, their behavior in natural and role-play conversation settings remains underexplored. In our study, we address this gap by deeply investigating how ChatGPT behaves during conversations in different settings by analyzing its interactions in both a normal way and a role-play setting. We introduce a novel dataset of broad range of human-AI conversations annotated with user motives and model naturalness to examine (i) how humans engage with the conversational AI model, and (ii) how natural are AI model responses. Our study highlights the diversity of user motives when interacting with ChatGPT and variable AI naturalness, showing not only the nuanced dynamics of natural conversations between humans and AI, but also providing new avenues for improving the effectiveness of human-AI communication.

4.1LGJun 12, 2025

Monotone Classification with Relative Approximations

Yufei Tao

In monotone classification, the input is a multi-set $P$ of points in $\mathbb{R}^d$, each associated with a hidden label from $\{-1, 1\}$. The goal is to identify a monotone function $h$, which acts as a classifier, mapping from $\mathbb{R}^d$ to $\{-1, 1\}$ with a small {\em error}, measured as the number of points $p \in P$ whose labels differ from the function values $h(p)$. The cost of an algorithm is defined as the number of points having their labels revealed. This article presents the first study on the lowest cost required to find a monotone classifier whose error is at most $(1 + ε) \cdot k^*$ where $ε\ge 0$ and $k^*$ is the minimum error achieved by an optimal monotone classifier -- in other words, the error is allowed to exceed the optimal by at most a relative factor. Nearly matching upper and lower bounds are presented for the full range of $ε$. All previous work on the problem can only achieve an error higher than the optimal by an absolute factor.