Youwei Zeng

CVNov 30, 2025Code

Multi-GRPO: Multi-Group Advantage Estimation for Text-to-Image Generation with Tree-Based Trajectories and Multiple Rewards

Qiang Lyu, Zicong Chen, Chongxiao Wang et al.

Recently, Group Relative Policy Optimization (GRPO) has shown promising potential for aligning text-to-image (T2I) models, yet existing GRPO-based methods suffer from two critical limitations. (1) \textit{Shared credit assignment}: trajectory-level advantages derived from group-normalized sparse terminal rewards are uniformly applied across timesteps, failing to accurately estimate the potential of early denoising steps with vast exploration spaces. (2) \textit{Reward-mixing}: predefined weights for combining multi-objective rewards (e.g., text accuracy, visual quality, text color)--which have mismatched scales and variances--lead to unstable gradients and conflicting updates. To address these issues, we propose \textbf{Multi-GRPO}, a multi-group advantage estimation framework with two orthogonal grouping mechanisms. For better credit assignment, we introduce tree-based trajectories inspired by Monte Carlo Tree Search: branching trajectories at selected early denoising steps naturally forms \emph{temporal groups}, enabling accurate advantage estimation for early steps via descendant leaves while amortizing computation through shared prefixes. For multi-objective optimization, we introduce \emph{reward-based grouping} to compute advantages for each reward function \textit{independently} before aggregation, disentangling conflicting signals. To facilitate evaluation of multiple objective alignment, we curate \textit{OCR-Color-10}, a visual text rendering dataset with explicit color constraints. Across the single-reward \textit{PickScore-25k} and multi-objective \textit{OCR-Color-10} benchmarks, Multi-GRPO achieves superior stability and alignment performance, effectively balancing conflicting objectives. Code will be publicly available at \href{https://github.com/fikry102/Multi-GRPO}{https://github.com/fikry102/Multi-GRPO}.

SPJul 9, 2019

FarSense: Pushing the Range Limit of WiFi-based Respiration Sensing with CSI Ratio of Two Antennas

Youwei Zeng, Dan Wu, Jie Xiong et al.

The past few years have witnessed the great potential of exploiting channel state information retrieved from commodity WiFi devices for respiration monitoring. However, existing approaches only work when the target is close to the WiFi transceivers and the performance degrades significantly when the target is far away. On the other hand, most home environments only have one WiFi access point and it may not be located in the same room as the target. This sensing range constraint greatly limits the application of the proposed approaches in real life. This paper presents FarSense--the first real-time system that can reliably monitor human respiration when the target is far away from the WiFi transceiver pair. FarSense works well even when one of the transceivers is located in another room, moving a big step towards real-life deployment. We propose two novel schemes to achieve this goal: (1) Instead of applying the raw CSI readings of individual antenna for sensing, we employ the ratio of CSI readings from two antennas, whose noise is mostly canceled out by the division operation to significantly increase the sensing range; (2) The division operation further enables us to utilize the phase information which is not usable with one single antenna for sensing. The orthogonal amplitude and phase are elaborately combined to address the "blind spots" issue and further increase the sensing range. Extensive experiments show that FarSense is able to accurately monitor human respiration even when the target is 8 meters away from the transceiver pair, increasing the sensing range by more than 100%. We believe this is the first system to enable through-wall respiration sensing with commodity WiFi devices and the proposed method could also benefit other sensing applications.

Youwei Zeng

2 Papers