CV · Apr 27

POCA: Pareto-Optimal Curriculum Alignment for Visual Text Generation

arXiv:2604.24171 · 73.4
Predicted impact: top 38% in CV · last 90 days
Originality: Incremental advance
AI Analysis

For researchers in visual text generation, POCA offers a multi-reward alignment method that avoids the instability of weighted-sum reward optimization and improves training efficiency.

POCA addresses the trade-off between text accuracy and image coherence in visual text generation by formulating alignment as a multi-objective problem, combining Pareto-optimal set identification with adaptive curriculum alignment. The authors report significant improvements in CLIP score, HPS score, and sentence accuracy.

Current visual text generation models struggle with the trade-off between text accuracy and overall image coherence. We find that achieving high text accuracy can reduce aesthetic quality and instruction-following capability. Reinforcement learning (RL) approaches can alleviate the problem by aligning the model with multiple rewards, but they are often unstable for text generation because existing approaches normally optimize the rewards as a weighted sum, and the weight of each reward is difficult to balance. Moreover, RL requires a set of training instructions: a large number of prompts requires more training time and computing resources, while a small set leads to poor performance. How to select prompts for efficient training is therefore an unsolved problem. In this study, we propose Pareto-Optimal Curriculum Alignment (POCA), a framework that treats alignment as a multi-objective problem by 1) identifying the Pareto-optimal set to avoid simple scalarization and 2) designing an adaptive curriculum alignment strategy that orders a multi-reward dataset using automatic difficulty assessment, which is crucial for optimal convergence because RL methods explore in a limited-data environment. Together, these components let POCA find the Pareto-optimal set in a unified reward space, which eliminates inconsistent signals from different rewards and finds the best trade-off solution under an easy-to-hard optimization landscape. The experimental results show that POCA significantly improves all reported metrics, including CLIP score, HPS score, and sentence accuracy.
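The two ingredients named in the abstract, Pareto-optimal set identification and easy-to-hard curriculum ordering, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The reward tuples, the function names `dominates`, `pareto_front`, and `easy_to_hard`, and the per-prompt difficulty scores are all hypothetical, and the abstract does not say how difficulty is actually assessed.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if reward vector a is at least as good as b on every
    reward and strictly better on at least one (higher is better)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(rewards: List[Sequence[float]]) -> List[int]:
    """Return the indices of samples whose reward vectors are not
    dominated by any other sample. No reward weights are needed."""
    return [
        i
        for i, r in enumerate(rewards)
        if not any(dominates(o, r) for j, o in enumerate(rewards) if j != i)
    ]


def easy_to_hard(prompts: List[str], difficulty: List[float]) -> List[str]:
    """Order prompts by an assumed difficulty score, easiest first."""
    ranked = sorted(zip(difficulty, prompts), key=lambda pair: pair[0])
    return [prompt for _, prompt in ranked]


# Hypothetical rewards per sample: (text accuracy, aesthetics, instruction following)
rewards = [
    (0.9, 0.2, 0.5),
    (0.6, 0.6, 0.6),
    (0.5, 0.5, 0.5),  # dominated by the sample above it
    (0.2, 0.9, 0.4),
]
front = pareto_front(rewards)

# Hypothetical prompts with assumed difficulty scores
curriculum = easy_to_hard(
    ["long slogan", "single word", "short phrase"], [0.7, 0.1, 0.4]
)
```

The contrast with a weighted sum is that `pareto_front` keeps every sample that represents a distinct trade-off, so no single set of reward weights has to be tuned; only samples that are worse on every reward are discarded.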
