CV | AI | Mar 1

Improving Text-to-Image Generation with Intrinsic Self-Confidence Rewards

arXiv:2603.00918v11 citations | h-index: 5
Originality: Incremental advance
AI Analysis

This work addresses the need for better text-to-image models in content creation and data augmentation. It is an incremental advance: it replaces external reward supervision with an internal self-confidence signal, and combining that signal with external rewards reduces reward hacking.

The paper introduces ARC, a post-training framework for text-to-image models that uses intrinsic self-confidence rewards instead of external supervision. It reports consistent gains over the baseline in compositional generation, text rendering, and text-image alignment.

Text-to-image generation powers content creation across design, media, and data augmentation. Post-training of text-to-image generative models is a promising path to better match human preferences, factuality, and improved aesthetics. We introduce ARC (Adaptive Rewarding by self-Confidence), a post-training framework that replaces external reward supervision with an internal self-confidence signal, obtained by evaluating how accurately the model recovers injected noise under self-denoising probes. ARC converts this intrinsic signal into scalar rewards, enabling fully unsupervised optimization without additional datasets, annotators, or reward models. Empirically, by reinforcing high-confidence generations, ARC delivers consistent gains in compositional generation, text rendering and text-image alignment over the baseline. We also find that integrating ARC with external rewards results in a complementary improvement, with alleviated reward hacking.
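
The self-denoising probe described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the noise schedule `alphas`, the use of mean squared error, and the toy denoiser are all assumptions made for illustration. The idea shown is only that a sample is re-noised, the model is asked to recover the injected noise, and a lower recovery error is mapped to a higher scalar reward.

```python
import math
import random


def self_confidence_reward(x0, predict_noise, alphas=(0.9, 0.5, 0.1), seed=0):
    """Hypothetical probe: re-noise x0, ask the model for the noise, score the error.

    x0            -- a generated sample, flattened to a list of floats
    predict_noise -- callable (x_t, alpha) -> predicted noise (stand-in for the model)
    alphas        -- assumed signal fractions kept at each probe level
    """
    rng = random.Random(seed)
    errors = []
    for alpha in alphas:
        # Inject known Gaussian noise at this probe level.
        eps = [rng.gauss(0.0, 1.0) for _ in x0]
        x_t = [math.sqrt(alpha) * x + math.sqrt(1.0 - alpha) * e
               for x, e in zip(x0, eps)]
        # Ask the model to recover the injected noise.
        eps_hat = predict_noise(x_t, alpha)
        mse = sum((a - b) ** 2 for a, b in zip(eps_hat, eps)) / len(x0)
        errors.append(mse)
    # Lower recovery error means higher confidence, so negate the mean error.
    return -sum(errors) / len(errors)


def toy_predict_noise(x_t, alpha):
    """Toy stand-in for a denoiser: it assumes the clean sample is all zeros."""
    return [v / math.sqrt(1.0 - alpha) for v in x_t]


if __name__ == "__main__":
    # A sample the toy model "expects" scores higher than one it does not.
    print(self_confidence_reward([0.0] * 4, toy_predict_noise))
    print(self_confidence_reward([1.0] * 4, toy_predict_noise))
```

In a post-training loop, rewards of this kind would be computed for several generations per prompt and used to reinforce the higher-confidence ones; how ARC scales or normalizes the signal is not specified in the abstract.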
