CV · Dec 3, 2025

UniLight: A Unified Representation for Lighting

arXiv:2512.04267v1 · h-index: 20
Originality: Incremental advance
AI Analysis

This paper addresses cross-modal transfer between lighting representations for computer vision and graphics applications. The contribution appears incremental, as it builds on existing contrastive and multi-modal methods.

The paper tackles the problem of incompatible lighting representations across modalities by proposing UniLight, a joint latent space that unifies text, images, irradiance, and environment maps, enabling flexible manipulation across tasks such as retrieval and image synthesis.

Lighting has a strong influence on visual appearance, yet understanding and representing lighting in images remains notoriously difficult. Various lighting representations exist, such as environment maps, irradiance, spherical harmonics, or text, but they are incompatible, which limits cross-modal transfer. We thus propose UniLight, a joint latent space as lighting representation, that unifies multiple modalities within a shared embedding. Modality-specific encoders for text, images, irradiance, and environment maps are trained contrastively to align their representations, with an auxiliary spherical-harmonics prediction task reinforcing directional understanding. Our multi-modal data pipeline enables large-scale training and evaluation across three tasks: lighting-based retrieval, environment-map generation, and lighting control in diffusion-based image synthesis. Experiments show that our representation captures consistent and transferable lighting features, enabling flexible manipulation across modalities.
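To make the contrastive alignment step concrete, here is a minimal sketch of how two modality-specific embeddings could be pulled into a shared space. The abstract does not state the exact objective, so the CLIP-style symmetric InfoNCE loss, the temperature value, and the names `symmetric_info_nce` and `retrieve` are all assumptions for illustration. The inputs are plain arrays standing in for encoder outputs (for example, text embeddings and environment-map embeddings of the same scenes).

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Scale each row to unit length so dot products become cosine similarities.
    return x / np.maximum(np.linalg.norm(x, axis=-1, keepdims=True), eps)

def log_softmax(logits, axis):
    # Numerically stable log-softmax along the given axis.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def symmetric_info_nce(emb_a, emb_b, temperature=0.07):
    # ASSUMPTION: a CLIP-style symmetric InfoNCE loss; the paper's actual
    # objective may differ. Row i of emb_a and row i of emb_b are assumed to
    # describe the same lighting condition in two different modalities.
    a = l2_normalize(emb_a)
    b = l2_normalize(emb_b)
    logits = a @ b.T / temperature          # (N, N) pairwise similarities
    idx = np.arange(len(a))                 # matching pairs lie on the diagonal
    loss_a_to_b = -log_softmax(logits, axis=1)[idx, idx].mean()
    loss_b_to_a = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_a_to_b + loss_b_to_a)

def retrieve(query_emb, gallery_emb):
    # Cross-modal retrieval: index of the most similar gallery item per query.
    sims = l2_normalize(query_emb) @ l2_normalize(gallery_emb).T
    return sims.argmax(axis=1)
```

Under these assumptions, well-aligned pairs yield a lower loss than mismatched pairs, and once the encoders are aligned, lighting-based retrieval reduces to a nearest-neighbour lookup by cosine similarity in the shared space.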

