CV · Mar 11, 2025

TSCnet: A Text-driven Semantic-level Controllable Framework for Customized Low-Light Image Enhancement

Miao Zhang, Jun Yin, Pengyu Zeng, Yiqing Shen, Shuai Lu, Xueqian Wang

arXiv:2503.08168v1 · 19.725 citations · h-index: 10 · Neurocomputing

Originality: Highly original

AI Analysis

This work addresses the need for personalized image enhancement among users who want semantic-level control over lighting adjustments in low-light conditions. It represents a novel approach rather than an incremental improvement.

The paper tackles the inflexibility of existing low-light image enhancement methods by proposing TSCnet, a framework that enables customized lighting adjustments through natural language prompts, achieving superior performance in visibility, color balance, and detail preservation without artifacts on benchmark datasets.

Deep learning-based image enhancement methods show significant advantages in reducing noise and improving visibility in low-light conditions. These methods are typically based on one-to-one mapping, where the model learns a direct transformation from a low-light image to a specific enhanced image. As a result, they are inflexible: they do not allow personalized mapping, even though an individual's lighting preferences are inherently personal. To overcome these limitations, we propose a new light enhancement task and a new framework that provides customized lighting control through prompt-driven, semantic-level, and quantitative brightness adjustments. The framework begins by leveraging a Large Language Model (LLM) to understand natural language prompts, enabling it to identify target objects for brightness adjustments. To localize these target objects, the Retinex-based Reasoning Segment (RRS) module generates precise target localization masks using reflection images. Subsequently, the Text-based Brightness Controllable (TBC) module adjusts brightness levels based on the generated illumination map. Finally, an Adaptive Contextual Compensation (ACC) module integrates multi-modal inputs and controls a conditional diffusion model to adjust the lighting, ensuring seamless and precise enhancements. Experimental results on benchmark datasets demonstrate our framework's superior performance at increasing visibility, maintaining natural color balance, and amplifying fine details without creating artifacts. Furthermore, its robust generalization capabilities enable complex semantic-level lighting adjustments in diverse open-world environments through natural language interactions.
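The idea of adjusting brightness only inside a localized target region, through the illumination component of a Retinex decomposition, can be illustrated with a small NumPy sketch. This is not the authors' implementation: the LLM prompt parsing, the RRS segmentation, and the ACC-guided diffusion model are all omitted. The function names `retinex_decompose` and `adjust_region_brightness`, the max-over-channels illumination estimate, and the hand-made mask and `gain` value are assumptions made purely for illustration.

```python
import numpy as np

def retinex_decompose(image, eps=1e-6):
    """Split an RGB image in [0, 1] into reflectance and illumination, I = R * L.

    Assumption: illumination is approximated by the per-pixel maximum over
    the colour channels, a simple stand-in for a learned decomposition.
    """
    illumination = image.max(axis=2, keepdims=True)   # shape H x W x 1
    reflectance = image / (illumination + eps)        # shape H x W x 3
    return reflectance, illumination

def adjust_region_brightness(image, mask, gain):
    """Scale the illumination by `gain` where mask == 1, leave it unchanged where mask == 0.

    `mask` is an H x W array in [0, 1]; here it stands in for the target
    localization mask that a segmentation module would produce.
    """
    reflectance, illumination = retinex_decompose(image)
    scale = 1.0 + (gain - 1.0) * mask[..., None]      # 1 outside the mask, gain inside
    new_illumination = np.clip(illumination * scale, 0.0, 1.0)
    return np.clip(reflectance * new_illumination, 0.0, 1.0)

# Hypothetical usage: brighten the top half of a uniformly dark image.
image = np.full((4, 4, 3), 0.2)
mask = np.zeros((4, 4))
mask[:2] = 1.0
out = adjust_region_brightness(image, mask, gain=2.0)
```

Because only the illumination map is scaled, the reflectance (and therefore the colour ratios between channels) is left untouched inside the mask, which is the intuition behind editing brightness separately from content.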

