CV · Jul 16, 2025

Learning Pixel-adaptive Multi-layer Perceptrons for Real-time Image Enhancement

arXiv:2507.12135v1 · 2 citations · h-index: 6
Originality: Incremental advance
AI Analysis

This work targets high-quality, real-time image enhancement for applications such as photography and video processing. It appears incremental, since it builds on existing bilateral grid and MLP techniques.

The paper addresses the limitations of existing bilateral grid and MLP methods for real-time image enhancement. It proposes the BPAM framework, which combines the spatial modeling of bilateral grids with the non-linear capacity of MLPs, and reports outperforming state-of-the-art methods while maintaining real-time processing.

Deep learning-based bilateral grid processing has emerged as a promising solution for image enhancement, inherently encoding spatial and intensity information while enabling efficient full-resolution processing through slicing operations. However, existing approaches are limited to linear affine transformations, which hinders their ability to model complex color relationships. Meanwhile, although multi-layer perceptrons (MLPs) excel at non-linear mappings, traditional MLP-based methods employ globally shared parameters, which makes it hard for them to handle localized variations. To overcome these dual challenges, we propose a Bilateral Grid-based Pixel-Adaptive Multi-layer Perceptron (BPAM) framework. Our approach combines the spatial modeling of bilateral grids with the non-linear capabilities of MLPs. Specifically, we generate bilateral grids containing MLP parameters, where each pixel dynamically retrieves its unique transformation parameters and obtains a distinct MLP for color mapping based on its spatial coordinates and intensity values. In addition, we propose a novel grid decomposition strategy that categorizes MLP parameters into distinct types stored in separate subgrids. Multi-channel guidance maps are used to extract category-specific parameters from the corresponding subgrids, ensuring effective utilization of color information during slicing while guiding precise parameter generation. Extensive experiments on public datasets demonstrate that our method outperforms state-of-the-art methods while maintaining real-time processing capabilities.
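
To make the core idea concrete, here is a minimal NumPy sketch of slicing per-pixel MLP parameters out of a bilateral grid and applying them. It is an illustration under stated assumptions, not the authors' implementation: the function names (`slice_grid`, `pixel_adaptive_mlp`), the single-channel mean-intensity guide, the one-hidden-layer ReLU MLP, and the flat parameter layout are all hypothetical choices. The paper's grid decomposition into subgrids and its multi-channel guidance maps are not modeled here, and in the actual method the grid would be predicted by a network rather than supplied by hand.

```python
import numpy as np


def slice_grid(grid, guide):
    """Trilinearly sample a (Gh, Gw, Gd, P) grid at every pixel.

    guide is an (H, W) map in [0, 1]; returns (H, W, P) per-pixel parameters.
    """
    gh, gw, gd, _ = grid.shape
    h, w = guide.shape
    # Continuous grid coordinates of each pixel along y, x, and intensity.
    ys = np.broadcast_to(np.linspace(0, gh - 1, h)[:, None], (h, w))
    xs = np.broadcast_to(np.linspace(0, gw - 1, w)[None, :], (h, w))
    zs = np.clip(guide, 0.0, 1.0) * (gd - 1)

    def corners(coord, size):
        # Lower/upper cell indices and the fractional offset between them.
        lo = np.clip(np.floor(coord).astype(int), 0, max(size - 2, 0))
        hi = np.minimum(lo + 1, size - 1)
        return lo, hi, (coord - lo)[..., None]

    y0, y1, fy = corners(ys, gh)
    x0, x1, fx = corners(xs, gw)
    z0, z1, fz = corners(zs, gd)

    out = 0.0
    # Accumulate the 8 neighbouring cells, weighted by trilinear weights.
    for yi, wy in ((y0, 1 - fy), (y1, fy)):
        for xi, wx in ((x0, 1 - fx), (x1, fx)):
            for zi, wz in ((z0, 1 - fz), (z1, fz)):
                out = out + wy * wx * wz * grid[yi, xi, zi]
    return out


def num_params(channels=3, hidden=4):
    """Parameter count of a channels -> hidden -> channels MLP with biases."""
    return channels * hidden + hidden + hidden * channels + channels


def pixel_adaptive_mlp(image, grid, hidden=4):
    """Apply a distinct tiny MLP to each pixel of an (H, W, C) image."""
    h, w, c = image.shape
    guide = image.mean(axis=-1)  # assumption: mean intensity as the guide
    params = slice_grid(grid, guide)

    # Assumed flat layout: [W1 | b1 | W2 | b2].
    i = 0
    w1 = params[..., i:i + c * hidden].reshape(h, w, c, hidden)
    i += c * hidden
    b1 = params[..., i:i + hidden]
    i += hidden
    w2 = params[..., i:i + hidden * c].reshape(h, w, hidden, c)
    i += hidden * c
    b2 = params[..., i:i + c]

    hid = np.maximum(np.einsum("hwc,hwck->hwk", image, w1) + b1, 0.0)
    return np.einsum("hwk,hwkc->hwc", hid, w2) + b2
```

Because every pixel looks up its own weights by position and intensity, two pixels with the same color can be mapped differently, which a globally shared MLP cannot do, while the ReLU layer allows mappings that a per-pixel affine transform cannot express.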

