DCCVJul 26, 2025

A Fast Parallel Median Filtering Algorithm Using Hierarchical Tiling

arXiv:2507.19926v1SIGGRAPH
Originality Highly original
AI Analysis

This provides a faster solution for image processing tasks requiring noise removal, though it is incremental as it builds on existing sorting-based methods.

The paper tackles the high computational cost of median filtering in image processing by introducing a novel algorithm using hierarchical tiling, achieving up to 5 times faster performance than the state of the art on GPUs for various data types and kernel sizes.

Median filtering is a non-linear smoothing technique widely used in digital image processing to remove noise while retaining sharp edges. It is particularly well suited to removing outliers (impulse noise) or granular artifacts (speckle noise). However, the high computational cost of median filtering can be prohibitive. Sorting-based algorithms excel with small kernels but scale poorly with increasing kernel diameter, in contrast to constant-time methods characterized by higher constant factors but better scalability, such as histogram-based approaches or the 2D wavelet matrix. This paper introduces a novel algorithm, leveraging the separability of the sorting problem through hierarchical tiling to minimize redundant computations. We propose two variants: a data-oblivious selection network that can operate entirely within registers, and a data-aware version utilizing random-access memory. These achieve per-pixel complexities of $O(k \log(k))$ and $O(k)$, respectively, for a $k \times k$ kernel - unprecedented for sorting-based methods. Our CUDA implementation is up to 5 times faster than the current state of the art on a modern GPU and is the fastest median filter in most cases for 8-, 16-, and 32-bit data types and kernels from $3 \times 3$ to $75 \times 75$.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes