CV · Mar 6, 2024

HDRFlow: Real-Time HDR Video Reconstruction with Large Motions

arXiv:2403.03447v1 · 31 citations · h-index: 12 · CVPR
Originality: Incremental advance
AI Analysis

This paper addresses efficient HDR video reconstruction for applications such as photography and videography. Its advance is incremental, since it builds on existing flow-based alignment methods.

The paper tackles the reconstruction of High Dynamic Range (HDR) video from sequences captured with alternating exposures, especially under large motion. It proposes HDRFlow, a real-time method that processes 720p inputs in 25 ms and outperforms previous methods on standard benchmarks.

Reconstructing High Dynamic Range (HDR) video from image sequences captured with alternating exposures is challenging, especially in the presence of large camera or object motion. Existing methods typically align low dynamic range sequences using optical flow or attention mechanisms for deghosting. However, they often struggle to handle large, complex motions and are computationally expensive. To address these challenges, we propose a robust and efficient flow estimator tailored for real-time HDR video reconstruction, named HDRFlow. HDRFlow has three novel designs: an HDR-domain alignment loss (HALoss), an efficient flow network with a multi-size large kernel (MLK), and a new HDR flow training scheme. The HALoss supervises our flow network to learn an HDR-oriented flow for accurate alignment in saturated and dark regions. The MLK can effectively model large motions at a negligible cost. In addition, we incorporate synthetic data, Sintel, into our training dataset, utilizing both its provided forward flow and backward flow generated by us to supervise our flow network, enhancing our performance in large motion regions. Extensive experiments demonstrate that our HDRFlow outperforms previous methods on standard benchmarks. To the best of our knowledge, HDRFlow is the first real-time HDR video reconstruction method for video sequences captured with alternating exposures, capable of processing 720p resolution inputs in 25 ms.
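The abstract does not give the formula for the HDR-domain alignment loss, so the following is only a minimal sketch of the general idea of comparing frames in the HDR domain rather than in LDR pixel values. It assumes a gamma-2.2 camera response and mu-law compression with mu = 5000, conventions that are common in the alternating-exposure HDR literature but are not confirmed by this page. The function names are hypothetical, and the flow-based warping step is omitted.

```python
import numpy as np

GAMMA = 2.2   # assumption: simple gamma camera response
MU = 5000.0   # assumption: mu-law compression constant

def ldr_to_linear(ldr, exposure):
    """Map an LDR frame in [0, 1] to linear radiance, normalized by exposure."""
    return np.clip(ldr, 0.0, 1.0) ** GAMMA / exposure

def mu_law(hdr):
    """Compress non-negative linear HDR values so dark regions are not ignored."""
    return np.log1p(MU * hdr) / np.log1p(MU)

def hdr_domain_l1(warped_ldr, exposure, target_hdr):
    """L1 distance between an (already warped) LDR frame and an HDR target,
    measured after both are brought into the tonemapped HDR domain."""
    pred = ldr_to_linear(warped_ldr, exposure)
    return float(np.mean(np.abs(mu_law(pred) - mu_law(target_hdr))))

# Hypothetical usage: the same radiance captured at two exposures
# maps back to the same linear values when unsaturated.
radiance = np.array([0.05, 0.10, 0.20])
short = (radiance * 1.0) ** (1.0 / GAMMA)   # exposure 1
long_ = (radiance * 4.0) ** (1.0 / GAMMA)   # exposure 4
loss_short = hdr_domain_l1(short, 1.0, radiance)
loss_long = hdr_domain_l1(long_, 4.0, radiance)
```

Working in this domain means frames with different exposures become directly comparable, which is why a loss defined there can supervise alignment even where one exposure is saturated or very dark.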

