CV | RO | IV | Mar 2, 2025

Unifying Light Field Perception with Field of Parallax

arXiv:2503.00747v1 | h-index: 39 | Has Code
Originality: Highly original
AI Analysis

This work addresses the challenge of handling arbitrary light field representations for multi-task vision. It is incremental in that it builds upon existing methods, but it introduces a novel unifying approach.

The paper unifies light field perception across multiple tasks by introducing the Field of Parallax (FoP) and the LFX framework. It reports state-of-the-art results, including 84.74% mIoU for semantic segmentation and 0.84% AP for object detection.
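For readers unfamiliar with the segmentation metric quoted above, mIoU (mean intersection over union) can be sketched as follows. This is a minimal pure-Python illustration of the standard definition on flat label lists; the function name `mean_iou` and the toy labels are assumptions for illustration, not the paper's evaluation code.

```python
def mean_iou(pred, target, num_classes):
    """Average per-class IoU over classes that occur in pred or target."""
    ious = []
    for c in range(num_classes):
        # Pixels labelled c in both prediction and ground truth.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels labelled c in either prediction or ground truth.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: six pixels, three classes.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
```

Benchmarks differ in how they treat ignored labels and absent classes, so a reported figure such as 84.74% depends on the dataset's own evaluation protocol.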

Field of Parallax (FoP) is a spatial field that distills the common features from different light field (LF) representations to provide flexible and consistent support for multi-task learning. FoP is built upon three core features (projection difference, adjacency divergence, and contextual consistency) that are essential for cross-task adaptability. To implement FoP, we design a two-step angular adapter: the first step captures angular-specific differences, while the second step consolidates contextual consistency to ensure robust representation. Leveraging the FoP-based representation, we introduce the LFX framework, the first to handle arbitrary LF representations seamlessly, unifying LF multi-task vision. We evaluated LFX across three different tasks, achieving new state-of-the-art results compared with previous task-specific architectures: 84.74% in mIoU for semantic segmentation on UrbanLF, 0.84% in AP for object detection on PKU, and MAE of 0.030 and 0.026 for salient object detection on Duftv2 and PKU, respectively. The source code will be made publicly available at https://github.com/warriordby/LFX.
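The two-step structure described in the abstract can be illustrated with a toy, parameter-free sketch in NumPy. Everything here is an assumption made for illustration: the function name `angular_adapter_sketch`, the `(N, C, H, W)` view layout, and the use of absolute differences and averaging are hypothetical stand-ins, whereas the adapter in the paper is a learned module whose details are not given in this summary.

```python
import numpy as np


def angular_adapter_sketch(views: np.ndarray, center_idx: int) -> np.ndarray:
    """Toy two-step adapter over N angular views of shape (N, C, H, W).

    Hypothetical illustration only; returns a fused (C, H, W) feature map.
    """
    if views.ndim != 4 or views.shape[0] < 2:
        raise ValueError("expected at least two views with shape (N, C, H, W)")

    center = views[center_idx]

    # Step 1: angular-specific differences.
    # Projection difference: how each view deviates from the reference view.
    projection_diff = np.abs(views - center[None]).mean(axis=0)
    # Adjacency divergence: how consecutive views deviate from each other.
    adjacency_div = np.abs(np.diff(views, axis=0)).mean(axis=0)
    angular_cue = projection_diff + adjacency_div

    # Step 2: contextual consistency.
    # The per-pixel mean over views stands in for content shared by all views.
    shared_context = views.mean(axis=0)

    return shared_context + angular_cue


rng = np.random.default_rng(0)
views = rng.normal(size=(9, 4, 8, 8))  # e.g. a 3x3 grid of views, flattened
fused = angular_adapter_sketch(views, center_idx=4)
print(fused.shape)  # (4, 8, 8)
```

When all views are identical, both difference terms vanish and the output reduces to the shared feature map, which is the behaviour one would expect from a parallax cue. The linked repository is the place to look for the actual adapter.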

Code Implementations: 1 repo
