CV RO IV Mar 2, 2025

Unifying Light Field Perception with Field of Parallax

Fei Teng, Buyin Deng, Boyuan Zheng, Kai Luo, Kunyu Peng, Jiaming Zhang, Kailun Yang

arXiv:2503.00747v13.6 h-index: 39 Has Code

Originality Highly original

AI Analysis

This work addresses the challenge of handling arbitrary light field representations for multi-task vision. It builds incrementally on existing methods while introducing a novel unifying approach.

The paper tackles the problem of unifying light field perception for multi-task learning by introducing the Field of Parallax (FoP) concept and the LFX framework, achieving state-of-the-art results such as 84.74% mIoU in semantic segmentation on UrbanLF and 0.84% AP in object detection on PKU.

Field of Parallax (FoP) is a spatial field that distills the common features from different light field (LF) representations to provide flexible and consistent support for multi-task learning. FoP is built upon three core features (projection difference, adjacency divergence, and contextual consistency) that are essential for cross-task adaptability. To implement FoP, we design a two-step angular adapter: the first step captures angular-specific differences, while the second step consolidates contextual consistency to ensure a robust representation. Leveraging the FoP-based representation, we introduce the LFX framework, the first to handle arbitrary LF representations seamlessly, unifying LF multi-task vision. We evaluated LFX on three different tasks, achieving new state-of-the-art results compared with previous task-specific architectures: 84.74% mIoU for semantic segmentation on UrbanLF, 0.84% AP for object detection on PKU, and MAE of 0.030 and 0.026 for salient object detection on Duftv2 and PKU, respectively. The source code will be made publicly available at https://github.com/warriordby/LFX.
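
For readers less familiar with two of the metrics reported above, the following is a minimal sketch of how mIoU and MAE are commonly computed on flattened label maps and saliency maps. The function names `mean_iou` and `mean_absolute_error` are illustrative assumptions and are not taken from the LFX code; the actual benchmark protocols (for example, ignore labels or dataset-level accumulation) may differ.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union, averaged over classes that appear
    in either the prediction or the ground truth (illustrative helper)."""
    ious = []
    for c in range(num_classes):
        # Pixels where both prediction and ground truth equal class c.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels where either prediction or ground truth equals class c.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


def mean_absolute_error(pred, target):
    """Mean absolute error between a predicted saliency map and the
    ground truth, both flattened to lists of values in [0, 1]."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


# Toy inputs: four "pixels", two classes.
seg_pred = [0, 0, 1, 1]
seg_true = [0, 1, 1, 1]
print(mean_iou(seg_pred, seg_true, num_classes=2))

sal_pred = [0.0, 1.0, 0.5, 0.5]
sal_true = [0.0, 1.0, 1.0, 0.0]
print(mean_absolute_error(sal_pred, sal_true))
```

Higher is better for mIoU, while lower is better for MAE, which is why the salient object detection results are reported as small values.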
