CV · Jun 1

Policy-based Foveated Imaging and Perception

arXiv:2606.02565 · 34.0
AI Analysis

For computer vision and imaging systems, this work addresses the bottleneck of bandwidth and power constraints by enabling task-driven, acquisition-time foveation, offering a practical solution for high-resolution perception.

The paper introduces a real-time, task-aware foveated imaging system that dynamically allocates limited pixel bandwidth to task-relevant regions. In simulation it achieves high task performance under strict pixel budgets and outperforms baselines operating at the same bandwidth; the system is also validated on a 200-megapixel dual-stream sensor.

Ultra-high-resolution image sensors offer the potential to capture fine spatial details critical for many visual perception tasks, but acquiring and processing all pixels at full resolution is often infeasible under realistic bandwidth, latency, and power constraints. Existing approaches address this challenge through acquisition strategies such as spatial or temporal downsampling, which irrevocably discard information before task relevance can be assessed. In this work, we introduce a real-time, predictive, and task-aware foveated imaging system that operates directly at image acquisition time. Leveraging emerging dual-stream sensor architectures, our method dynamically allocates limited pixel bandwidth to task-relevant regions of interest while maintaining a low-resolution global context. We formulate foveated acquisition as a sensor attention policy-learning problem, in which past observations guide actions that determine future measurements, closing the perception-acquisition loop. Through extensive simulation across multiple perception tasks, we demonstrate that our approach achieves high task performance under strict pixel budgets and significantly outperforms relevant baselines operating at the same bandwidth. We further validate our system on a 200-megapixel dual-stream sensor, capturing real-world videos under realistic bandwidth and latency constraints, demonstrating the practical feasibility of task-driven, acquisition-time foveated imaging.
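
To make the closed perception-acquisition loop concrete, the following is a minimal sketch of one step: a low-resolution context frame from the previous time step decides where the next full-resolution region of interest (ROI) is read out, under a fixed per-frame pixel count. Everything here is an illustrative assumption rather than the paper's method: the function names `pixels_per_frame` and `choose_roi` are hypothetical, the toy sensor sizes are made up, and the greedy "brightest cell" rule merely stands in for the learned sensor attention policy described in the abstract.

```python
from typing import List, Tuple


def pixels_per_frame(full_w: int, full_h: int, ctx_scale: int,
                     roi_w: int, roi_h: int) -> int:
    """Pixels read out per frame: a global context stream downsampled by
    ctx_scale along each axis, plus one full-resolution ROI (assumed layout)."""
    ctx_pixels = (full_w // ctx_scale) * (full_h // ctx_scale)
    return ctx_pixels + roi_w * roi_h


def choose_roi(prev_context: List[List[float]], ctx_scale: int,
               roi_w: int, roi_h: int,
               full_w: int, full_h: int) -> Tuple[int, int]:
    """Greedy stand-in for a learned policy (an assumption, not the paper's
    policy): centre the next ROI on the highest-valued cell of the previous
    context frame. Returns the ROI's top-left corner in full-res coordinates."""
    best_r, best_c = 0, 0
    for r, row in enumerate(prev_context):
        for c, value in enumerate(row):
            if value > prev_context[best_r][best_c]:
                best_r, best_c = r, c

    # Map the centre of the chosen context cell to full-resolution coordinates.
    cx = best_c * ctx_scale + ctx_scale // 2
    cy = best_r * ctx_scale + ctx_scale // 2

    # Clamp so the ROI stays entirely inside the sensor.
    x0 = min(max(cx - roi_w // 2, 0), full_w - roi_w)
    y0 = min(max(cy - roi_h // 2, 0), full_h - roi_h)
    return x0, y0


if __name__ == "__main__":
    # Toy 16x16 "sensor" with a 4x-downsampled context and a 4x4 ROI.
    context = [
        [0.1, 0.1, 0.1, 0.1],
        [0.1, 0.1, 0.9, 0.1],  # salient cell at row 1, column 2
        [0.1, 0.1, 0.1, 0.1],
        [0.1, 0.1, 0.1, 0.1],
    ]
    print(pixels_per_frame(16, 16, 4, 4, 4))
    print(choose_roi(context, 4, 4, 4, 16, 16))
```

In this toy setting the readout costs far fewer pixels than the full frame, and the ROI chosen from frame t is what the sensor would measure at frame t+1; a learned policy would replace the greedy rule with a model trained on task performance.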

