Feng Pan

3.6 · CV · Jan 25, 2025

Vision without Images: End-to-End Computer Vision from Single Compressive Measurements

Fengpu Pan, Heting Gao, Jiangtao Wen et al.

Snapshot Compressed Imaging (SCI) offers high-speed, low-bandwidth, and energy-efficient image acquisition, but it remains challenged by low-light and low signal-to-noise ratio (SNR) conditions. Moreover, practical hardware constraints in high-resolution sensors limit the use of large, frame-sized masks and call for smaller, hardware-friendly designs. In this work, we present a novel SCI-based computer vision framework that uses pseudo-random binary masks of size only 8$\times$8, which makes physical implementation feasible. At its core is CompDAE, a Compressive Denoising Autoencoder built on the STFormer architecture and designed to perform downstream tasks, such as edge detection and depth estimation, directly from noisy compressive raw pixel measurements without image reconstruction. CompDAE incorporates a rate-constrained training strategy inspired by BackSlash to promote compact, compressible models. A shared encoder paired with lightweight task-specific decoders provides a unified multi-task platform. Extensive experiments across multiple datasets demonstrate that CompDAE achieves state-of-the-art performance with significantly lower complexity, especially under ultra-low-light conditions where traditional CMOS and SCI pipelines fail.
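
As a rough illustration of the kind of measurement described above, the sketch below uses the standard SCI forward model, in which each frame is modulated by a binary mask and the modulated frames are summed into one snapshot. It assumes that each 8×8 pseudo-random binary block is tiled across the full frame; the abstract does not say how the small masks are applied, so the tiling, the noise-free setting, and the helper names `tile_mask` and `sci_measure` are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def tile_mask(block, height, width):
    """Repeat a small binary block until it covers a height x width frame."""
    bh, bw = block.shape
    reps = (-(-height // bh), -(-width // bw))  # ceiling division per axis
    return np.tile(block, reps)[:height, :width]

def sci_measure(frames, blocks):
    """Collapse T frames into one snapshot: Y = sum_t M_t * X_t (noise-free)."""
    _, height, width = frames.shape
    masks = np.stack([tile_mask(b, height, width) for b in blocks])
    return (masks * frames).sum(axis=0)

# Hypothetical sizes: 4 frames of 32x32 pixels, one 8x8 binary block per frame.
rng = np.random.default_rng(0)
frames = rng.random((4, 32, 32))
blocks = rng.integers(0, 2, size=(4, 8, 8))  # pseudo-random 0/1 entries
snapshot = sci_measure(frames, blocks)       # single 32x32 measurement
```

A task network in the spirit of the abstract would take `snapshot` as its input instead of reconstructed frames; sensor noise, which matters in the low-light setting, is left out of this sketch.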

4.1 · LG · Jan 14, 2025

BiDepth: A Bidirectional-Depth Neural Network for Spatio-Temporal Prediction

Sina Ehsani, Fenglian Pan, Qingpei Hu et al.

Accurate spatio-temporal (ST) prediction for dynamic systems, such as urban mobility and weather patterns, is crucial but hindered by complex ST correlations and by the challenge of modeling long-term trends and short-term fluctuations at the same time. Existing methods often struggle on both counts. This paper proposes the BiDepth Multimodal Neural Network (BDMNN), which integrates two key innovations: 1) a bidirectional depth modulation mechanism that dynamically adjusts network depth to capture both long-term seasonality and immediate short-term events; and 2) a novel convolutional self-attention cell (CSAC). Unlike many attention mechanisms, which can lose spatial acuity, the CSAC is designed to preserve spatial relationships throughout the network, much as standard convolutional layers do, while also capturing temporal dependencies. Evaluated on real-world urban traffic and precipitation datasets, BDMNN achieves a 12% reduction in Mean Squared Error (MSE) for urban traffic prediction and a 15% improvement in precipitation forecasting over leading deep learning benchmarks such as ConvLSTM, using comparable computational resources. These advances support robust ST forecasting for smart city management, disaster prevention, and resource optimization.
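
For readers unfamiliar with how a figure such as the 12% MSE reduction is usually computed, here is a minimal sketch, assuming the reduction is measured relative to the baseline's MSE. The numbers in the example are made up for illustration and are not results from the paper; `mse` and `relative_reduction` are hypothetical helper names.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error between two equally shaped arrays."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(diff ** 2))

def relative_reduction(baseline_mse, model_mse):
    """Fraction by which the model lowers MSE relative to the baseline."""
    return (baseline_mse - model_mse) / baseline_mse

# Illustrative values only: a baseline MSE of 0.50 against a model MSE of 0.44.
print(f"{relative_reduction(0.50, 0.44):.0%}")  # prints 12%
```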

2 Papers