CV, LG · Apr 28, 2025

ClearVision: Leveraging CycleGAN and SigLIP-2 for Robust All-Weather Classification in Traffic Camera Imagery

arXiv:2504.19684v2 · 2 citations · h-index: 4
Originality: Incremental advance
AI Analysis

This work addresses the challenge of safe transportation by improving real-time weather detection from traffic cameras, offering a scalable and efficient solution for traffic management systems.

The paper tackles robust weather classification in traffic camera imagery, particularly in low-light nighttime conditions, by combining CycleGAN-based domain adaptation with SigLIP-2 and contrastive learning. The resulting model achieves 85.90% nighttime accuracy, narrows the day-night performance gap from 33.81 to 8.90 percentage points, and cuts training time by 89% and inference time by 80% relative to the EVA-02 baseline.
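The headline figures can be re-derived from the raw numbers quoted in the paper's abstract (97.21% day vs. 63.40% night for the baseline, 6 hours vs. 40 minutes of training, 15 vs. 3 seconds of inference). The sketch below is only an arithmetic check. The helper names `gap_points` and `percent_reduction` are hypothetical and do not come from the paper.

```python
def gap_points(day_acc: float, night_acc: float) -> float:
    """Day-night gap in percentage points (day accuracy minus night accuracy)."""
    return round(day_acc - night_acc, 2)


def percent_reduction(before: float, after: float) -> float:
    """Relative reduction from `before` to `after`, in percent."""
    return round(100.0 * (before - after) / before, 1)


# Baseline EVA-02 accuracies quoted in the abstract.
baseline_gap = gap_points(97.21, 63.40)       # 33.81 points

# Training time: 6 hours down to 40 minutes, both expressed in minutes.
training_cut = percent_reduction(6 * 60, 40)  # 88.9, reported as 89%

# Inference time: 15 seconds down to 3 seconds.
inference_cut = percent_reduction(15, 3)      # 80.0

print(baseline_gap, training_cut, inference_cut)
```

The 89% training-time figure is therefore a rounding of roughly 88.9%, while the 33.81-point gap and the 80% inference figure follow exactly from the quoted values.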

Adverse weather conditions challenge safe transportation, necessitating robust real-time weather detection from traffic camera imagery. We propose a novel framework combining CycleGAN-based domain adaptation with efficient contrastive learning to enhance weather classification, particularly in low-light nighttime conditions. Our approach leverages the lightweight SigLIP-2 model, which employs pairwise sigmoid loss to reduce computational demands, integrated with CycleGAN to transform nighttime images into day-like representations while preserving weather cues. Evaluated on an Iowa Department of Transportation dataset, the baseline EVA-02 model with CLIP achieves a per-class overall accuracy of 96.55% across three weather conditions (No Precipitation, Rain, Snow) and a day/night overall accuracy of 96.55%, but shows a significant day-night gap (97.21% day vs. 63.40% night). With CycleGAN, EVA-02 improves to 97.01% per-class accuracy and 96.85% day/night accuracy, boosting nighttime performance to 82.45%. Our Vision-SigLIP-2 + Text-SigLIP-2 + CycleGAN + Contrastive configuration excels in nighttime scenarios, achieving the highest nighttime accuracy of 85.90%, with 94.00% per-class accuracy and 93.35% day/night accuracy. This model reduces training time by 89% (from 6 hours to 40 minutes) and inference time by 80% (from 15 seconds to 3 seconds) compared to EVA-02. By narrowing the day-night performance gap from 33.81 to 8.90 percentage points, our framework provides a scalable, efficient solution for all-weather classification using existing camera infrastructure.
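To make the "pairwise sigmoid loss" mentioned in the abstract concrete, here is a minimal pure-Python sketch of the loss as formulated for the original SigLIP model: every image-text pair in a batch is scored independently as a binary match or non-match, so no batch-wide softmax normalization is needed. This is an illustration under stated assumptions, not the paper's training code. The function names, the toy embeddings, and the fixed `temperature=10.0` and `bias=-10.0` values (learnable parameters in practice) are assumptions made for the example.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(c * c for c in vec))
    return [c / norm for c in vec]


def pairwise_sigmoid_loss(image_embs, text_embs, temperature=10.0, bias=-10.0):
    """Sigmoid loss over all image-text pairs, averaged over the batch size.

    Pair (i, j) is labelled +1 when i == j (a matching pair) and -1 otherwise.
    """
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    total = 0.0
    for i, img in enumerate(imgs):
        for j, txt in enumerate(txts):
            similarity = sum(a * b for a, b in zip(img, txt))
            logit = temperature * similarity + bias
            label = 1.0 if i == j else -1.0
            total += log_sigmoid(label * logit)
    return -total / len(imgs)


# Toy batch (hypothetical 2-D embeddings): aligned pairs vs. swapped captions.
images = [[1.0, 0.0], [0.0, 1.0]]
aligned_loss = pairwise_sigmoid_loss(images, [[1.0, 0.0], [0.0, 1.0]])
swapped_loss = pairwise_sigmoid_loss(images, [[0.0, 1.0], [1.0, 0.0]])
print(aligned_loss < swapped_loss)  # True
```

Because each pair contributes its own term, the loss does not require gathering similarity scores across the whole batch for normalization, which is the property the abstract points to when it describes reduced computational demands.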
