CV · Sep 20, 2025

PM25Vision: A Large-Scale Benchmark Dataset for Visual Estimation of Air Quality

arXiv:2509.16519v1 · 1 citation
Originality: Synthesis-oriented
AI Analysis

PM25Vision gives researchers in environmental monitoring and computer vision a benchmark for image-based air quality estimation. The contribution is incremental: it focuses on dataset creation rather than novel methods.

The authors tackled the problem of estimating PM2.5 concentrations from street-level images by introducing PM25Vision, a large-scale dataset with over 11,114 images matched to PM2.5 readings across 3,261 stations over 11 years, achieving spatial accuracy of 5 kilometers.

We introduce PM25Vision (PM25V), the largest and most comprehensive dataset to date for estimating air quality - specifically PM2.5 concentrations - from street-level images. The dataset contains over 11,114 images matched with timestamped and geolocated PM2.5 readings across 3,261 AQI monitoring stations and 11 years, significantly exceeding the scale of previous benchmarks. The spatial accuracy of this dataset has reached 5 kilometers, far exceeding the city-level accuracy of many datasets. We describe the data collection, synchronization, and cleaning pipelines, and provide baseline model performances using CNN and transformer architectures. Our dataset is publicly available.
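As a rough illustration of what matching an image to a "timestamped and geolocated" reading within 5 kilometers could look like, here is a minimal sketch. It is not the authors' pipeline: the `Reading` record, the `match_image` helper, and the one-hour time window are hypothetical choices made for this example; only the 5 km radius comes from the abstract.

```python
import math
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, Optional

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


@dataclass
class Reading:
    """Hypothetical record for one PM2.5 measurement at a monitoring station."""
    station_id: str
    lat: float
    lon: float
    time: datetime
    pm25: float  # micrograms per cubic metre


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two (lat, lon) points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def match_image(
    img_lat: float,
    img_lon: float,
    img_time: datetime,
    readings: Iterable[Reading],
    max_km: float = 5.0,                      # radius taken from the abstract
    max_dt: timedelta = timedelta(hours=1),   # assumed window, not from the paper
) -> Optional[Reading]:
    """Return the nearest reading within max_km and max_dt, or None if none qualifies."""
    best: Optional[Reading] = None
    best_dist = float("inf")
    for r in readings:
        if abs(r.time - img_time) > max_dt:
            continue  # reading is too far away in time
        dist = haversine_km(img_lat, img_lon, r.lat, r.lon)
        if dist <= max_km and dist < best_dist:
            best, best_dist = r, dist
    return best
```

An image with no station reading inside both the distance and time limits returns `None`, so a cleaning step could simply drop it rather than assign a distant or stale label.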

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
