CY LG | Jun 18, 2025

Using Machine Learning in Analyzing Air Quality Discrepancies of Environmental Impact

Shuangbao Paul Wang, Lucas Yang, Rahouane Chouchane, Jin Guo, Michael Bailey

arXiv:2506.17319v1

2024 International Conference on AI x Data and Knowledge Engineering (AIxDKE)

Originality Synthesis-oriented

AI Analysis

This research highlights environmental injustice in Baltimore, showing how historical policies continue to affect air quality for low-income and minority residents, though it is incremental in applying existing methods to new data.

The study analyzed air pollution disparities in Baltimore using machine learning on data from biased insurance risk estimates, demographics, and pollution concentrations, finding clear associations between pollution levels and biased methods, with significant disparities by income and ethnicity.

In this study, we apply machine learning and software engineering to analyze air pollution levels in the City of Baltimore. The data model was fed with three primary data sources: 1) a biased method of estimating insurance risk used by the Home Owners' Loan Corporation (HOLC), 2) demographics of Baltimore residents, and 3) census data estimates of NO2 and PM2.5 concentrations. The dataset covers 650,643 Baltimore residents out of 44.7 million residents in 202 major US cities. The results show that air pollution levels have a clear association with the biased insurance estimating method. Large disparities in NO2 levels are present between more desirable and low-income blocks. Similar disparities in air pollution levels exist across residents' ethnicity. As Baltimore's population consists of a greater proportion of people of color, the findings reveal how decades-old policies have continued to discriminate and affect the quality of life of Baltimore citizens today.
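As an illustration of the kind of comparison the abstract describes, the following is a minimal sketch of computing a mean NO2 level per risk category and the gap between the most and least favorably rated categories. It is not the authors' pipeline: the abstract does not specify the model or the data layout. The category labels "A" through "D", the function names `mean_by_grade` and `disparity`, and every number in `TOY_BLOCKS` are assumptions made up for this example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy records: (risk category, NO2 concentration in ppb).
# These values are invented for illustration and are NOT from the study.
TOY_BLOCKS = [
    ("A", 10.0), ("A", 12.0),
    ("B", 13.0), ("B", 15.0),
    ("C", 16.0), ("C", 18.0),
    ("D", 19.0), ("D", 21.0),
]


def mean_by_grade(records):
    """Group pollutant readings by category and return the mean of each group."""
    groups = defaultdict(list)
    for grade, value in records:
        groups[grade].append(value)
    return {grade: mean(values) for grade, values in sorted(groups.items())}


def disparity(means, low="A", high="D"):
    """Difference in mean concentration between two categories."""
    return means[high] - means[low]


means = mean_by_grade(TOY_BLOCKS)
gap = disparity(means)
print(means)
print(gap)
```

A real analysis would likely weight each block by its population and test whether the gap is statistically significant; this sketch shows only the unweighted descriptive step.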
