Enhancing the interpretability of spatially variable N2O model predictions with soft sensors during wastewater treatment
For wastewater treatment operators, the work highlights that N2O soft sensor predictions are location- and dataset-dependent, reducing their generalizability.
This study analyzed how machine learning models predict spatially variable N2O emissions in wastewater treatment, achieving high accuracy (R² = 0.79–0.89 on real data, 0.97 ± 0.02 on simulated data), but found that feature importance varies with model, scenario, and measurement scale, limiting interpretability.
Model-based solutions for nitrous oxide (N2O) emissions from wastewater treatment plants (WWTPs) are informed by operational datasets designed for controlling nutrient levels in liquid waste, coupled with dedicated N2O measurement campaigns. We analysed how machine learning (ML) models predict disturbances to WWTP operation and spatially variable N2O emissions. A real dataset was used to validate the modelling framework, with N2O emissions predicted by four ML models (R² = 0.79–0.89). Monitoring campaigns for N2O were then simulated with a plant-wide mechanistic model to include additional sensors, site-level N2O datasets, and wastewater disturbances (n = 16). The ML models were highly accurate on the simulated data (R² = 0.97 ± 0.02, n = 80), but the feature importance depended on the model, the scenario, and the N2O measurement scale (reactor vs. WWTP). We argue that N2O soft sensor predictions are limited by the measuring location and by the methodological uncertainty of the dataset, both of which affect the interpretability of the model. Lastly, analysis of the mechanistic model structure exposed interactions between the autotrophic and heterotrophic pathways via nitric oxide, which can lead to overestimated aerobic nitrite production and bias the N2O pathway contributions.
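The dependence of feature importance on the N2O measurement scale can be illustrated with a minimal, purely synthetic sketch. It is not the authors' method or data: permutation importance on a two-feature linear model is swapped in as a simple stand-in for the ML models, the feature names (`DO`, `NH4`) and all coefficients are hypothetical assumptions chosen for illustration, and the "reactor" and "plant" targets are invented so that each is driven by a different feature.

```python
import random

def fit_ols_2d(X, y):
    """Least-squares fit of y ~ b0 + b1*x1 + b2*x2 via centred normal equations."""
    n = len(y)
    m1 = sum(r[0] for r in X) / n
    m2 = sum(r[1] for r in X) / n
    my = sum(y) / n
    s11 = sum((r[0] - m1) ** 2 for r in X)
    s22 = sum((r[1] - m2) ** 2 for r in X)
    s12 = sum((r[0] - m1) * (r[1] - m2) for r in X)
    s1y = sum((r[0] - m1) * (yi - my) for r, yi in zip(X, y))
    s2y = sum((r[1] - m2) * (yi - my) for r, yi in zip(X, y))
    det = s11 * s22 - s12 ** 2
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    b0 = my - b1 * m1 - b2 * m2
    return lambda row: b0 + b1 * row[0] + b2 * row[1]

def r2(model, X, y):
    """Coefficient of determination of the model on (X, y)."""
    my = sum(y) / len(y)
    ss_res = sum((yi - model(r)) ** 2 for r, yi in zip(X, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

def permutation_importance(model, X, y, rng, n_repeats=10):
    """Mean drop in R2 when one feature column is shuffled at a time."""
    base = r2(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            col = [r[j] for r in X]
            rng.shuffle(col)
            Xp = [r[:j] + [c] + r[j + 1:] for r, c in zip(X, col)]
            drops.append(base - r2(model, Xp, y))
        importances.append(sum(drops) / n_repeats)
    return importances

rng = random.Random(0)
# Synthetic, standardised features; columns are hypothetical "DO" and "NH4".
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(500)]
# ASSUMPTION: invented targets, reactor-scale N2O driven mostly by DO,
# plant-scale N2O driven mostly by NH4.
y_reactor = [2.0 * do + 0.3 * nh4 + rng.gauss(0, 0.2) for do, nh4 in X]
y_plant = [0.3 * do + 2.0 * nh4 + rng.gauss(0, 0.2) for do, nh4 in X]

imp_reactor = permutation_importance(fit_ols_2d(X, y_reactor), X, y_reactor, rng)
imp_plant = permutation_importance(fit_ols_2d(X, y_plant), X, y_plant, rng)
print("reactor-scale importance [DO, NH4]:", imp_reactor)
print("plant-scale importance   [DO, NH4]:", imp_plant)
```

With the same inputs and the same model class, the importance ranking flips between the two targets, which is the kind of scale dependence that makes a soft sensor's explanation specific to where N2O was measured.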