Optimal control towards sustainable wastewater treatment plants based on multi-agent reinforcement learning
The study addresses the sustainable operation of WWTPs, which are resource-intensive and themselves a source of pollution. The contribution is incremental, since it applies a known AI method to a specific domain.
This study optimized the operation of wastewater treatment plants (WWTPs) to reduce environmental impacts and costs using multi-agent deep reinforcement learning. Relative to a baseline, cost fell to 0.890 CNY/m³-ww, energy consumption to 0.530 kWh/m³-ww, and greenhouse gas emissions to 2.491 kg CO₂-eq/m³-ww.
Wastewater treatment plants (WWTPs) are designed to eliminate pollutants and alleviate environmental pollution. However, the construction and operation of WWTPs consume resources, emit greenhouse gases (GHGs), and produce residual sludge, and thus require further optimization. WWTPs are complex to control and optimize because of their high nonlinearity and variability. This study used a novel technique, multi-agent deep reinforcement learning, to simultaneously optimize dissolved oxygen and chemical dosage in a WWTP. The reward function was designed from a life cycle perspective to achieve sustainable optimization. Five scenarios were considered: a baseline, three scenarios differing in effluent quality, and a cost-oriented scenario. The results show that optimization based on life cycle assessment (LCA) has lower environmental impacts than the baseline scenario, with cost, energy consumption, and greenhouse gas emissions falling to 0.890 CNY/m³-ww, 0.530 kWh/m³-ww, and 2.491 kg CO₂-eq/m³-ww, respectively. The cost-oriented control strategy performs comparably overall to the LCA-driven strategy: it sacrifices environmental benefits but achieves a lower cost of 0.873 CNY/m³-ww. The retrofitting of WWTPs based on resources should be implemented with impact transfer in mind. Specifically, the LCA SW scenario reduces eutrophication potential by 10 kg PO₄-eq relative to the baseline within 10 days, but significantly increases other indicators. The major contributors to each indicator are identified for future study and improvement. Finally, the study notes that novel dynamic control strategies require advanced sensors or large amounts of data, so the selection of control strategies should also consider economic and ecological conditions.
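To make the idea of a reward function designed from a life cycle perspective more concrete, the following is a minimal Python sketch, not the study's actual formulation. It assumes the reward is the negative of a weighted sum of cost, energy use, and GHG emissions per cubic metre of wastewater, plus a penalty when the effluent exceeds its limit. The names `StepOutcome`, `WEIGHTS`, `VIOLATION_PENALTY`, and `lca_reward`, as well as all weight and penalty values, are hypothetical and do not come from the source.

```python
from dataclasses import dataclass


@dataclass
class StepOutcome:
    """Per-step plant outcome (hypothetical structure, for illustration only)."""
    cost_cny: float            # operating cost, CNY per m3 of wastewater
    energy_kwh: float          # energy consumption, kWh per m3 of wastewater
    ghg_kg_co2eq: float        # GHG emissions, kg CO2-eq per m3 of wastewater
    effluent_violation: float  # amount by which effluent exceeds its limit (<= 0 if compliant)


# Assumed weights and penalty; the study's actual values are not given in the abstract.
WEIGHTS = {"cost": 1.0, "energy": 1.0, "ghg": 1.0}
VIOLATION_PENALTY = 10.0


def lca_reward(outcome: StepOutcome) -> float:
    """Return a reward that is higher when impacts and violations are lower."""
    impact = (
        WEIGHTS["cost"] * outcome.cost_cny
        + WEIGHTS["energy"] * outcome.energy_kwh
        + WEIGHTS["ghg"] * outcome.ghg_kg_co2eq
    )
    # Only positive exceedances are penalized; compliant steps add no penalty.
    penalty = VIOLATION_PENALTY * max(0.0, outcome.effluent_violation)
    return -(impact + penalty)


if __name__ == "__main__":
    # Values reported for the LCA-based optimization, used here only as sample inputs.
    reported = StepOutcome(cost_cny=0.890, energy_kwh=0.530,
                           ghg_kg_co2eq=2.491, effluent_violation=0.0)
    print(lca_reward(reported))
```

In a multi-agent setting, one plausible design (again an assumption) is for the dissolved-oxygen agent and the chemical-dosage agent to share this single reward, so that neither agent improves its own objective by shifting impacts onto the other.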