Explainable machine learning for predicting shellfish toxicity in the Adriatic Sea using long-term monitoring data of HABs
This work aims to improve early warning systems for shellfish toxicity in support of sustainable aquaculture practices. Its contribution is incremental: it applies existing explainable ML techniques to a new long-term dataset.
The study predicts shellfish toxicity caused by harmful algal blooms in the Adriatic Sea using a 28-year dataset. A random forest model best predicted diarrhetic shellfish poisoning events, and explainability methods identified the key predictors.
In this study, explainable machine learning (ML) techniques are applied to predict the toxicity of mussels in the Gulf of Trieste (Adriatic Sea) caused by harmful algal blooms. Using a newly created 28-year dataset containing records of toxic phytoplankton in mussel farming areas and toxin concentrations in mussels (Mytilus galloprovincialis), we train ML models to predict diarrhetic shellfish poisoning (DSP) events and evaluate their performance. The random forest model achieved the highest F1 score for predicting positive toxicity results. Explainability methods such as permutation importance and SHAP identified key species (Dinophysis fortii and D. caudata) and environmental factors (salinity, river discharge and precipitation) as the best predictors of DSP outbreaks. These findings are important for improving early warning systems and supporting sustainable aquaculture practices.
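The workflow named in the abstract (a random forest classifier scored with F1, then ranked with permutation importance) can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes scikit-learn, the data are synthetic, the column names are hypothetical stand-ins for the predictors mentioned above, and the rule that generates the labels is invented for illustration. SHAP is left out because it requires a separate package.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; the column names are hypothetical, not the study's schema.
rng = np.random.default_rng(0)
n_samples = 600
feature_names = ["d_fortii", "d_caudata", "salinity", "river_discharge", "precipitation"]
X = rng.normal(size=(n_samples, len(feature_names)))

# Assumed toy rule: the label depends mostly on the first two columns plus noise.
score = 1.5 * X[:, 0] + 1.0 * X[:, 1] - 0.5 * X[:, 2] + rng.normal(scale=0.5, size=n_samples)
y = (score > 1.0).astype(int)  # positive (toxic) samples are the minority class

# Stratified split keeps the share of positives similar in both partitions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

model = RandomForestClassifier(n_estimators=200, class_weight="balanced", random_state=0)
model.fit(X_train, y_train)

# F1 on the positive class: harmonic mean of precision and recall.
f1 = f1_score(y_test, model.predict(X_test))
print(f"F1 on held-out data: {f1:.3f}")

# Permutation importance: mean drop in F1 when one column is shuffled.
result = permutation_importance(
    model, X_test, y_test, scoring="f1", n_repeats=10, random_state=0
)
ranking = sorted(
    zip(feature_names, result.importances_mean), key=lambda pair: pair[1], reverse=True
)
for name, importance in ranking:
    print(f"{name}: {importance:.3f}")
```

Computing the importances on held-out data rather than on the training set measures how much each predictor contributes to generalisation. F1 is a natural scoring choice when positive toxicity results are rare, since plain accuracy would reward a model that always predicts "non-toxic".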