Using novel data and ensemble models to improve automated labeling of Sustainable Development Goals
This work helps researchers and policymakers improve automated monitoring of SDGs by highlighting biases in existing labeling systems and advocating for ensemble methods, although the contribution is incremental because it builds on existing labeling approaches.
The study addressed the variability and biases in existing text-based labeling systems for Sustainable Development Goals (SDGs) by comparing them across different text sources, and found that an ensemble model outperformed all current systems in labeling performance.
A number of text-based labeling systems have been proposed to help monitor work on the United Nations (UN) Sustainable Development Goals (SDGs). Here, we present a systematic comparison of systems using a variety of text sources and show that systems differ considerably in their sensitivity (i.e., true-positive rate) and specificity (i.e., true-negative rate), have systematic biases (e.g., are more sensitive to specific SDGs relative to others), and are susceptible to the type and amount of text analyzed. We then show that an ensemble model that pools labeling systems alleviates some of these limitations, exceeding the labeling performance of all currently available systems. We conclude that researchers and policymakers should care about the choice of labeling system and that ensemble methods should be favored when drawing conclusions about the absolute and relative prevalence of work on the SDGs based on automated methods.
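To make the two evaluation measures and the pooling idea concrete, the following is a minimal sketch, not the ensemble used in the study. It assumes each labeling system returns a binary decision per document for a single SDG, and it uses a simple majority vote as a stand-in pooling rule. The function names, the system names, and the toy labels are all hypothetical and chosen only for illustration; they are not results.

```python
from typing import Dict, List, Tuple


def sensitivity_specificity(predicted: List[bool], actual: List[bool]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels on one SDG.

    Sensitivity is the true-positive rate; specificity is the true-negative rate.
    """
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    tn = sum(1 for p, a in zip(predicted, actual) if not p and not a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


def majority_vote(system_labels: Dict[str, List[bool]]) -> List[bool]:
    """Pool several systems: a document is labeled if more than half of them label it.

    Assumption: majority voting is an illustrative pooling rule only.
    """
    per_document_votes = zip(*system_labels.values())
    return [sum(votes) * 2 > len(votes) for votes in per_document_votes]


# Hypothetical toy data: six documents, one SDG, three made-up systems.
actual = [True, True, False, False, True, False]
systems = {
    "system_a": [True, False, False, False, True, True],
    "system_b": [True, True, True, False, False, False],
    "system_c": [False, True, False, True, True, False],
}

ensemble = majority_vote(systems)

for name, labels in systems.items():
    print(name, sensitivity_specificity(labels, actual))
print("ensemble", sensitivity_specificity(ensemble, actual))
```

In this contrived example each system makes one false-positive and one false-negative error on different documents, so the pooled vote cancels them out. Real systems tend to make correlated errors, so the gain from pooling would typically be smaller, and a learned ensemble may weight systems unequally.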