Harnessing the Power of the Crowd to Increase Capacity for Data Science in the Social Sector
This work addresses data science capacity constraints in social sector organizations through crowdsourced competitions. Its contribution is incremental: it applies an existing competition model to new domains.
The paper presents three case studies in which data science competitions were used to solve specific problems in education, public health, and government innovation, such as automatically tagging school budget items and predicting restaurant hygiene violations. Reported results include improved accuracy and actionable insights.
We present three case studies of organizations that used a data science competition to answer a pressing question. The first is in education, where a nonprofit that creates smart school budgets wanted to automatically tag budget line items. The second is in public health, where a low-cost, nonprofit women's health care provider wanted to understand the effect of demographic and behavioral questions on predicting which services a woman would need. The third is in government innovation: using online restaurant reviews from Yelp, competitors built models to forecast which restaurants were most likely to have hygiene violations when visited by health inspectors. We close by reflecting on the unique benefits of the open, public competition model.