The World Wide recipe: A community-centred framework for fine-grained data collection and regional bias operationalisation
This work addresses representational biases in AI systems that can reinforce stereotypes and contribute to erasure, particularly for underrepresented regions, though it is incremental in that it focuses on a single domain (food dishes).
The paper tackles regional bias in text-to-image (T2I) models by introducing a culturally aware data collection framework and the World Wide Dishes dataset. It finds that the models struggle to generate accurate and culturally sensitive dish images, and that outputs for US dishes are better than those for dishes from the investigated African countries.
We introduce the World Wide recipe, a framework for culturally aware and participatory data collection, and the resulting regionally diverse World Wide Dishes evaluation dataset. We also analyse bias operationalisation to highlight how current text-to-image (T2I) systems underperform across several dimensions: (in-)accuracy, (mis-)representation, and cultural (in-)sensitivity, with evidence from qualitative community-based observations and quantitative automated tools. We find that these T2I models generally do not produce quality outputs of dishes specific to various regions. This is true even for the US, which is typically considered better resourced in training data, although the generation of US dishes does outperform that of the investigated African countries. The models demonstrate a propensity to produce inaccurate and culturally misrepresentative, flattening, and insensitive outputs. These representational biases have the potential to further reinforce stereotypes and disproportionately contribute to erasure based on region. The dataset and code are available at https://github.com/oxai/world-wide-dishes.