Data Justice in Practice: A Guide for Developers
This guide addresses the need for practical guidance that helps developers and organisations mitigate social harms in AI/ML systems, though its contribution is incremental because it builds on existing data justice discussions.
The paper tackles discrimination and inequity in data-intensive technologies by giving developers a guide to operationalising data justice principles throughout the AI/ML lifecycle. The resulting framework combines six pillars of data justice with the five SAFE-D principles to support responsible and equitable system design.
The Advancing Data Justice Research and Practice project aims to broaden understanding of the social, historical, cultural, political, and economic forces that contribute to discrimination and inequity in contemporary ecologies of data collection, governance, and use. This is the consultation draft of a guide for developers and organisations that are producing, procuring, or using data-intensive technologies. In the first section, we introduce the field of data justice, from its early discussions to more recent proposals to relocate understandings of what data justice means. This section includes a description of the six pillars of data justice around which this guidance revolves. Next, to support developers in designing, developing, and deploying responsible and equitable data-intensive and AI/ML systems, we outline the AI/ML project lifecycle through a sociotechnical lens. To support the operationalisation of data justice throughout the entirety of the AI/ML lifecycle and within data innovation ecosystems, we then present five overarching principles of responsible, equitable, and trustworthy data research and innovation practices, the SAFE-D principles: Safety, Accountability, Fairness, Explainability, and Data Quality, Integrity, Protection, and Privacy. The final section presents guiding questions that will help developers both address data justice issues throughout the AI/ML lifecycle and engage in reflective innovation practices that ensure the design, development, and deployment of responsible and equitable data-intensive and AI/ML systems.