Trust in Data Science: Collaboration, Translation, and Accountability in Corporate Data Science Projects
This paper addresses the problem of building trustworthy data science systems in real-world corporate settings and offers insights for both practitioners and researchers, although its contribution is incremental: it builds on existing work in CSCW and critical data studies.
The study examined how trust is established in corporate data science projects by identifying four tensions in applied work and showing that trust depends on collaborative practices like negotiation and translation, not just technical processes.
The trustworthiness of data science systems in applied and real-world settings emerges from the resolution of specific tensions through situated, pragmatic, and ongoing forms of work. Drawing on research in CSCW, critical data studies, and history and sociology of science, and six months of immersive ethnographic fieldwork with a corporate data science team, we describe four common tensions in applied data science work: (un)equivocal numbers, (counter)intuitive knowledge, (in)credible data, and (in)scrutable models. We show how organizational actors establish and re-negotiate trust under messy and uncertain analytic conditions through practices of skepticism, assessment, and credibility. Highlighting the collaborative and heterogeneous nature of real-world data science, we show how the management of trust in applied corporate data science settings depends not only on pre-processing and quantification, but also on negotiation and translation. We conclude by discussing the implications of our findings for data science research and practice, both within and beyond CSCW.