CY | AI | Sep 10, 2020

On the Fairness of 'Fake' Data in Legal AI

arXiv:2009.04640v2
AI Analysis

This paper addresses fairness concerns in legal AI systems used by courts and legal professionals, but it is largely conceptual and offers no empirical validation.

The paper examines how pre-processing methods to correct biased training data in legal AI can create 'fake' data that distorts cases, potentially undermining legal precedent and transparency. It recommends alternative fairness approaches that modify classifiers or correct outputs instead of altering input data.

The economics of smaller budgets and larger case numbers necessitates the use of AI in legal proceedings. We examine the concept of disparate impact and how biases in the training data lead to the search for fairer AI. This paper seeks to begin the discourse on what such an implementation would actually look like, with a criticism of pre-processing methods in a legal context. We outline how pre-processing is used to correct biased data and then examine the legal implications of effectively changing cases in order to achieve a fairer outcome, including the black box problem and the slow encroachment on legal precedent. Finally, we present recommendations on how to avoid the pitfalls of pre-processed data with methods that either modify the classifier or correct the output in the final step.
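
To make the idea of correcting the output in the final step concrete, the following is a minimal sketch, not the paper's method. It assumes a common way of measuring disparate impact (the ratio of favourable-outcome rates between two groups) and uses group-specific decision thresholds as one illustrative post-processing technique. The scores, group labels, thresholds, and function names are all hypothetical.

```python
from typing import Dict, List, Sequence


def positive_rate(preds: Sequence[int], groups: Sequence[str], group: str) -> float:
    """Share of favourable (1) decisions among members of one group."""
    selected = [p for p, g in zip(preds, groups) if g == group]
    return sum(selected) / len(selected)


def disparate_impact(preds: Sequence[int], groups: Sequence[str],
                     protected: str, reference: str) -> float:
    """Ratio of favourable-outcome rates: protected group over reference group."""
    return positive_rate(preds, groups, protected) / positive_rate(preds, groups, reference)


def decide(scores: Sequence[float], groups: Sequence[str],
           thresholds: Dict[str, float]) -> List[int]:
    """Turn classifier scores into decisions using a per-group threshold.

    Only the final decision step changes; the input records and the
    trained classifier's scores are left exactly as they were.
    """
    return [1 if s >= thresholds[g] else 0 for s, g in zip(scores, groups)]


# Hypothetical classifier scores for eight cases in two groups (assumed data).
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.55, 0.45, 0.3]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

# One shared threshold for everyone.
before = decide(scores, groups, {"a": 0.6, "b": 0.6})
di_before = disparate_impact(before, groups, protected="b", reference="a")

# Output correction: a lower threshold for group "b" (illustrative value).
after = decide(scores, groups, {"a": 0.6, "b": 0.45})
di_after = disparate_impact(after, groups, protected="b", reference="a")

print(f"disparate impact before correction: {di_before:.2f}")
print(f"disparate impact after correction:  {di_after:.2f}")
```

In this sketch the case records and scores are never altered, which is the property the abstract contrasts with pre-processing: the adjustment is confined to an explicit, inspectable final step.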
