DB, CY, LG | Apr 10, 2024

FairEM360: A Suite for Responsible Entity Matching

arXiv:2404.07354v21 citations | h-index: 21 | Proc VLDB Endow
Originality: Synthesis-oriented
AI Analysis

This paper addresses fairness issues in entity matching for data pipeline practitioners, but its contribution is incremental: it builds on existing fairness measures and ensemble methods.

The paper tackles the problem of unintentional biases in entity matching, which can affect downstream data quality, by introducing FairEM360, a framework that audits fairness, explains unfairness, and provides resolutions through human-in-the-loop feedback.

Entity matching is one of the earliest tasks that occur in the big data pipeline and is alarmingly exposed to unintentional biases that affect the quality of data. Identifying and mitigating the biases that exist in the data or are introduced by the matcher at this stage can contribute to promoting fairness in downstream tasks. This demonstration showcases FairEM360, a framework for 1) auditing the output of entity matchers across a wide range of fairness measures and paradigms, 2) providing potential explanations for the underlying reasons for unfairness, and 3) providing resolutions for the unfairness issues through an exploratory process with human-in-the-loop feedback, utilizing an ensemble of matchers. We aspire for FairEM360 to contribute to the prioritization of fairness as a key consideration in the evaluation of EM pipelines.
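
To make the auditing step concrete, here is a minimal sketch of a group-wise audit on a matcher's output. It is not FairEM360's API: the function name `audit_true_positive_rate`, the input layout (group, true label, predicted label), the choice of true positive rate as the fairness measure, and the 0.2 disparity threshold are all assumptions made for illustration.

```python
from collections import defaultdict


def audit_true_positive_rate(pairs, threshold=0.2):
    """Compare each group's true positive rate with the overall rate.

    pairs: iterable of (group, y_true, y_pred), where the labels are 1 for
    "match" and 0 for "non-match". The layout is hypothetical.
    threshold: assumed cutoff; a group is flagged when its TPR falls below
    the overall TPR by more than this amount.
    """
    true_positives = defaultdict(int)
    positives = defaultdict(int)
    total_true_positives = 0
    total_positives = 0

    for group, y_true, y_pred in pairs:
        if y_true != 1:
            continue  # TPR only looks at pairs that are true matches
        positives[group] += 1
        total_positives += 1
        if y_pred == 1:
            true_positives[group] += 1
            total_true_positives += 1

    overall = total_true_positives / total_positives if total_positives else 0.0

    report = {}
    for group, count in positives.items():
        tpr = true_positives[group] / count
        disparity = overall - tpr  # positive means the group is underserved
        report[group] = {
            "tpr": tpr,
            "disparity": disparity,
            "unfair": disparity > threshold,
        }
    return overall, report
```

Calling the function on labeled candidate pairs returns the overall rate plus a per-group report. A fuller audit would repeat the same comparison for other measures, such as false positive rate or positive predictive value.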

Code Implementations: 1 repo
