LG · DB · Sep 24, 2024

Evaluating Blocking Biases in Entity Matching

arXiv:2409.16410v1 · 4 citations · h-index: 3
Originality: Incremental advance
AI Analysis

This work addresses fairness in entity matching for data integration tasks. It is an incremental extension of existing work on blocking methods.

The paper tackles fairness in the blocking techniques used for entity matching, which can inadvertently favor certain demographic groups. It extends traditional blocking metrics to incorporate fairness and provides a framework for assessing bias. The experimental analysis evaluates the effectiveness and fairness of various blocking methods and highlights the importance of considering fairness to ensure equitable outcomes in data integration.

Entity Matching (EM) is crucial for identifying equivalent data entities across different sources, a task that becomes increasingly challenging with the growth and heterogeneity of data. Blocking techniques, which reduce the computational complexity of EM, play a vital role in making this process scalable. Despite advancements in blocking methods, the issue of fairness, where blocking may inadvertently favor certain demographic groups, has been largely overlooked. This study extends traditional blocking metrics to incorporate fairness, providing a framework for assessing bias in blocking techniques. Through experimental analysis, we evaluate the effectiveness and fairness of various blocking methods, offering insights into their potential biases. Our findings highlight the importance of considering fairness in EM, particularly in the blocking phase, to ensure equitable outcomes in data integration tasks.
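
To make the idea of a fairness-aware blocking metric concrete, here is a minimal sketch in Python. It is not the paper's framework, whose exact metric definitions are not given on this page. It assumes one common blocking metric, pair completeness (the share of true matches that survive blocking), and computes it separately per demographic group so the gap between groups can be inspected. The function names, the toy blocks, and the rule that a pair counts toward a group if either record belongs to it are all assumptions made for illustration.

```python
from itertools import combinations


def candidate_pairs(blocks):
    """Return the set of unordered record-id pairs that share at least one block."""
    pairs = set()
    for ids in blocks.values():
        # Sorting makes every pair a canonical (small_id, large_id) tuple.
        for a, b in combinations(sorted(ids), 2):
            pairs.add((a, b))
    return pairs


def pair_completeness(candidates, true_matches):
    """Fraction of true matching pairs that are kept as candidates by blocking."""
    if not true_matches:
        return float("nan")
    return len(candidates & true_matches) / len(true_matches)


def pair_completeness_by_group(candidates, true_matches, group_of):
    """Pair completeness per group.

    Assumption: a true match counts toward a group if either of its
    records belongs to that group.
    """
    matches_by_group = {}
    for pair in true_matches:
        for group in {group_of[pair[0]], group_of[pair[1]]}:
            matches_by_group.setdefault(group, set()).add(pair)
    return {
        group: pair_completeness(candidates, matches)
        for group, matches in matches_by_group.items()
    }


# Hypothetical toy data: blocking key -> record ids placed in that block.
blocks = {"smi": [1, 2], "gar": [3], "gra": [4], "lee": [5, 6]}
# Ground-truth matches as (small_id, large_id) tuples.
true_matches = {(1, 2), (3, 4), (5, 6)}
# Hypothetical group label for each record.
group_of = {1: "A", 2: "A", 3: "B", 4: "B", 5: "A", 6: "A"}

candidates = candidate_pairs(blocks)
overall = pair_completeness(candidates, true_matches)
per_group = pair_completeness_by_group(candidates, true_matches, group_of)
disparity = max(per_group.values()) - min(per_group.values())

print(f"overall pair completeness: {overall:.2f}")
print(f"per group: {per_group}")
print(f"disparity (max - min): {disparity:.2f}")
```

In this toy case the blocker keeps two of the three true matches overall, but it keeps every match for group A and none for group B, because records 3 and 4 land in different blocks. An aggregate score alone would hide that gap, which is the kind of bias a group-wise metric is meant to surface.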

Code Implementations: 1 repo