LG · Apr 30

Differential Subgroup Discovery: Characterizing Where Two Populations Differ, and Why

arXiv:2604.27741 · 11.8
Predicted impact top 90% in LG · last 90 days
Originality Incremental advance
AI Analysis

For practitioners in clinical analysis, model diagnostics, and treatment-effect studies, this work provides a principled method to pinpoint covariate combinations driving population-level gaps, with causal interpretability under certain conditions.

The paper introduces a formal framework for differential subgroup discovery (identifying subsets of the feature space where two populations differ most in a target outcome) and proposes DiffSub, a gradient-based method that finds interpretable subgroups in tabular data. Across synthetic benchmarks, medical case studies, model-error analyses, and treatment-effect settings, DiffSub identifies subgroups that show where population differences arise and why.

We study the problem of understanding where two populations differ within a feature space, which we formalize in the concept of a differential subgroup: a subset of individuals from both populations who, despite sharing similar characteristics, exhibit exceptional differences in a target outcome. Differential subgroups reveal the regions of the feature space where population-level gaps are most pronounced and can help practitioners identify the covariate combinations that are structurally responsible for these differences, e.g., in clinical analysis, model diagnostics, or treatment-effect studies. We introduce a general optimization objective for discovering differential subgroups and establish conditions under which the resulting subgroups admit a causal interpretation of population differences. We propose DiffSub, a gradient-based approach that discovers interpretable differential subgroups in tabular data. Across synthetic benchmarks, medical case studies, model-error analyses, and treatment-effect settings, DiffSub identifies informative subgroups that reveal where population differences arise and why.
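To make the notion of a differential subgroup concrete, the following is a minimal sketch and not the paper's objective or the DiffSub algorithm. It assumes a subgroup is an axis-aligned box in feature space, measures the gap as the difference in mean outcome between the two populations inside that box, and scores it with a coverage-weighted quality measure of the kind common in classical subgroup discovery. The function names, the `alpha` exponent, and the synthetic data with a planted subgroup are all illustrative assumptions.

```python
import numpy as np

def subgroup_mask(X, lower, upper):
    """Boolean membership for an axis-aligned box: lower <= x <= upper on every feature."""
    return np.all((X >= lower) & (X <= upper), axis=1)

def differential_gap(y, group, mask):
    """Mean outcome of population 1 minus mean outcome of population 0, inside the subgroup."""
    in_a = mask & (group == 1)
    in_b = mask & (group == 0)
    if not in_a.any() or not in_b.any():
        return 0.0  # gap is undefined if one population is absent; treat as no evidence
    return float(y[in_a].mean() - y[in_b].mean())

def score(y, group, mask, alpha=0.5):
    """Hypothetical quality measure: coverage**alpha * |gap| (an assumption, not the paper's)."""
    coverage = mask.mean()
    return float(coverage ** alpha * abs(differential_gap(y, group, mask)))

# Synthetic data: two populations sharing one feature distribution,
# with an outcome gap planted only in the region x0 > 0.6 and x1 < 0.4.
rng = np.random.default_rng(0)
n = 4000
X = rng.uniform(0.0, 1.0, size=(n, 2))
group = rng.integers(0, 2, size=n)
planted = (X[:, 0] > 0.6) & (X[:, 1] < 0.4)
y = rng.normal(0.0, 0.1, size=n) + 1.0 * (planted & (group == 1))

everyone = np.ones(n, dtype=bool)
box = subgroup_mask(X, lower=np.array([0.6, 0.0]), upper=np.array([1.0, 0.4]))

gap_all = differential_gap(y, group, everyone)  # diluted population-level gap
gap_box = differential_gap(y, group, box)       # gap inside the planted region
print(f"gap overall: {gap_all:.3f}, gap in box: {gap_box:.3f}")
print(f"score overall: {score(y, group, everyone):.3f}, score in box: {score(y, group, box):.3f}")
```

In this toy setting the population-level gap is a diluted version of the gap inside the planted box, which is the situation the abstract describes: the overall difference exists, but it is concentrated in one covariate combination. A gradient-based method would presumably replace the hard box membership with a differentiable relaxation so that the bounds can be optimized directly; that step is not shown here.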
