LG · CV · Nov 13, 2025

Intrinsic Dimensionality as a Model-Free Measure of Class Imbalance

arXiv:2511.10475v1 · h-index: 20 · Has Code
Originality: Incremental advance
AI Analysis

The proposed measure gives machine learning researchers and practitioners a more accurate and efficient way to quantify class imbalance. The advance is incremental, since the measure is designed to plug into existing imbalance mitigation methods rather than replace them.

The paper tackles the problem of quantifying class imbalance in classification tasks by proposing Intrinsic Dimensionality (ID) as a model-free measure, showing that it consistently outperforms traditional cardinality-based methods across five datasets with diverse imbalance ratios.

Imbalance in classification tasks is commonly quantified by the cardinalities of examples across classes. This, however, disregards the presence of redundant examples and inherent differences in the learning difficulties of classes. Alternatively, one can use complex measures such as training loss and uncertainty, which, however, depend on training a machine learning model. Our paper proposes using data Intrinsic Dimensionality (ID) as an easy-to-compute, model-free measure of imbalance that can be seamlessly incorporated into various imbalance mitigation methods. Our results across five different datasets with a diverse range of imbalance ratios show that ID consistently outperforms cardinality-based re-weighting and re-sampling techniques used in the literature. Moreover, we show that combining ID with cardinality can further improve performance. Code: https://github.com/cagries/IDIM.
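As a rough illustration of what a model-free, ID-based imbalance measure could look like in practice, the sketch below estimates one intrinsic dimensionality value per class and turns those values into class weights. It rests on two assumptions that the abstract does not confirm: the estimator is a Levina-Bickel style maximum-likelihood estimate over k nearest neighbours, and the weights are simply proportional to each class's ID, normalised to a mean of 1. The function names `mle_intrinsic_dim` and `id_class_weights` are hypothetical. The authors' actual estimator and weighting scheme are in the linked repository.

```python
import numpy as np


def mle_intrinsic_dim(X, k=10):
    """Levina-Bickel style MLE of intrinsic dimensionality (assumed estimator).

    Returns the mean of the per-point local estimates.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if n <= k:
        raise ValueError("need more than k points per class")
    # Pairwise Euclidean distances; O(n^2 * d) memory, fine for a small sketch.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # exclude each point from its own neighbours
    # Distances to the k nearest neighbours, ascending; clip to avoid log(0).
    knn = np.maximum(np.sort(dist, axis=1)[:, :k], 1e-12)
    # Mean of log(T_k / T_j) for j = 1..k-1 is the inverse local estimate.
    inv_local = np.log(knn[:, -1:] / knn[:, :-1]).mean(axis=1)
    local_id = 1.0 / np.maximum(inv_local, 1e-12)
    return float(local_id.mean())


def id_class_weights(X, y, k=10):
    """Class weights proportional to per-class ID (assumed weighting rule)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    ids = np.array([mle_intrinsic_dim(X[y == c], k=k) for c in classes])
    weights = ids / ids.mean()  # normalise so the mean weight is 1
    return dict(zip(classes.tolist(), weights.tolist()))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Class 0: many points on a 2-D subspace of a 10-D space (redundant data).
    low = np.zeros((300, 10))
    low[:, :2] = rng.normal(size=(300, 2))
    # Class 1: fewer points spread over 8 dimensions.
    high = np.zeros((100, 10))
    high[:, :8] = rng.normal(size=(100, 8))
    X = np.vstack([low, high])
    y = np.array([0] * 300 + [1] * 100)
    print(id_class_weights(X, y))
```

In this toy setup a cardinality-based scheme would weight the classes purely by the 300 versus 100 counts, whereas the ID-based weights depend on how many dimensions each class actually occupies. The resulting dictionary could be passed to any loss or sampler that accepts per-class weights, or combined with cardinality-based weights, which the abstract reports can further improve performance.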

