Adaptive Thresholding for Multi-Label Classification via Global-Local Signal Fusion
This work addresses accurate multi-label classification for applications such as text categorization. The contribution appears incremental: it builds on existing thresholding techniques, and its novelty lies in the fusion of global and local signals and in integrating the thresholds into the loss.
The paper tackles multi-label classification under class imbalance and label noise by introducing an adaptive thresholding mechanism that fuses global and local signals. It reports a macro-F1 of 0.1712 on the AmazonCat-13K benchmark, outperforming tree-based and pretrained transformer-based methods.
Multi-label classification (MLC) requires predicting multiple labels per sample, often under heavy class imbalance and noisy conditions. Traditional approaches apply fixed thresholds or treat labels independently, overlooking context and global rarity. We introduce an adaptive thresholding mechanism that fuses global (IDF-based) and local (KNN-based) signals to produce per-label, per-instance thresholds. Instead of applying these as hard cutoffs, we treat them as differentiable penalties in the loss, providing smooth supervision and better calibration. Our architecture is lightweight, interpretable, and highly modular. On the AmazonCat-13K benchmark, it achieves a macro-F1 of 0.1712, substantially outperforming tree-based and pretrained transformer-based methods. We release full code for reproducibility and future extensions.
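The abstract does not give the fusion rule or the exact penalty, so the following is only a minimal NumPy sketch of how per-label, per-instance thresholds from a global IDF signal and a local KNN signal might be built and used as a smooth penalty. The function names, the linear fusion rule, the choice to lower thresholds for rare or locally supported labels, and the constants `base`, `alpha`, `beta`, `lam`, and `sharpness` are all assumptions made for illustration, not the authors' formulation.

```python
import numpy as np

def idf_signal(Y):
    """Global signal: smoothed inverse document frequency per label, scaled to [0, 1]."""
    n = Y.shape[0]
    df = Y.sum(axis=0)                      # number of samples carrying each label
    idf = np.log((1.0 + n) / (1.0 + df))    # rarer labels get larger values
    return idf / max(idf.max(), 1e-12)

def knn_signal(X_train, Y_train, X_query, k=3):
    """Local signal: fraction of each query's k nearest training points carrying each label."""
    d = np.linalg.norm(X_query[:, None, :] - X_train[None, :, :], axis=2)
    nn = np.argsort(d, axis=1)[:, :k]       # indices of the k nearest neighbours
    return Y_train[nn].mean(axis=1)         # shape: (n_query, n_labels)

def adaptive_thresholds(g, l, base=0.5, alpha=0.2, beta=0.2):
    """Assumed fusion rule: lower the threshold for rare and for locally supported labels."""
    t = base - alpha * g[None, :] - beta * l
    return np.clip(t, 0.05, 0.95)

def threshold_penalty_loss(probs, Y, T, lam=1.0, sharpness=10.0):
    """Binary cross-entropy plus a softplus penalty on the wrong side of the threshold."""
    p = np.clip(probs, 1e-7, 1.0 - 1e-7)
    bce = -(Y * np.log(p) + (1.0 - Y) * np.log(1.0 - p)).mean()
    # Positives should score above T, negatives below it; margin > 0 means a violation.
    margin = np.where(Y == 1, T - p, p - T)
    penalty = np.logaddexp(0.0, sharpness * margin).mean() / sharpness
    return bce + lam * penalty

# Toy data (hypothetical): 6 training points, 3 labels, 2 query points.
X_train = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]], dtype=float)
Y_train = np.array([[1, 0, 0], [1, 0, 0], [1, 1, 0],
                    [0, 0, 1], [0, 1, 1], [0, 0, 1]], dtype=float)
X_query = np.array([[0.2, 0.2], [5.5, 5.5]])
Y_query = np.array([[1, 0, 0], [0, 0, 1]], dtype=float)

g = idf_signal(Y_train)
l = knn_signal(X_train, Y_train, X_query, k=3)
T = adaptive_thresholds(g, l)

good = np.array([[0.9, 0.1, 0.1], [0.1, 0.1, 0.9]])
loss_good = threshold_penalty_loss(good, Y_query, T)
loss_bad = threshold_penalty_loss(1.0 - good, Y_query, T)
```

The softplus penalty is smooth in the predicted probabilities, which is what would let an automatic differentiation framework backpropagate through it. NumPy is used here only to keep the sketch self-contained.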