Causal Covariate Shift Correction using Fisher information penalty
This work addresses unreliable model selection and assessment in distributed training settings, but it is incremental, as it builds on existing covariate shift correction methods.
The paper tackles the bias that evolving feature densities across training batches introduce into cross-validation. It proposes Causal Covariate Shift Correction (C³), which uses Fisher Information to penalize the loss in subsequent batches, yielding accuracy improvements of up to 20.3% in batchwise benchmarks.
Evolving feature densities across batches of training data bias cross-validation, making model selection and assessment unreliable (\cite{sugiyama2012machine}). This work takes a distributed density estimation angle on the training setting where data are temporally distributed. \textit{Causal Covariate Shift Correction ($C^{3}$)} accumulates knowledge about the data density of a training batch using Fisher Information and uses it to penalize the loss in all subsequent batches. The penalty improves accuracy by $12.9\%$ over the full-dataset baseline, by up to $20.3\%$ in batchwise benchmarks, and by at least $5.9\%$ in foldwise benchmarks.
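The abstract does not give the exact form of the penalty, so the following is only a minimal sketch of how a Fisher-information penalty carried across batches could look. It assumes a logistic-regression model, a diagonal empirical Fisher estimate, and a quadratic penalty in the style of elastic weight consolidation; none of these choices is confirmed by the source. The function names (`diagonal_fisher`, `penalized_loss`), the `anchors` list, and the weight `lam` are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(theta, x):
    """Probability of the positive class under logistic regression."""
    return sigmoid(sum(t * xi for t, xi in zip(theta, x)))


def diagonal_fisher(theta, batch):
    """Diagonal empirical Fisher: mean squared per-example gradient
    of the log-likelihood, evaluated at theta (assumed estimator)."""
    fisher = [0.0] * len(theta)
    for x, y in batch:
        err = y - predict(theta, x)  # gradient of log p(y|x) w.r.t. the logit
        for i, xi in enumerate(x):
            fisher[i] += (err * xi) ** 2
    return [f / len(batch) for f in fisher]


def penalized_loss(theta, batch, anchors, lam):
    """Mean negative log-likelihood on the current batch plus a
    Fisher-weighted quadratic penalty for every earlier batch.

    anchors: list of (fisher, theta_old) pairs, one per earlier batch.
    """
    nll = 0.0
    for x, y in batch:
        p = min(max(predict(theta, x), 1e-12), 1.0 - 1e-12)
        nll -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    nll /= len(batch)

    penalty = 0.0
    for fisher, theta_old in anchors:
        penalty += sum(
            f * (t - t0) ** 2 for f, t, t0 in zip(fisher, theta, theta_old)
        )
    return nll + 0.5 * lam * penalty
```

In this sketch, after finishing batch $t$ one would append `(diagonal_fisher(theta, batch_t), list(theta))` to `anchors`, so that every later batch is trained against the accumulated penalties. Parameters that the earlier batches were sensitive to (large Fisher entries) are held closer to their previous values, while the rest stay free to adapt to the shifted density.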