Benedetta Lavinia Mussati

h-index: 28
1 paper

1 Paper

LG · Aug 5, 2025
Prediction-Oriented Subsampling from Data Streams

Benedetta Lavinia Mussati, Freddie Bickford Smith, Tom Rainforth et al. · Oxford

Data is often generated in streams, with new observations arriving over time. A key challenge for learning models from data streams is capturing relevant information while keeping computational costs manageable. We explore intelligent data subsampling for offline learning, and argue for an information-theoretic method centred on reducing uncertainty in downstream predictions of interest. Empirically, we demonstrate that this prediction-oriented approach performs better than a previously proposed information-theoretic technique on two widely studied problems. At the same time, we highlight that reliably achieving strong performance in practice requires careful model design.