LG · AI · CV · CY
May 15, 2024

Aggregate Representation Measure for Predictive Model Reusability

Vishwesh Sangarya, Richard Bradford, Jung-Eun Kim

arXiv:2405.09600v1
6.42 citations
h-index: 16

Originality: Incremental advance

AI Analysis

This work addresses the computational and environmental cost that practitioners face when reusing models as the data distribution shifts. The contribution is incremental, since it builds on existing representation-based methods.

The paper addresses the problem of estimating the retraining cost of machine learning models under distribution shift. It proposes the Aggregated Representation Measure (ARM), which quantifies changes in a model's representations to predict the resources needed for retraining (epochs, energy, and carbon emissions), enabling more cost-effective and sustainable model reuse.

In this paper, we propose a predictive quantifier to estimate the retraining cost of a trained model under distribution shift. The proposed Aggregated Representation Measure (ARM) quantifies the change in the model's representation from the old data distribution to the new one. Before the model is actually retrained, it provides a single concise index of the resources (epochs, energy, and carbon emissions) required for the retraining. This enables reuse of a model at a much lower cost than training a new model from scratch. The experimental results indicate that ARM reasonably predicts retraining costs for varying noise intensities and enables comparisons among multiple model architectures to determine the most cost-effective and sustainable option.
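The abstract does not give ARM's formula, so the following is only a minimal sketch of the general idea of condensing a representation change into a single index. It assumes that layer activations for the same inputs are available under both the old and the shifted distribution, and it uses a mean relative L2 distance averaged over layers. The function names `layer_shift` and `aggregate_shift`, the distance choice, and the toy data are all hypothetical and are not the authors' definition of ARM.

```python
import math


def l2_norm(vector):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(x * x for x in vector))


def layer_shift(old_reps, new_reps):
    """Mean relative L2 change between paired representations of one layer.

    old_reps[i] and new_reps[i] are assumed to be the activations of the
    same input before and after the distribution shift (an assumption of
    this sketch, not something stated in the abstract).
    """
    if len(old_reps) != len(new_reps):
        raise ValueError("old and new representations must be paired")
    total = 0.0
    for old, new in zip(old_reps, new_reps):
        diff = [a - b for a, b in zip(old, new)]
        # Small constant avoids division by zero for all-zero activations.
        total += l2_norm(diff) / (l2_norm(old) + 1e-12)
    return total / len(old_reps)


def aggregate_shift(old_layers, new_layers):
    """Average the per-layer shifts into one scalar index (hypothetical)."""
    shifts = [layer_shift(o, n) for o, n in zip(old_layers, new_layers)]
    return sum(shifts) / len(shifts)


# Toy data: one layer, two inputs, two features each.
clean = [[[1.0, 0.0], [0.0, 2.0]]]
mild_noise = [[[1.0, 0.1], [0.0, 2.2]]]
severe_noise = [[[0.0, 1.0], [2.0, 0.0]]]

mild_score = aggregate_shift(clean, mild_noise)
severe_score = aggregate_shift(clean, severe_noise)
```

In this toy setup the stronger perturbation yields the larger score. Whether such a score tracks retraining epochs, energy, or emissions is the empirical question the paper studies; the sketch makes no claim about that relationship.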
