LG NE OC | Oct 15, 2024

Evolutionary Retrofitting

Mathurin Videau, Mariia Zameshina, Alessandro Leite, Laurent Najman, Marc Schoenauer, Olivier Teytaud

arXiv:2410.11330v24.63 citations | h-index: 5 | ACM Trans Evol Learn Optim

Originality: Incremental advance

AI Analysis

This method lets practitioners in fields such as computer vision and NLP incorporate non-differentiable or human feedback into model refinement. The advance is incremental, since it builds on existing evolutionary optimization techniques.

The paper refines fully trained machine learning models by using evolutionary optimization to tune parameters against non-differentiable error signals, such as user feedback or task-specific metrics. It demonstrates efficiency across applications such as speech re-synthesis and image generation, requiring only a few dozen to a few hundred feedback points.

AfterLearnER (After Learning Evolutionary Retrofitting) applies evolutionary optimization to refine fully trained machine learning models. It optimizes a set of carefully chosen parameters or hyperparameters of the model with respect to an actual, exact, and hence possibly non-differentiable error signal, with the optimization performed on a subset of the standard validation set. The efficiency of AfterLearnER is demonstrated by tackling non-differentiable signals such as threshold-based criteria in depth sensing, the word error rate in speech re-synthesis, the number of kills per life in Doom, computational accuracy or BLEU in code translation, image quality in 3D generative adversarial networks (GANs), and user feedback in image generation via Latent Diffusion Models (LDM). This retrofitting can be done after training, or dynamically at inference time by taking user feedback into account. The advantages of AfterLearnER are its versatility, the possibility of using non-differentiable feedback, including human evaluations (i.e., no gradient is needed), limited overfitting, which is supported by a theoretical study, and its anytime behavior. Last but not least, AfterLearnER requires only a small amount of feedback, i.e., a few dozen to a few hundred scalars, compared to the tens of thousands needed in most related published works.
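The general idea can be illustrated with a minimal sketch. This is not the authors' implementation, and the abstract does not say which evolutionary algorithm is used. The sketch swaps in a plain (1+1) evolution strategy with a one-fifth-style step-size rule. The names `frozen_model`, `threshold_error`, `one_plus_one_es`, and `validation_subset`, the toy data, and the two retrofitted parameters (an output scale and shift) are all hypothetical and chosen only for illustration.

```python
import random


def frozen_model(x):
    """Hypothetical stand-in for a fully trained model; its weights are never updated."""
    return 0.8 * x + 0.5


def threshold_error(params, data, tol=0.25):
    """Fraction of samples whose absolute error exceeds tol.

    The criterion is piecewise constant in params, so it has no useful gradient.
    """
    scale, shift = params
    misses = sum(1 for x, y in data if abs(scale * frozen_model(x) + shift - y) > tol)
    return misses / len(data)


def one_plus_one_es(loss, x0, budget=200, sigma=0.5, seed=0):
    """(1+1) evolution strategy; budget counts the total number of loss evaluations."""
    rng = random.Random(seed)
    best, best_loss = list(x0), loss(x0)
    history = [best_loss]  # best loss seen after each evaluation (anytime behavior)
    for _ in range(budget - 1):
        cand = [v + sigma * rng.gauss(0.0, 1.0) for v in best]
        cand_loss = loss(cand)
        improved = cand_loss < best_loss
        if cand_loss <= best_loss:  # accept ties so the search can cross plateaus
            best, best_loss = cand, cand_loss
        # One-fifth-style rule: widen the step on improvement, shrink it otherwise.
        sigma *= 1.5 if improved else 1.5 ** -0.25
        history.append(best_loss)
    return best, best_loss, history


# Toy "validation subset" (assumed data): targets follow y = 2x + 1 plus small noise.
data_rng = random.Random(1)
validation_subset = [
    (i / 49, 2.0 * (i / 49) + 1.0 + data_rng.uniform(-0.1, 0.1)) for i in range(50)
]

# Retrofit only two post-hoc parameters, using 200 scalar feedback values in total.
best, best_loss, history = one_plus_one_es(
    lambda p: threshold_error(p, validation_subset), x0=[1.0, 0.0], budget=200
)
print("initial error:", history[0], "retrofitted error:", best_loss)
```

The loop only ever sees one scalar per candidate, so the same structure would apply if `threshold_error` were replaced by a human rating or another black-box metric. The budget of 200 evaluations mirrors the "few dozen to a few hundred" feedback values mentioned above.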

