One-vs-All Models for Asynchronous Training: An Empirical Analysis
This work addresses scalable model training for industrial classification systems that receive frequent updates; it is incremental, however, in that it builds on existing OVA methods.
The paper tackles the problem of asynchronous training in One-vs-All classification systems by empirically analyzing how independent updates affect accuracy, finding that a proposed metric strongly correlates with model performance in Natural Language Understanding and Spoken Language Understanding tasks.
Any given classification problem can be modeled using a multi-class or a One-vs-All (OVA) architecture. An OVA system consists of as many OVA models as there are classes, providing the advantage of asynchrony: each OVA model can be re-trained independently of the other models. This is particularly advantageous in settings where scalable model training is a consideration (for instance, in an industrial environment where multiple, frequent updates need to be made to the classification system). In this paper, we conduct an empirical analysis of independent updates to OVA models and their impact on the accuracy of the overall OVA system. Given that asynchronous updates lead to differences in the training datasets of the OVA models, we first define a metric to quantify these differences. Thereafter, using Natural Language Understanding as the task of interest, we estimate the impact of three factors on OVA system accuracy: (i) the number of classes, (ii) the number of data points, and (iii) divergences in training datasets across OVA models. Finally, we observe the accuracy impact of increased asynchrony in a Spoken Language Understanding system. We analyze the results and establish that the proposed metric correlates strongly with model performance in both experimental settings.
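To make the notion of asynchrony concrete, the following is a minimal sketch of an OVA system in which a single per-class model is re-trained on a newer dataset while the others are left untouched. It is an illustration under stated assumptions, not the setup used in the paper: the names `BinaryModel` and `OVASystem`, the logistic-regression learner, the toy one-hot features, and the class labels are all hypothetical, and the paper's dataset-difference metric is not reproduced here.

```python
import math
from typing import Dict, List, Sequence, Tuple

# Hypothetical toy representation: (feature vector, class label).
Example = Tuple[Sequence[float], str]


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


class BinaryModel:
    """One OVA model: logistic regression for 'target class vs. rest'."""

    def __init__(self, n_features: int) -> None:
        self.w: List[float] = [0.0] * n_features
        self.b: float = 0.0

    def score(self, x: Sequence[float]) -> float:
        return sigmoid(sum(wi * xi for wi, xi in zip(self.w, x)) + self.b)

    def fit(self, data: Sequence[Example], target: str,
            epochs: int = 200, lr: float = 0.5) -> None:
        # Re-train from scratch with plain per-example gradient descent.
        self.w = [0.0] * len(self.w)
        self.b = 0.0
        for _ in range(epochs):
            for x, label in data:
                y = 1.0 if label == target else 0.0
                err = self.score(x) - y
                self.w = [wi - lr * err * xi for wi, xi in zip(self.w, x)]
                self.b -= lr * err


class OVASystem:
    """As many binary models as classes; each can be re-trained on its own."""

    def __init__(self, classes: Sequence[str], n_features: int) -> None:
        self.models: Dict[str, BinaryModel] = {
            c: BinaryModel(n_features) for c in classes
        }

    def retrain(self, cls: str, data: Sequence[Example]) -> None:
        # Only the model for `cls` is updated; all others keep their weights.
        self.models[cls].fit(data, cls)

    def predict(self, x: Sequence[float]) -> str:
        # The class whose OVA model gives the highest score wins.
        return max(self.models, key=lambda c: self.models[c].score(x))


classes = ["weather", "music", "timer"]
data_v1: List[Example] = [
    ((1.0, 0.0, 0.0), "weather"),
    ((0.0, 1.0, 0.0), "music"),
    ((0.0, 0.0, 1.0), "timer"),
]

system = OVASystem(classes, n_features=3)
for c in classes:
    system.retrain(c, data_v1)

# Asynchronous update: only the "music" model sees the newer dataset,
# so the training datasets of the OVA models now differ.
data_v2 = data_v1 + [((0.0, 0.9, 0.1), "music")]
weather_before = list(system.models["weather"].w)
system.retrain("music", data_v2)
weather_after = list(system.models["weather"].w)
```

After the update, the `"music"` model has been trained on `data_v2` while the `"weather"` and `"timer"` models still reflect `data_v1`. This mismatch between per-model training sets is the kind of divergence whose effect on overall accuracy the paper studies.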