Wasserstein Barycenter Model Ensembling
This work aims to improve model ensembling for machine learning researchers and practitioners by leveraging semantic information; the contribution is incremental, as it builds on existing optimal transport methods.
The paper tackles model ensembling in multiclass and multilabel learning by using Wasserstein barycenters to incorporate semantic side information such as word embeddings, balancing confidence and semantics when finding a consensus between models. The results show that it is a viable alternative to basic arithmetic or geometric mean ensembling in applications such as attribute-based classification, multilabel learning, and image caption generation.
In this paper, we propose to perform model ensembling in a multiclass or multilabel learning setting using Wasserstein (W.) barycenters. Optimal transport metrics, such as the Wasserstein distance, allow incorporating semantic side information such as word embeddings. Using W. barycenters to find the consensus between models allows us to balance confidence and semantics when determining where the models agree. We show applications of Wasserstein ensembling in attribute-based classification, multilabel learning, and image caption generation. These results show that W. ensembling is a viable alternative to basic arithmetic or geometric mean ensembling.
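To make the comparison concrete, the following is a minimal sketch of the three ensembling rules applied to two models' class probabilities. The abstract does not specify the barycenter algorithm, so this sketch assumes an entropic-regularized Wasserstein barycenter computed with iterative Bregman projections, a standard technique that may differ from the paper's own. The labels, the 2-D "embeddings", the function names, and the parameters `reg` and `n_iter` are all hypothetical and chosen only for illustration.

```python
import numpy as np

def arithmetic_mean(preds, weights):
    # Weighted average of the models' probability vectors.
    return weights @ preds

def geometric_mean(preds, weights, eps=1e-12):
    # Weighted geometric mean, renormalized to sum to one.
    g = np.exp(weights @ np.log(preds + eps))
    return g / g.sum()

def wasserstein_barycenter(preds, weights, cost, reg=0.1, n_iter=200):
    # Assumption: entropic regularization with iterative Bregman
    # projections; not necessarily the algorithm used in the paper.
    K = np.exp(-cost / reg)            # Gibbs kernel built from the label cost
    v = np.ones_like(preds)            # one scaling vector per model
    for _ in range(n_iter):
        u = preds / (K @ v.T).T        # match each model's prediction
        Ktu = (K.T @ u.T).T            # marginals on the barycenter side
        b = np.exp(weights @ np.log(Ktu))  # weighted geometric mean of marginals
        v = b / Ktu                    # match the shared barycenter
    return b / b.sum()

# Hypothetical labels and toy 2-D embeddings: "cat" and "kitten" are close,
# "car" is far from both.
labels = ["cat", "kitten", "car"]
emb = np.array([[1.0, 0.1], [1.0, -0.1], [-1.0, 0.0]])

# Squared Euclidean distances between label embeddings, scaled to [0, 1].
cost = ((emb[:, None, :] - emb[None, :, :]) ** 2).sum(-1)
cost = cost / cost.max()

# Two models that disagree between the semantically close labels.
preds = np.array([[0.7, 0.1, 0.2],
                  [0.1, 0.7, 0.2]])
weights = np.array([0.5, 0.5])

arith = arithmetic_mean(preds, weights)
geo = geometric_mean(preds, weights)
bary = wasserstein_barycenter(preds, weights, cost)

for name, p in [("arithmetic", arith), ("geometric", geo), ("wasserstein", bary)]:
    print(name, dict(zip(labels, np.round(p, 3))))
```

Unlike the two means, the barycenter depends on the cost matrix, so changing the embeddings changes the consensus even when the models' predictions are held fixed. The regularization strength `reg` controls how much mass is spread across nearby labels; smaller values give sharper results but can cause numerical underflow in the kernel.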