Revisiting Global Pooling through the Lens of Optimal Transport
This work addresses the lack of theoretical grounding for global pooling in machine learning, offering a unified, learnable framework that can improve model performance across several domains.
The authors address the largely empirical design of global pooling by developing a framework based on optimal transport that unifies existing methods, and they propose UOT-Pooling layers that can improve performance in multi-instance learning, graph classification, and image classification.
Global pooling is one of the most important operations in many machine learning models and tasks, yet its implementation is often empirical in practice. In this study, we develop a novel and principled global pooling framework through the lens of optimal transport. We demonstrate that most existing global pooling methods are equivalent to solving specializations of an unbalanced optimal transport (UOT) problem. By making the parameters of the UOT problem learnable, we unify various global pooling methods in the same framework and accordingly propose a generalized global pooling layer for neural networks, called UOT-Pooling (UOTP). Besides implementing the UOTP layer with the classic Sinkhorn-scaling algorithm, we design a new model architecture based on the Bregman ADMM algorithm, which has better numerical stability and can reproduce existing pooling layers more effectively. We test our UOTP layers in several application scenarios, including multi-instance learning, graph classification, and image classification. Our UOTP layers can either imitate conventional global pooling layers or learn new pooling mechanisms that lead to better performance.
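To make the idea of pooling as a transport problem concrete, the following is a minimal sketch under stated assumptions, not the UOTP layer itself. It assumes a feature matrix X of shape (D, N) with D feature dimensions and N instances, and a pooled output computed as a row-normalized, plan-weighted sum of X. It assumes a simplified entropy-regularized problem: with only the row marginal constrained, the optimal plan has a closed form (a row-wise softmax of X / eps), which approaches mean pooling for large eps and max pooling for small eps. A plain Sinkhorn-scaling loop with uniform marginals is included for the balanced case. All function names and the parameter eps are hypothetical and chosen for illustration.

```python
import numpy as np


def row_constrained_plan(X, eps):
    """Closed-form plan for: max_P <X, P> + eps * H(P), each row of P summing to 1/D.

    Assumption: only the row marginal is constrained, so each row is a softmax.
    """
    D, _ = X.shape
    Z = (X - X.max(axis=1, keepdims=True)) / eps  # shift rows for numerical stability
    W = np.exp(Z)
    W /= W.sum(axis=1, keepdims=True)  # row-wise softmax of X / eps
    return W / D  # rows sum to 1/D, total mass is 1


def sinkhorn_plan(X, eps, n_iters=200):
    """Sinkhorn scaling with uniform row and column marginals (balanced case)."""
    D, N = X.shape
    p = np.full(D, 1.0 / D)  # target row marginal
    q = np.full(N, 1.0 / N)  # target column marginal
    K = np.exp((X - X.max()) / eps)  # Gibbs kernel for the cost -X
    u = np.ones(D)
    v = np.ones(N)
    for _ in range(n_iters):
        u = p / (K @ v)  # rescale rows to match p
        v = q / (K.T @ u)  # rescale columns to match q
    return u[:, None] * K * v[None, :]


def pool_with_plan(X, P):
    """Pool each feature dimension as a plan-weighted average over instances."""
    W = P / P.sum(axis=1, keepdims=True)  # normalize each row to sum to 1
    return (X * W).sum(axis=1)


X = np.array([[1.0, 2.0, 3.0], [0.0, -1.0, 5.0]])
near_mean = pool_with_plan(X, row_constrained_plan(X, eps=1e6))
near_max = pool_with_plan(X, row_constrained_plan(X, eps=1e-3))
balanced = pool_with_plan(X, sinkhorn_plan(X, eps=1e6))
```

In this sketch, a single scalar eps moves the pooled output between the row-wise mean and the row-wise maximum of X, which illustrates why treating such parameters as learnable can recover familiar pooling operators within one family. The small-eps regime of the plain Sinkhorn loop would need a log-domain implementation to avoid overflow and underflow, so it is only exercised here with a large eps.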