Tighter Risk Bounds for Mixtures of Experts
This work addresses the need for tighter generalization guarantees for mixtures of experts, using local differential privacy on the gating mechanism as the tool. The contribution appears incremental, however, as it builds on existing bounds with a modified gating mechanism.
The paper derives theoretical risk bounds for mixtures of experts by imposing local differential privacy on the gating mechanism. Under reasonable conditions, the resulting bounds are significantly tighter than existing ones, and the experiments indicate improved generalization ability.
In this work, we provide upper bounds on the risk of mixtures of experts by imposing local differential privacy (LDP) on their gating mechanism. These theoretical guarantees are tailored to mixtures of experts that use the one-out-of-$n$ gating mechanism, as opposed to the conventional $n$-out-of-$n$ mechanism. The bounds depend only logarithmically on the number of experts and capture the dependence on the gating mechanism through the LDP parameter, which makes them significantly tighter than existing bounds under reasonable conditions. Experimental results support our theory, demonstrating that our approach enhances the generalization ability of mixtures of experts and validating the feasibility of imposing LDP on the gating mechanism.
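To make the idea of a one-out-of-$n$ gate under LDP concrete, here is a minimal sketch of one way such a gate could be built. It is an illustration under stated assumptions, not necessarily the construction used in the paper. It assumes the usual $\epsilon$-LDP condition on the gate, namely that for any two inputs $x$, $x'$ and any expert $i$, $P(i \mid x) \le e^{\epsilon} P(i \mid x')$. Clipping the gating scores to an interval of width $\epsilon/2$ before a softmax is enough for this: the numerator changes by a factor of at most $e^{\epsilon/2}$ and so does the normalizer. The names `ldp_gate_probs`, `route_one_of_n`, `predict`, `experts`, and `gate_scores` are hypothetical.

```python
import math
import random


def ldp_gate_probs(scores, epsilon):
    """Routing probabilities from a softmax over clipped scores.

    Assumption of this sketch: scores are clipped to
    [-epsilon/4, epsilon/4], an interval of width epsilon/2, so the
    probability of any expert changes by a factor of at most
    exp(epsilon) between any two inputs.
    """
    bound = epsilon / 4.0
    clipped = [max(-bound, min(bound, s)) for s in scores]
    weights = [math.exp(c) for c in clipped]
    total = sum(weights)
    return [w / total for w in weights]


def route_one_of_n(scores, epsilon, rng=random):
    """One-out-of-n gating: sample a single expert index."""
    probs = ldp_gate_probs(scores, epsilon)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def predict(x, experts, gate_scores, epsilon, rng=random):
    """Evaluate only the sampled expert on x.

    `experts` is a list of callables and `gate_scores` maps x to one
    raw score per expert; both are hypothetical placeholders.
    """
    idx = route_one_of_n(gate_scores(x), epsilon, rng)
    return experts[idx](x)
```

In this sketch a small $\epsilon$ forces the routing distribution toward uniform regardless of the input, while a large $\epsilon$ lets the gate depend strongly on $x$. This is the trade-off that the LDP parameter summarizes in the bounds. By contrast, an $n$-out-of-$n$ mechanism would evaluate every expert and combine their outputs with the gate weights.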