MMbeddings: Parameter-Efficient, Low-Overfitting Probabilistic Embeddings Inspired by Nonlinear Mixed Models
This work addresses parameter efficiency for machine learning practitioners working with high-cardinality categorical data, though the contribution appears incremental because it builds on existing variational autoencoder frameworks.
The paper tackles overfitting and computational burden in high-cardinality categorical embeddings by proposing MMbeddings, a probabilistic approach that reduces the parameter count from cardinality × embedding dimension to a cardinality-independent number. The authors report consistent performance improvements across collaborative filtering and tabular regression tasks.
We present MMbeddings, a probabilistic embedding approach that reinterprets categorical embeddings through the lens of nonlinear mixed models, effectively bridging classical statistical theory with modern deep learning. By treating embeddings as latent random effects within a variational autoencoder framework, our method substantially decreases the number of parameters -- from the conventional embedding approach of cardinality $\times$ embedding dimension, which quickly becomes infeasible with large cardinalities, to a significantly smaller, cardinality-independent number determined primarily by the encoder architecture. This reduction dramatically mitigates overfitting and computational burden in high-cardinality settings. Extensive experiments on simulated and real datasets, encompassing collaborative filtering and tabular regression tasks using varied architectures, demonstrate that MMbeddings consistently outperforms traditional embeddings, underscoring its potential across diverse machine learning applications.
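To make the parameter-count claim concrete, the following minimal sketch compares the size of a conventional embedding table with that of a small dense encoder. The encoder shape (input width 64, hidden layers of 128 and 64 units, separate mean and log-variance heads) and the function names are illustrative assumptions, not the architecture used in the paper. The only point is that the table grows linearly with cardinality, while the encoder's size does not depend on it.

```python
def embedding_table_params(cardinality: int, dim: int) -> int:
    """Parameters in a conventional lookup table: one dim-vector per category."""
    return cardinality * dim


def mlp_encoder_params(input_dim: int, hidden_dims: list[int], latent_dim: int) -> int:
    """Weights and biases of a dense encoder with mean and log-variance heads.

    Hypothetical architecture for illustration only; note that the
    cardinality of the categorical feature never enters this count.
    """
    sizes = [input_dim, *hidden_dims]
    # Hidden layers: weight matrix plus bias vector for each consecutive pair.
    total = sum(a * b + b for a, b in zip(sizes[:-1], sizes[1:]))
    # Two output heads (mean and log-variance), each mapping to latent_dim.
    total += 2 * (sizes[-1] * latent_dim + latent_dim)
    return total


if __name__ == "__main__":
    dim = 32
    encoder = mlp_encoder_params(input_dim=64, hidden_dims=[128, 64], latent_dim=dim)
    for cardinality in (1_000, 100_000, 10_000_000):
        table = embedding_table_params(cardinality, dim)
        print(f"cardinality={cardinality:>10,}  table={table:>12,}  encoder={encoder:,}")
```

Under these assumed sizes the encoder has 20,736 parameters at every cardinality, whereas the table ranges from 32,000 parameters at 1,000 categories to 320,000,000 at 10,000,000 categories.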