Learning Sample-Specific Models with Low-Rank Personalized Regression
This addresses the need for individualized policies and treatment plans in applications such as precision medicine and advertising, though it is incremental as it builds on existing personalized modeling concepts.
The paper tackles the problem of heterogeneous datasets with overlapping latent subpopulations by proposing sample-specific models for individualized inference and prediction, showing that these models provide fine-grained interpretations without sacrificing predictive accuracy compared to state-of-the-art models like deep neural networks.
Modern applications of machine learning (ML) deal with increasingly heterogeneous datasets comprised of data collected from overlapping latent subpopulations. As a result, traditional models trained over large datasets may fail to recognize highly predictive localized effects in favour of weakly predictive global patterns. This is a problem because localized effects are critical to developing individualized policies and treatment plans in applications ranging from precision medicine to advertising. To address this challenge, we propose to estimate sample-specific models that tailor inference and prediction at the individual level. In contrast to classical ML models that estimate a single, complex model (or only a few complex models), our approach produces a model personalized to each sample. These sample-specific models can be studied to understand subgroup dynamics that go beyond coarse-grained class labels. Crucially, our approach does not assume that relationships between samples (e.g. a similarity network) are known a priori. Instead, we use unmodeled covariates to learn a latent distance metric over the samples. We apply this approach to financial, biomedical, and electoral data as well as simulated data and show that sample-specific models provide fine-grained interpretations of complicated phenomena without sacrificing predictive accuracy compared to state-of-the-art models such as deep neural networks.