Let the Experts Speak: Improving Survival Prediction & Calibration via Mixture-of-Experts Heads
This work addresses survival prediction in healthcare by enhancing model performance without sacrificing clustering, though it appears incremental as it builds on existing mixture-of-experts methods.
The paper tackles the trade-off between patient clustering and predictive performance in survival analysis. It introduces a discrete-time deep mixture-of-experts architecture that improves calibration and accuracy while maintaining clustering, and its results show that more expressive experts outperform fixed prototypes.
Deep mixture-of-experts models have attracted a lot of attention for survival analysis problems, particularly for their ability to cluster similar patients together. In practice, grouping often comes at the expense of key metrics such as calibration error and predictive accuracy. This is due to the restrictive inductive bias that mixture-of-experts imposes: predictions for individual patients must look like predictions for the group they're assigned to. Might we be able to discover patient group structure, where it exists, while improving calibration and predictive accuracy? In this work, we introduce several discrete-time deep mixture-of-experts (MoE) based architectures for survival analysis problems, one of which achieves all desiderata: clustering, calibration, and predictive accuracy. We show that a key differentiator among these MoEs is how expressive their experts are. We find that more expressive experts that tailor predictions per patient outperform experts that rely on fixed group prototypes.
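
To make the contrast between fixed group prototypes and per-patient experts concrete, the following is a minimal NumPy sketch, not the architecture from the paper. It assumes a softmax gate, linear experts, random untrained weights, and a probability mass function over a small number of discrete time bins; all names (`W_gate`, `proto_logits`, `W_exp`, `survival_curve`) are hypothetical and chosen only for illustration.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax along the given axis.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
n_patients, n_features, n_experts, n_bins = 5, 4, 3, 6

# Toy patient features (assumption: random data, no training involved).
x = rng.normal(size=(n_patients, n_features))

# Gate: soft assignment of each patient to each expert (the "clustering").
W_gate = rng.normal(size=(n_features, n_experts))
gate = softmax(x @ W_gate)                      # (n_patients, n_experts)

# Variant A: fixed prototypes. Each expert holds one event-time
# distribution that is shared by every patient routed to it.
proto_logits = rng.normal(size=(n_experts, n_bins))
proto_pmf = softmax(proto_logits)               # (n_experts, n_bins)
pmf_fixed = gate @ proto_pmf                    # (n_patients, n_bins)

# Variant B: per-patient experts. Each expert maps the patient's
# features to its own event-time distribution for that patient.
W_exp = rng.normal(size=(n_experts, n_features, n_bins))
expert_logits = np.einsum("nf,kfb->nkb", x, W_exp)
expert_pmf = softmax(expert_logits, axis=-1)    # (n_patients, n_experts, n_bins)
pmf_personal = np.einsum("nk,nkb->nb", gate, expert_pmf)

def survival_curve(pmf):
    # Discrete-time survival: S(t) = 1 - P(event at or before bin t).
    return 1.0 - np.cumsum(pmf, axis=-1)

surv_fixed = survival_curve(pmf_fixed)
surv_personal = survival_curve(pmf_personal)
```

In variant A every prediction is a convex combination of the same `n_experts` prototype distributions, which is one way to read the restrictive inductive bias described above. In variant B the gate still yields a soft grouping, but each expert's output depends on the individual patient's features.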