SuperCone: Unified User Segmentation over Heterogeneous Experts via Concept Meta-learning
This work addresses data scarcity and heterogeneity in user segmentation for recommendation systems, though it appears incremental because it builds on existing meta-learning and ensemble methods.
The paper tackles the problem of user segmentation across heterogeneous prediction tasks with limited data by introducing SuperCone, a system that uses concept meta-learning to combine diverse expert models adaptively. Experiments show it significantly outperforms state-of-the-art recommendation and ranking algorithms on various predictive segment tasks and benchmarks.
We study the problem of user segmentation: given a set of users and one or more predefined groups or segments, assign users to their corresponding segments. As an example, for a segment indicating particular interest in a certain area of sports or entertainment, the task is to predict whether each user belongs to the segment. However, there may exist numerous long-tail prediction tasks that suffer from limited data availability and may be heterogeneous in nature, which makes them hard to capture with a single off-the-shelf model architecture. In this work, we present SuperCone, our unified predictive segments system that addresses the above challenges. It builds on top of a flat concept representation that summarizes each user's heterogeneous digital footprints, and uniformly models each prediction task using an approach called "super learning", that is, combining prediction models with diverse architectures or learning methods that are not compatible with each other. Following this, we provide an end-to-end approach that learns to attend adaptively to the best-suited heterogeneous experts, while at the same time incorporating deep representations of the input concepts that augment those experts. Experiments show that SuperCone significantly outperforms state-of-the-art recommendation and ranking algorithms on a wide range of predictive segment tasks and public structured data learning benchmarks.
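To illustrate the idea of attending over heterogeneous experts, the following is a minimal Python sketch, not the SuperCone implementation. It assumes each user is already summarized as a flat concept vector, treats each expert as an opaque function returning a segment-membership probability, and combines the experts with a softmax gate computed from the same concept vector. All names here (`rule_expert`, `logistic_expert`, `prior_expert`, `combine`, `gate_weights`) are hypothetical, and the gate weights are fixed by hand, whereas the abstract describes learning them end to end.

```python
import math
from typing import Callable, List, Sequence, Tuple

# An "expert" is any model that maps a flat concept vector to a probability.
# The three experts below are illustrative stand-ins with different forms.
Expert = Callable[[Sequence[float]], float]


def rule_expert(x: Sequence[float]) -> float:
    """Hand-written rule keyed on the first concept."""
    return 0.9 if x[0] > 0.5 else 0.1


def logistic_expert(x: Sequence[float]) -> float:
    """Tiny logistic model over the second concept (fixed coefficients)."""
    z = 1.5 * x[1] - 0.5
    return 1.0 / (1.0 + math.exp(-z))


def prior_expert(x: Sequence[float]) -> float:
    """Constant base-rate predictor that ignores the input."""
    return 0.3


EXPERTS: List[Expert] = [rule_expert, logistic_expert, prior_expert]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def combine(
    x: Sequence[float],
    experts: Sequence[Expert],
    gate_weights: Sequence[Sequence[float]],
) -> Tuple[float, List[float]]:
    """Attention-weighted combination of expert predictions.

    gate_weights holds one weight vector per expert; the dot product with
    the concept vector x gives that expert's attention score. In a trained
    system these weights would be learned; here they are assumed inputs.
    """
    scores = [sum(w * xi for w, xi in zip(ws, x)) for ws in gate_weights]
    attention = softmax(scores)
    predictions = [expert(x) for expert in experts]
    blended = sum(a * p for a, p in zip(attention, predictions))
    return blended, attention


if __name__ == "__main__":
    user_concepts = [1.0, 0.2]
    # Hypothetical gate: favor the rule expert when the first concept is active.
    gate = [[2.0, 0.0], [0.0, 2.0], [0.0, 0.0]]
    probability, attention = combine(user_concepts, EXPERTS, gate)
    print(f"segment probability: {probability:.3f}")
    print("attention over experts:", [round(a, 3) for a in attention])
```

Because the gate depends on the input, different users can lean on different experts. With all gate weights set to zero, the combination reduces to a plain average of the expert predictions.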