LGMay 20
PEARL: Unbiased Percentile Estimation via Contrastive Learning for Industrial-Scale Livestream RecommendationBlake Gella, Wei Wu, Yuhao Yin et al.
Recommender systems trained on user interaction data are susceptible to behavioral intensity imbalance--a systematic distortion arising from heterogeneous engagement patterns across users. This imbalance skews feedback signals such that observed interactions no longer faithfully reflect true preferences, causing models to disproportionately amplify signals from highly active users while underrepresenting others, which ultimately degrades recommendation quality and robustness at scale. To address this issue, we propose a nonparametric contrastive percentile approximation framework, PEARL, that models relative preference signals instead of absolute engagement magnitudes. Building upon relative advantage debiasing, PEARL leverages real contrastive interaction samples to approximate percentile relationships directly, without relying on auxiliary distribution estimation models. We provide theoretical justification demonstrating that such pairwise comparisons yield unbiased estimates of percentile-based preference signals. For broader applicability, we introduce a prediction-based bootstrapping mechanism for percentile smoothing to handle sparse and discrete feedback, alongside a generalized value-weighted formulation and a co-training strategy to enhance both modeling flexibility and representation learning. Extensive offline experiments demonstrate that PEARL effectively mitigates behavioral bias and consistently improves recommendation performance across multiple ranking targets. Deployed in a production livestream platform with a combined user base of billions, online A/B testing confirms substantial real-world gains: +2.10% Watch Duration, +0.80% Consumption Amount, +1.49% Interaction Rate, and -6.91% Report Rate.
LGDec 1, 2024
Leveraging Intermediate Neural Collapse with Simplex ETFs for Efficient Deep Neural NetworksEmily Liu
Neural collapse is a phenomenon observed during the terminal phase of neural network training, characterized by the convergence of network activations, class means, and linear classifier weights to a simplex equiangular tight frame (ETF), a configuration of vectors that maximizes mutual distance within a subspace. This phenomenon has been linked to improved interpretability, robustness, and generalization in neural networks. However, its potential to guide neural network training and regularization remains underexplored. Previous research has demonstrated that constraining the final layer of a neural network to a simplex ETF can reduce the number of trainable parameters without sacrificing model accuracy. Furthermore, deep fully connected networks exhibit neural collapse not only in the final layer but across all layers beyond a specific effective depth. Using these insights, we propose two novel training approaches: Adaptive-ETF, a generalized framework that enforces simplex ETF constraints on all layers beyond the effective depth, and ETF-Transformer, which applies simplex ETF constraints to the feedforward layers within transformer blocks. We show that these approaches achieve training and testing performance comparable to those of their baseline counterparts while significantly reducing the number of learnable parameters.
LGAug 14, 2025
Relative Advantage Debiasing for Watch-Time Prediction in Short-Video RecommendationEmily Liu, Kuan Han, Minfeng Zhan et al.
Watch time is widely used as a proxy for user satisfaction in video recommendation platforms. However, raw watch times are influenced by confounding factors such as video duration, popularity, and individual user behaviors, potentially distorting preference signals and resulting in biased recommendation models. We propose a novel relative advantage debiasing framework that corrects watch time by comparing it to empirically derived reference distributions conditioned on user and item groups. This approach yields a quantile-based preference signal and introduces a two-stage architecture that explicitly separates distribution estimation from preference learning. Additionally, we present distributional embeddings to efficiently parameterize watch-time quantiles without requiring online sampling or storage of historical data. Both offline and online experiments demonstrate significant improvements in recommendation accuracy and robustness compared to existing baseline methods.
RONov 28, 2024
Semi-Supervised Neural Processes for Articulated Object InteractionsEmily Liu, Michael Noseworthy, Nicholas Roy
The scarcity of labeled action data poses a considerable challenge for developing machine learning algorithms for robotic object manipulation. It is expensive and often infeasible for a robot to interact with many objects. Conversely, visual data of objects, without interaction, is abundantly available and can be leveraged for pretraining and feature extraction. However, current methods that rely on image data for pretraining do not easily adapt to task-specific predictions, since the learned features are not guaranteed to be relevant. This paper introduces the Semi-Supervised Neural Process (SSNP): an adaptive reward-prediction model designed for scenarios in which only a small subset of objects have labeled interaction data. In addition to predicting reward labels, the latent-space of the SSNP is jointly trained with an autoencoding objective using passive data from a much larger set of objects. Jointly training with both types of data allows the model to focus more effectively on generalizable features and minimizes the need for extensive retraining, thereby reducing computational demands. The efficacy of SSNP is demonstrated through a door-opening task, leading to better performance than other semi-supervised methods, and only using a fraction of the data compared to other adaptive models.
LGMay 26, 2023
Structured Latent Variable Models for Articulated Object InteractionEmily Liu, Michael Noseworthy, Nicholas Roy
In this paper, we investigate a scenario in which a robot learns a low-dimensional representation of a door given a video of the door opening or closing. This representation can be used to infer door-related parameters and predict the outcomes of interacting with the door. Current machine learning based approaches in the doors domain are based primarily on labelled datasets. However, the large quantity of available door data suggests the feasibility of a semisupervised approach based on pretraining. To exploit the hierarchical structure of the dataset where each door has multiple associated images, we pretrain with a structured latent variable model known as a neural statistician. The neural satsitician enforces separation between shared context-level variables (common across all images associated with the same door) and instance-level variables (unique to each individual image). We first demonstrate that the neural statistician is able to learn an embedding that enables reconstruction and sampling of realistic door images. Then, we evaluate the correspondence of the learned embeddings to human-interpretable parameters in a series of supervised inference tasks. It was found that a pretrained neural statistician encoder outperformed analogous context-free baselines when predicting door handedness, size, angle location, and configuration from door images. Finally, in a visual bandit door-opening task with a variety of door configuration, we found that neural statistician embeddings achieve lower regret than context-free baselines.