CV, AI | Jul 22, 2025

HIPPO-Video: Simulating Watch Histories with Large Language Models for Personalized Video Highlighting

arXiv:2507.16873v1
Originality: Incremental advance
AI Analysis

The paper addresses personalized video highlighting for users with diverse preferences. The advance is incremental: it builds on existing methods and contributes a new dataset.

The authors tackled the lack of personalization in video highlighting by creating HIPPO-Video, a dataset of 2,040 (watch history, saliency score) pairs generated with an LLM-based user simulator, and report that their method, HiPHer, outperforms existing generic and query-based approaches.

The exponential growth of video content has made personalized video highlighting an essential task, as user preferences are highly variable and complex. Existing video datasets, however, often lack personalization, relying on isolated videos or simple text queries that fail to capture the intricacies of user behavior. In this work, we introduce HIPPO-Video, a novel dataset for personalized video highlighting, created using an LLM-based user simulator to generate realistic watch histories reflecting diverse user preferences. The dataset includes 2,040 (watch history, saliency score) pairs, covering 20,400 videos across 170 semantic categories. To validate our dataset, we propose HiPHer, a method that leverages these personalized watch histories to predict preference-conditioned segment-wise saliency scores. Through extensive experiments, we demonstrate that our method outperforms existing generic and query-based approaches, showcasing its potential for highly user-centric video highlighting in real-world scenarios.
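To make the data layout concrete, here is a minimal sketch of what one (watch history, saliency score) pair could look like and how segment-wise saliency scores might be turned into a highlight. The class `HighlightSample`, its field names, the video IDs, and the helper `top_k_segments` are all hypothetical illustrations; they are not taken from the HIPPO-Video release or from HiPHer. The abstract's figures (20,400 videos over 2,040 pairs) work out to an average of ten videos per pair, but the abstract does not say how those videos are split between history and target.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HighlightSample:
    """One hypothetical (watch history, saliency score) pair."""
    watch_history: List[str]  # IDs of previously watched videos (assumed format)
    target_video: str         # video to be highlighted (assumed field)
    saliency: List[float]     # one score per segment of the target video


def top_k_segments(saliency: List[float], k: int) -> List[int]:
    """Return the indices of the k highest-scoring segments, in temporal order."""
    ranked = sorted(range(len(saliency)), key=lambda i: saliency[i], reverse=True)
    return sorted(ranked[:k])


# Toy values for illustration only; not real dataset entries.
sample = HighlightSample(
    watch_history=["v1", "v2", "v3"],
    target_video="v4",
    saliency=[0.1, 0.8, 0.3, 0.9, 0.2],
)
highlights = top_k_segments(sample.saliency, k=2)

# Average number of videos per pair implied by the abstract's totals.
avg_videos_per_pair = 20_400 / 2_040
```

Selecting the top-k segments is only one plausible way to consume saliency scores; a threshold or a duration budget would work with the same structure.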