LG · AI · Feb 2

VLM-Guided Experience Replay

arXiv:2602.01915v1 · h-index: 16
Originality: Incremental advance
AI Analysis

This work addresses sample efficiency and performance in reinforcement learning for domains such as gaming and robotics. It is an incremental advance: it applies existing VLMs to the replay buffer, a component that prior LLM- and VLM-based RL work had not explored.

The paper uses Vision-Language Models to prioritize experiences in the replay buffer, aiming to improve sample efficiency and performance in reinforcement learning. It reports 11-52% higher average success rates and 19-45% better sample efficiency than previous approaches, across game-playing and robotics tasks in both discrete and continuous domains.

Recent advances in Large Language Models (LLMs) and Vision-Language Models (VLMs) have enabled powerful semantic and multimodal reasoning capabilities, creating new opportunities to enhance sample efficiency, high-level planning, and interpretability in reinforcement learning (RL). While prior work has integrated LLMs and VLMs into various components of RL, the replay buffer, a core component for storing and reusing experiences, remains unexplored. We propose addressing this gap by leveraging VLMs to guide the prioritization of experiences in the replay buffer. Our key idea is to use a frozen, pre-trained VLM (requiring no fine-tuning) as an automated evaluator to identify and prioritize promising sub-trajectories from the agent's experiences. Across scenarios, including game-playing and robotics, spanning both discrete and continuous domains, agents trained with our proposed prioritization method achieve 11-52% higher average success rates and improve sample efficiency by 19-45% compared to previous approaches.

Project page: https://esharony.me/projects/vlm-rb/
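To make the idea of score-guided prioritization concrete, here is a minimal Python sketch of a replay buffer that samples stored sub-trajectories in proportion to an externally supplied score. It is not the paper's implementation: the class `ScoredReplayBuffer`, the `score_fn` hook, the `eps` floor, and the `toy_score` function are all hypothetical names introduced for illustration. In the paper's setting the score would come from a frozen VLM evaluating the sub-trajectory; here a sum of rewards stands in so the example runs without any model.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, List, Sequence, Tuple

# (obs, action, reward, next_obs, done); the layout is an assumption for this sketch.
Transition = Tuple[object, object, float, object, bool]


@dataclass
class ScoredReplayBuffer:
    """Stores sub-trajectories and samples them in proportion to a score."""

    # Hypothetical hook: stands in for a frozen VLM that rates a sub-trajectory.
    score_fn: Callable[[Sequence[Transition]], float]
    capacity: int = 1000
    eps: float = 1e-3  # small floor so zero-scored clips can still be drawn
    clips: List[List[Transition]] = field(default_factory=list)
    priorities: List[float] = field(default_factory=list)

    def add(self, clip: Sequence[Transition]) -> None:
        # Evict the oldest clip once the buffer is full (FIFO).
        if len(self.clips) >= self.capacity:
            self.clips.pop(0)
            self.priorities.pop(0)
        self.clips.append(list(clip))
        # Clamp negative scores to zero, then add the floor.
        self.priorities.append(max(float(self.score_fn(clip)), 0.0) + self.eps)

    def sample(self, batch_size: int, rng: random.Random) -> List[List[Transition]]:
        # Draw clips with replacement, weighted by their stored priority.
        return rng.choices(self.clips, weights=self.priorities, k=batch_size)


def toy_score(clip: Sequence[Transition]) -> float:
    """Placeholder scorer (sum of rewards); NOT a VLM."""
    return sum(t[2] for t in clip)


buffer = ScoredReplayBuffer(score_fn=toy_score)
buffer.add([("s0", "a0", 0.0, "s1", False)])  # unpromising clip
buffer.add([("s1", "a1", 1.0, "s2", True)])   # promising clip
batch = buffer.sample(4, random.Random(0))
```

Scoring each clip once at insertion keeps the number of evaluator calls equal to the number of stored clips, which matters if the evaluator is an expensive model. Whether the paper scores at insertion or at sampling time is not stated in the abstract.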

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
