CL · Jun 4, 2025

Act-as-Pet: Benchmarking the Abilities of Large Language Models as E-Pets in Social Network Services

arXiv:2506.03761v1 · 3 citations · h-index: 16 · CIKM
Originality: Incremental advance
AI Analysis

This paper addresses the need for systematic benchmarking in emotionally rich LLM applications such as virtual pets, though the advance is incremental because it builds on existing role-playing approaches.

The paper benchmarks Large Language Models (LLMs) for virtual pet companionship by introducing Pet-Bench, a dedicated benchmark with over 7,500 interaction instances, and finds that performance across 28 LLMs varies significantly with model size and inherent capabilities.

As interest in using Large Language Models (LLMs) for interactive and emotionally rich experiences grows, virtual pet companionship emerges as a novel yet underexplored application. Existing approaches focus on basic pet role-playing interactions without systematically benchmarking LLMs for comprehensive companionship. In this paper, we introduce Pet-Bench, a dedicated benchmark that evaluates LLMs across both self-interaction and human-interaction dimensions. Unlike prior work, Pet-Bench emphasizes self-evolution and developmental behaviors alongside interactive engagement, offering a more realistic reflection of pet companionship. It features diverse tasks such as intelligent scheduling, memory-based dialogues, and psychological conversations, with over 7,500 interaction instances designed to simulate complex pet behaviors. Evaluation of 28 LLMs reveals significant performance variations linked to model size and inherent capabilities, underscoring the need for specialized optimization in this domain. Pet-Bench serves as a foundational resource for benchmarking pet-related LLM abilities and advancing emotionally immersive human-pet interactions.

