CL · Jun 4, 2025

Act-as-Pet: Benchmarking the Abilities of Large Language Models as E-Pets in Social Network Services

Hongcheng Guo, Zheyong Xie, Shaosheng Cao, Boyang Wang, Weiting Liu, Zheyu Ye, Zhoujun Li, Zuozhu Liu

arXiv:2506.03761v1 · 4.93 citations · h-index: 16 · CIKM

Originality: Incremental advance

AI Analysis

The paper addresses the need for systematic benchmarking of LLMs in emotionally rich applications such as virtual pets. The advance is incremental because it builds on existing role-playing approaches.

The paper tackles the problem of benchmarking Large Language Models (LLMs) for virtual pet companionship by introducing Pet-Bench, a dedicated benchmark with over 7,500 interaction instances, and finds significant performance variations across 28 LLMs linked to model size and capabilities.

As interest in using Large Language Models (LLMs) for interactive and emotionally rich experiences grows, virtual pet companionship emerges as a novel yet underexplored application. Existing approaches focus on basic pet role-playing interactions without systematically benchmarking LLMs for comprehensive companionship. In this paper, we introduce Pet-Bench, a dedicated benchmark that evaluates LLMs across both self-interaction and human-interaction dimensions. Unlike prior work, Pet-Bench emphasizes self-evolution and developmental behaviors alongside interactive engagement, offering a more realistic reflection of pet companionship. It features diverse tasks such as intelligent scheduling, memory-based dialogues, and psychological conversations, with over 7,500 interaction instances designed to simulate complex pet behaviors. Evaluation of 28 LLMs reveals significant performance variations linked to model size and inherent capabilities, underscoring the need for specialized optimization in this domain. Pet-Bench serves as a foundational resource for benchmarking pet-related LLM abilities and advancing emotionally immersive human-pet interactions.
