CL | Jun 2, 2025

Exploring the Potential of LLMs as Personalized Assistants: Dataset, Evaluation, and Analysis

Jisoo Mok, Ik-hwan Kim, Sangkwon Park, Sungroh Yoon

arXiv:2506.01262v1 | 16.311 citations | h-index: 9 | Has Code | ACL

Originality: Synthesis-oriented

AI Analysis

This work addresses a research gap for AI developers and researchers by providing a benchmark dataset and an automated evaluation model for personalized LLM assistants. Its contribution is incremental, since it builds on existing methods for dataset creation and evaluation.

The authors address the lack of an open-source conversational dataset for personalized AI assistants by introducing HiCUPID, a benchmark comprising a conversational dataset and a Llama-3.2-based automated evaluation model whose assessments closely align with human preferences.

Personalized AI assistants, a hallmark of the human-like capabilities of Large Language Models (LLMs), are a challenging application that intertwines multiple problems in LLM research. Despite the growing interest in the development of personalized assistants, the lack of an open-source conversational dataset tailored for personalization remains a significant obstacle for researchers in the field. To address this research gap, we introduce HiCUPID, a new benchmark to probe and unleash the potential of LLMs to deliver personalized responses. Alongside a conversational dataset, HiCUPID provides a Llama-3.2-based automated evaluation model whose assessment closely mirrors human preferences. We release our dataset, evaluation model, and code at https://github.com/12kimih/HiCUPID.

