RO | AI | Jul 22, 2025

Benchmarking LLM Privacy Recognition for Social Robot Decision Making

arXiv:2507.16124v3 | 7 citations | h-index: 23
Originality: Incremental advance
AI Analysis

This work addresses privacy risks for users of household robots by evaluating LLM privacy recognition and testing prompting strategies to improve it. The contribution is incremental: it builds on existing frameworks and models.

The study examined privacy awareness in large language models (LLMs) used for social robot decision-making. It benchmarked 10 state-of-the-art LLMs against the privacy preferences of 450 surveyed users in household scenarios and found that agreement between humans and LLMs was generally low. It then compared four additional prompting strategies to probe how well LLMs could serve as privacy controllers, and discussed the implications for AI privacy awareness in human-robot interaction.

While robots have previously utilized rule-based systems or probabilistic models for user interaction, the rapid evolution of large language models (LLMs) presents new opportunities to develop LLM-powered robots for enhanced human-robot interaction (HRI). To fully realize these capabilities, however, robots need to collect data such as audio, fine-grained images, video, and locations. As a result, LLMs often process sensitive personal information, particularly within private environments, such as homes. Given the tension between utility and privacy risks, evaluating how current LLMs manage sensitive data is critical. Specifically, we aim to explore the extent to which out-of-the-box LLMs are privacy-aware in the context of household robots. In this work, we present a set of privacy-relevant scenarios developed using the Contextual Integrity (CI) framework. We first surveyed users' privacy preferences regarding in-home robot behaviors and then examined how their privacy orientations affected their choices of these behaviors (N = 450). We then provided the same set of scenarios and questions to state-of-the-art LLMs (N = 10) and found that the agreement between humans and LLMs was generally low. To further investigate the capabilities of LLMs as potential privacy controllers, we implemented four additional prompting strategies and compared their results. We discuss the performance of the evaluated models as well as the implications and potential of AI privacy awareness in human-robot interaction.
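The human-versus-LLM comparison described in the abstract can be illustrated with a small sketch. The abstract does not say which agreement metric the authors used, so the two measures below (raw percent agreement and Cohen's kappa against the human majority answer) are assumptions chosen for illustration. The scenario IDs, the labels, and the names `majority_label`, `percent_agreement`, and `cohens_kappa` are likewise hypothetical, and the data is made up rather than taken from the paper.

```python
from collections import Counter


def majority_label(responses):
    """Return the most frequent label among the human responses to one scenario."""
    return Counter(responses).most_common(1)[0][0]


def percent_agreement(a, b):
    """Fraction of positions at which two equal-length label lists match."""
    if len(a) != len(b) or not a:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a, b):
    """Chance-corrected agreement between two raters over the same items."""
    n = len(a)
    p_observed = percent_agreement(a, b)
    counts_a, counts_b = Counter(a), Counter(b)
    # Expected agreement if both raters labelled independently
    # with their own marginal label frequencies.
    p_expected = sum(counts_a[l] * counts_b[l] for l in set(a) | set(b)) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical toy data: three human answers per scenario, one LLM answer each.
human_responses = {
    "s1": ["acceptable", "acceptable", "unacceptable"],
    "s2": ["unacceptable", "unacceptable", "unacceptable"],
    "s3": ["acceptable", "unacceptable", "unacceptable"],
    "s4": ["acceptable", "acceptable", "acceptable"],
}
llm_answers = {
    "s1": "acceptable",
    "s2": "acceptable",
    "s3": "unacceptable",
    "s4": "unacceptable",
}

scenario_ids = sorted(human_responses)
human = [majority_label(human_responses[s]) for s in scenario_ids]
llm = [llm_answers[s] for s in scenario_ids]

print(percent_agreement(human, llm))
print(cohens_kappa(human, llm))
```

Reducing each scenario to a human majority label discards the spread of individual preferences. A study of this kind might instead compare the LLM against each participant, or against groups with different privacy orientations.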

