ROJun 7, 2022
The Road to a Successful HRI: AI, Trust and ethicS-TRAITSAlessandra Rossi, Antonio Andriella, Silvia Rossi et al.
The aim of this workshop is to foster the exchange of insights on past and ongoing research towards effective and long-lasting collaborations between humans and robots. This workshop will provide a forum for representatives from academia and industry communities to analyse the different aspects of HRI that impact on its success. We particularly focus on AI techniques required to implement autonomous and proactive interactions, on the factors that enhance, undermine, or recover humans' acceptance and trust in robots, and on the potential ethical and legal concerns related to the deployment of such robots in human-centred environments. Website: https://sites.google.com/view/traits-hri-2022
ROJul 10, 2023
Proceeding of the 1st Workshop on Social Robots Personalisation At the crossroads between engineering and humanities (CONCATENATE)Imene Tarakli, Georgios Angelopoulos, Mehdi Hellou et al.
Nowadays, robots are expected to interact more physically, cognitively, and socially with people. They should adapt to unpredictable contexts alongside individuals with various behaviours. For this reason, personalisation is a valuable attribute for social robots as it allows them to act according to a specific user's needs and preferences and achieve natural and transparent robot behaviours for humans. If correctly implemented, personalisation could also be the key to the large-scale adoption of social robotics. However, achieving personalisation is arduous as it requires us to expand the boundaries of robotics by taking advantage of the expertise of various domains. Indeed, personalised robots need to analyse and model user interactions while considering their involvement in the adaptative process. It also requires us to address ethical and socio-cultural aspects of personalised HRI to achieve inclusive and diverse interaction and avoid deception and misplaced trust when interacting with the users. At the same time, policymakers need to ensure regulations in view of possible short-term and long-term adaptive HRI. This workshop aims to raise an interdisciplinary discussion on personalisation in robotics. It aims at bringing researchers from different fields together to propose guidelines for personalisation while addressing the following questions: how to define it - how to achieve it - and how it should be guided to fit legal and ethical requirements.
AISep 26, 2022
An Application of a Runtime Epistemic Probabilistic Event Calculus to Decision-making in e-Health SystemsFabio Aurelio D'Asaro, Luca Raggioli, Salim Malek et al.
We present and discuss a runtime architecture that integrates sensorial data and classifiers with a logic-based decision-making system in the context of an e-Health system for the rehabilitation of children with neuromotor disorders. In this application, children perform a rehabilitation task in the form of games. The main aim of the system is to derive a set of parameters the child's current level of cognitive and behavioral performance (e.g., engagement, attention, task accuracy) from the available sensors and classifiers (e.g., eye trackers, motion sensors, emotion recognition techniques) and take decisions accordingly. These decisions are typically aimed at improving the child's performance by triggering appropriate re-engagement stimuli when their attention is low, by changing the game or making it more difficult when the child is losing interest in the task as it is too easy. Alongside state-of-the-art techniques for emotion recognition and head pose estimation, we use a runtime variant of a probabilistic and epistemic logic programming dialect of the Event Calculus, known as the Epistemic Probabilistic Event Calculus. In particular, the probabilistic component of this symbolic framework allows for a natural interface with the machine learning techniques. We overview the architecture and its components, and show some of its characteristics through a discussion of a running example and experiments. Under consideration for publication in Theory and Practice of Logic Programming (TPLP).
ROMar 13, 2022
A ROS Architecture for Personalised HRI with a Bartender Social RobotAlessandra Rossi, Maria Di Maro, Antonio Origlia et al.
BRILLO (Bartending Robot for Interactive Long-Lasting Operations) project has the overall goal of creating an autonomous robotic bartender that can interact with customers while accomplishing its bartending tasks. In such a scenario, people's novelty effect connected to the use of an attractive technology is destined to wear off and, consequently, it negatively affects the success of the service robotics application. For this reason, providing personalised natural interaction while accessing its services is of paramount importance for increasing users' engagement and, consequently, their loyalty. In this paper, we present the developed three-layers ROS architecture integrating a perception layer managing the processing of different social signals, a decision-making layer for handling multi-party interactions, and an execution layer controlling the behaviour of a complex robot composed of arms and a face. Finally, user modelling through a beliefs layer allows for personalised interaction.
CVJul 19, 2023
AGAR: Attention Graph-RNN for Adaptative Motion Prediction of Point Clouds of Deformable ObjectsPedro Gomes, Silvia Rossi, Laura Toni
This paper focuses on motion prediction for point cloud sequences in the challenging case of deformable 3D objects, such as human body motion. First, we investigate the challenges caused by deformable shapes and complex motions present in this type of representation, with the ultimate goal of understanding the technical limitations of state-of-the-art models. From this understanding, we propose an improved architecture for point cloud prediction of deformable 3D objects. Specifically, to handle deformable shapes, we propose a graph-based approach that learns and exploits the spatial structure of point clouds to extract more representative features. Then we propose a module able to combine the learned features in an adaptative manner according to the point cloud movements. The proposed adaptative module controls the composition of local and global motions for each point, enabling the network to model complex motions in deformable 3D objects more effectively. We tested the proposed method on the following datasets: MNIST moving digits, the Mixamo human bodies motions, JPEG and CWIPC-SXR real-world dynamic bodies. Simulation results demonstrate that our method outperforms the current baseline methods given its improved ability to model complex movements as well as preserve point cloud shape. Furthermore, we demonstrate the generalizability of the proposed framework for dynamic feature learning, by testing the framework for action recognition on the MSRAction3D dataset and achieving results on-par with state-of-the-art methods
AIMar 25
Resisting Humanization: Ethical Front-End Design Choices in AI for Sensitive ContextsSilvia Rossi, Diletta Huyskes, Mackenzie Jorgensen
Ethical debates in AI have primarily focused on back-end issues such as data governance, model training, and algorithmic decision-making. Less attention has been paid to the ethical significance of front-end design choices, such as the interaction and representation-based elements through which users interact with AI systems. This gap is particularly significant for Conversational User Interfaces (CUI) based on Natural Language Processing (NLP) systems, where humanizing design elements such as dialogue-based interaction, emotive language, personality modes, and anthropomorphic metaphors are increasingly prevalent. This work argues that humanization in AI front-end design is a value-driven choice that profoundly shapes users' mental models, trust calibration, and behavioral responses. Drawing on research in human-computer interaction (HCI), conversational AI, and value-sensitive design, we examine how interfaces can play a central role in misaligning user expectations, fostering misplaced trust, and subtly undermining user autonomy, especially in vulnerable contexts. To ground this analysis, we discuss two AI systems developed by Chayn, a nonprofit organization supporting survivors of gender-based violence. Chayn is extremely cautious when building AI that interacts with or impacts survivors by operationalizing their trauma-informed design principles. This Chayn case study illustrates how ethical considerations can motivate principled restraint in interface design, challenging engagement-based norms in contemporary AI products. We argue that ethical front-end AI design is a form of procedural ethics, enacted through interaction choices rather than embedded solely in system logic.
RONov 11, 2025
PerspAct: Enhancing LLM Situated Collaboration Skills through Perspective Taking and Active VisionSabrina Patania, Luca Annese, Anita Pellegrini et al.
Recent advances in Large Language Models (LLMs) and multimodal foundation models have significantly broadened their application in robotics and collaborative systems. However, effective multi-agent interaction necessitates robust perspective-taking capabilities, enabling models to interpret both physical and epistemic viewpoints. Current training paradigms often neglect these interactive contexts, resulting in challenges when models must reason about the subjectivity of individual perspectives or navigate environments with multiple observers. This study evaluates whether explicitly incorporating diverse points of view using the ReAct framework, an approach that integrates reasoning and acting, can enhance an LLM's ability to understand and ground the demands of other agents. We extend the classic Director task by introducing active visual exploration across a suite of seven scenarios of increasing perspective-taking complexity. These scenarios are designed to challenge the agent's capacity to resolve referential ambiguity based on visual access and interaction, under varying state representations and prompting strategies, including ReAct-style reasoning. Our results demonstrate that explicit perspective cues, combined with active exploration strategies, significantly improve the model's interpretative accuracy and collaborative effectiveness. These findings highlight the potential of integrating active perception with perspective-taking mechanisms in advancing LLMs' application in robotics and multi-agent systems, setting a foundation for future research into adaptive and context-aware AI systems.
CLSep 15, 2025
Growing Perspectives: Modelling Embodied Perspective Taking and Inner Narrative Development Using Large Language ModelsSabrina Patania, Luca Annese, Anna Lambiase et al.
Language and embodied perspective taking are essential for human collaboration, yet few computational models address both simultaneously. This work investigates the PerspAct system [1], which integrates the ReAct (Reason and Act) paradigm with Large Language Models (LLMs) to simulate developmental stages of perspective taking, grounded in Selman's theory [2]. Using an extended director task, we evaluate GPT's ability to generate internal narratives aligned with specified developmental stages, and assess how these influence collaborative performance both qualitatively (action selection) and quantitatively (task efficiency). Results show that GPT reliably produces developmentally-consistent narratives before task execution but often shifts towards more advanced stages during interaction, suggesting that language exchanges help refine internal representations. Higher developmental stages generally enhance collaborative effectiveness, while earlier stages yield more variable outcomes in complex contexts. These findings highlight the potential of integrating embodied perspective taking and language in LLMs to better model developmental dynamics and stress the importance of evaluating internal speech during combined linguistic and embodied tasks.
AIAug 20, 2025
Who Sees What? Structured Thought-Action Sequences for Epistemic Reasoning in LLMsLuca Annese, Sabrina Patania, Silvia Serino et al.
Recent advances in large language models (LLMs) and reasoning frameworks have opened new possibilities for improving the perspective -taking capabilities of autonomous agents. However, tasks that involve active perception, collaborative reasoning, and perspective taking (understanding what another agent can see or knows) pose persistent challenges for current LLM-based systems. This study investigates the potential of structured examples derived from transformed solution graphs generated by the Fast Downward planner to improve the performance of LLM-based agents within a ReAct framework. We propose a structured solution-processing pipeline that generates three distinct categories of examples: optimal goal paths (G-type), informative node paths (E-type), and step-by-step optimal decision sequences contrasting alternative actions (L-type). These solutions are further converted into ``thought-action'' examples by prompting an LLM to explicitly articulate the reasoning behind each decision. While L-type examples slightly reduce clarification requests and overall action steps, they do not yield consistent improvements. Agents are successful in tasks requiring basic attentional filtering but struggle in scenarios that required mentalising about occluded spaces or weighing the costs of epistemic actions. These findings suggest that structured examples alone are insufficient for robust perspective-taking, underscoring the need for explicit belief tracking, cost modelling, and richer environments to enable socially grounded collaboration in LLM-based agents.
RONov 11, 2024
Enhancing Robot Assistive Behaviour with Reinforcement Learning and Theory of MindAntonio Andriella, Giovanni Falcone, Silvia Rossi
The adaptation to users' preferences and the ability to infer and interpret humans' beliefs and intents, which is known as the Theory of Mind (ToM), are two crucial aspects for achieving effective human-robot collaboration. Despite its importance, very few studies have investigated the impact of adaptive robots with ToM abilities. In this work, we present an exploratory comparative study to investigate how social robots equipped with ToM abilities impact users' performance and perception. We design a two-layer architecture. The Q-learning agent on the first layer learns the robot's higher-level behaviour. On the second layer, a heuristic-based ToM infers the user's intended strategy and is responsible for implementing the robot's assistance, as well as providing the motivation behind its choice. We conducted a user study in a real-world setting, involving 56 participants who interacted with either an adaptive robot capable of ToM, or with a robot lacking such abilities. Our findings suggest that participants in the ToM condition performed better, accepted the robot's assistance more often, and perceived its ability to adapt, predict and recognise their intents to a higher degree. Our preliminary insights could inform future research and pave the way for designing more complex computation architectures for adaptive behaviour with ToM capabilities.
HCDec 17, 2021
Extending 3-DoF Metrics to Model User Behaviour Similarity in 6-DoF Immersive ApplicationsSilvia Rossi, Irene Viola, Laura Toni et al.
Immersive reality technologies, such as Virtual and Augmented Reality, have ushered a new era of user-centric systems, in which every aspect of the coding--delivery--rendering chain is tailored to the interaction of the users. Understanding the actual interactivity and behaviour of the users is still an open challenge and a key step to enabling such a user-centric system. Our main goal is to extend the applicability of existing behavioural methodologies for studying user navigation in the case of 6 Degree-of-Freedom (DoF). Specifically, we first compare the navigation in 6-DoF with its 3-DoF counterpart highlighting the main differences and novelties. Then, we define new metrics aimed at better modelling behavioural similarities between users in a 6-DoF system. We validate and test our solutions on real navigation paths of users interacting with dynamic volumetric media in 6-DoF Virtual Reality conditions. Our results show that metrics that consider both user position and viewing direction better perform in detecting user similarity while navigating in a 6-DoF system. Having easy-to-use but robust metrics that underpin multiple tools and answer the question ``how do we detect if two users look at the same content?" open the gate to new solutions for a user-centric system.
ROMar 24, 2021
I Know What You Would Like to Drink: Benefits and Detriments of Sharing Personal Info with a Bartender RobotAlessandra Rossi, Vito Giura, Carmine Di Leva et al.
This paper introduces benefits and detriments of a robot bartender that is capable of adapting the interaction with human users according to their preferences in drinks, music, and hobbies. We believe that a personalised experience during a human-robot interaction increases the human user's engagement with the robot and that such information will be used by the robot during the interaction. However, this implies that the users need to share several personal information with the robot. In this paper, we introduce the research topic and our approach to evaluate people's perceptions and consideration of their privacy with a robot. We present a within-subject study in which participants interacted twice with a robot that firstly had not any previous info about the users, and, then, having a knowledge of their preferences. We observed that less than 60\% of the participants were not concerned about sharing personal information with the robot.
ROMar 23, 2021
The Road to a Successful HRI: AI, Trust and ethicS-TRAITSAntonio Andriella, Alessandra Rossi, Silvia Rossi et al.
The aim of this workshop is to give researchers from academia and industry the possibility to discuss the inter-and multi-disciplinary nature of the relationships between people and robots towards effective and long-lasting collaborations. This workshop will provide a forum for the HRI and robotics communities to explore successful human-robot interaction (HRI) to analyse the different aspects of HRI that impact its success. Particular focus are the AI algorithms required to implement autonomous interactions, and the factors that enhance, undermine, or recover humans' trust in robots. Finally, potential ethical and legal concerns, and how they can be addressed will be considered. Website: https://sites.google.com/view/traits-hri
CVFeb 15, 2021
Spatio-temporal Graph-RNN for Point Cloud PredictionPedro Gomes, Silvia Rossi, Laura Toni
In this paper, we propose an end-to-end learning network to predict future frames in a point cloud sequence. As main novelty, an initial layer learns topological information of point clouds as geometric features, to form representative spatio-temporal neighborhoods. This module is followed by multiple Graph-RNN cells. Each cell learns points dynamics (i.e., RNN states) by processing each point jointly with the spatio-temporal neighbouring points. We tested the network performance with a MINST dataset of moving digits, a synthetic human bodies motions and JPEG dynamic bodies datasets. Simulation results demonstrate that our method outperforms baseline ones that neglect geometry features information.
ROMay 12, 2020
Towards Transparency of TD-RL Robotic Systems with a Human TeacherMarco Matarese, Silvia Rossi, Alessandra Sciutti et al.
The high request for autonomous and flexible HRI implies the necessity of deploying Machine Learning (ML) mechanisms in the robot control. Indeed, the use of ML techniques, such as Reinforcement Learning (RL), makes the robot behaviour, during the learning process, not transparent to the observing user. In this work, we proposed an emotional model to improve the transparency in RL tasks for human-robot collaborative scenarios. The architecture we propose supports the RL algorithm with an emotional model able to both receive human feedback and exhibit emotional responses based on the learning process. The model is entirely based on the Temporal Difference (TD) error. The architecture was tested in an isolated laboratory with a simple setup. The results highlight that showing its internal state through an emotional response is enough to make a robot transparent to its human teacher. People also prefer to interact with a responsive robot because they are used to understand their intentions via emotions and social signals.
ROSep 11, 2019
Emotional Distraction for Children Anxiety Reduction During VaccinationMartina Ruocco, Marwa Larafa, Silvia Rossi
Social assistive robots are starting to be widely used in pediatric health-care environments with the aim of distracting and entertaining children, and so of reducing a possible state of anxiety. In this paper, we present some initial results of a study (N=69) conducted in a Health-Vaccines Center, where the distraction role of a social robot, which interacts with a child showing an emotional behavior, is compared with the same not showing any emotional social cue. Outcome criteria for the evaluation of the intervention included the parents reported level of anxiety before, during and after the procedure.
MMNov 13, 2018
Spherical clustering of users navigating 360° contentSilvia Rossi, Francesca De Simone, Pascal Frossard et al.
In Virtual Reality (VR) applications, understanding how users explore the omnidirectional content is important to optimize content creation, to develop user-centric services, or even to detect disorders in medical applications. Clustering users based on their common navigation patterns is a first direction to understand users behaviour. However, classical clustering techniques fail in identifying these common paths, since they are usually focused on minimizing a simple distance metric. In this paper, we argue that minimizing the distance metric does not necessarily guarantee to identify users that experience similar navigation path in the VR domain. Therefore, we propose a graph-based method to identify clusters of users who are attending the same portion of the spherical content over time. The proposed solution takes into account the spherical geometry of the content and aims at clustering users based on the actual overlap of displayed content among users. Our method is tested on real VR user navigation patterns. Results show that our solution leads to clusters in which at least 85% of the content displayed by one user is shared among the other users belonging to the same cluster.