Antonio Sgorbissa

h-index: 29

14 papers

2,655 citations

Novelty: 36%

AI Score: 45

Ranked #42,074 of 194,257 authors (top 22%); #1,133 in RO (top 17%)

14 Papers

7.1 | RO | Jun 3 | Code

BPDA-GMM: Bayesian Probabilistic Data Association via Gaussian Mixture Models for Semantic SLAM

Thanh Nguyen Canh, Haolan Zhang, Xiem HoangVan et al.

Probabilistic data association (PDA) improves semantic SLAM in perceptually aliased scenes, but existing methods often assume a fixed landmark set, recompute association weights as the map grows, or rely on hand-tuned null-hypothesis weights. To address these limitations, we propose BPDA-GMM, an online Bayesian PDA framework for semantic SLAM with a growing object-level map. BPDA-GMM uses a Dirichlet-process prior to induce a Chinese Restaurant Process (CRP) association model, where accumulated evidence favors existing landmarks, and the concentration parameter assigns probability mass to new landmarks. For each semantic detection, plausible candidates are selected by a joint semantic-geometric gate, CRP-weighted association probabilities are computed, and object landmarks are updated as semantic Gaussians in closed form. The resulting landmark set forms a Gaussian mixture model, and its dominant component is passed to the back-end as a max-mixture semantic factor. When association weights are inconclusive, an ambiguity-triggered α-divergence tempering step improves discrimination. Finally, a decoupled back-end zeroes the pose Jacobian of semantic factors, allowing noisy detections to refine landmarks without directly perturbing the trajectory. Experiments in simulation and on a real indoor dataset demonstrate improved trajectory accuracy, semantic mapping quality, and robustness to perceptual aliasing and classifier errors over state-of-the-art baselines. Code and video are publicly available at https://github.com/thanhnguyencanh/BPDA-SLAM.
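As a rough illustration of the CRP-weighted association step described in this abstract, the sketch below combines the textbook CRP prior (existing landmark k with probability proportional to its count, a new landmark with probability proportional to the concentration parameter) with a per-landmark measurement likelihood. It is not taken from the BPDA-SLAM repository; the function name, argument names, and the single scalar likelihood for a new landmark are assumptions made for this example.

```python
def crp_association_weights(counts, likelihoods, alpha, new_likelihood):
    """Normalized association probabilities under a CRP prior (illustrative).

    counts[k]      : detections already assigned to landmark k
    likelihoods[k] : likelihood of the current detection under landmark k
    alpha          : CRP concentration parameter (prior mass for a new landmark)
    new_likelihood : assumed likelihood of the detection under a new landmark

    The last entry of the returned list is the new-landmark probability.
    """
    if len(counts) != len(likelihoods):
        raise ValueError("counts and likelihoods must have the same length")
    if alpha <= 0:
        raise ValueError("alpha must be positive")

    n = sum(counts)
    # CRP prior: n_k / (n + alpha) for existing landmarks,
    # alpha / (n + alpha) for a new one; multiply by the likelihood.
    unnormalized = [c / (n + alpha) * l for c, l in zip(counts, likelihoods)]
    unnormalized.append(alpha / (n + alpha) * new_likelihood)

    total = sum(unnormalized)
    if total <= 0:
        raise ValueError("all association hypotheses have zero weight")
    return [w / total for w in unnormalized]


if __name__ == "__main__":
    # Two existing landmarks with 3 and 1 prior detections, equal likelihoods.
    print(crp_association_weights([3, 1], [0.5, 0.5], alpha=1.0, new_likelihood=0.5))
```

With equal likelihoods the weights reduce to the prior, so the landmark with more accumulated detections is favored, which matches the abstract's statement that accumulated evidence favors existing landmarks.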

5.5 | RO | Jul 12, 2022

Diversity-aware social robots meet people: beyond context-aware embodied AI

Carmine Recchiuto, Antonio Sgorbissa

The article introduces the concept of "diversity-aware" robotics and discusses the need to develop computational models to embed robots with diversity-awareness: that is, robots capable of adapting and re-configuring their behavior to recognize, respect, and value the uniqueness of the person they interact with to promote inclusion regardless of their age, race, gender, cognitive or physical capabilities, etc. Finally, the article discusses possible technical solutions based on Ontologies and Bayesian Networks, starting from previous experience with culturally competent robots.

4.9 | RO | Jun 5

HORUS: A Mixed Reality Interface for Managing Teams of Mobile Robots

Omotoye Shamsudeen Adekoya, Antonio Sgorbissa, Carmine Tommaso Recchiuto

Mixed Reality (MR) interfaces have been extensively explored for controlling mobile robots, but there is limited research on their application to managing teams of robots. This paper presents HORUS: Holistic Operational Reality for Unified Systems, a Mixed Reality interface offering a comprehensive set of tools for managing multiple mobile robots simultaneously. HORUS enables operators to monitor individual robot statuses, visualize sensor data projected in real time, and assign tasks to single robots, subsets of the team, or the entire group, all from a Mini-Map (Ground Station). The interface also provides different teleoperation modes: a mini-map mode that allows teleoperation while observing the robot model and its transform on the mini-map, and a semi-immersive mode that offers a flat, screen-like view in either single or stereo view (3D). We conducted a user study in which participants used HORUS to manage a team of mobile robots tasked with finding clues in an environment, simulating search and rescue tasks. This study compared HORUS's full-team management capabilities with individual robot teleoperation. The experiments validated the versatility and effectiveness of HORUS in multi-robot coordination, demonstrating its potential to advance human-robot collaboration in dynamic, team-based environments.

4.1 | RO | Jul 31, 2024

Moderating Group Conversation Dynamics with Social Robots

Lucrezia Grassi, Carmine Tommaso Recchiuto, Antonio Sgorbissa

This research investigates the impact of social robot participation in group conversations and assesses the effectiveness of various addressing policies. The study involved 300 participants, divided into groups of four, interacting with a humanoid robot serving as the moderator. The robot utilized conversation data to determine the most appropriate speaker to address. The findings indicate that the robot's addressing policy significantly influenced conversation dynamics, resulting in more balanced attention to each participant and a reduction in subgroup formation.

3.3 | CV | Jun 29

SICAGE: Speaker-Independent Culture-Aware Gesture Generation using TED4C-L Dataset

Ariel Gjaci, Antonio Sgorbissa, Vittorio Murino

Recent co-speech gesture generation methods often overlook cultural differences, limiting their effectiveness in human-agent interaction. Moreover, culture-conditioned models are rarely evaluated under speaker-disjoint splits, so apparent "cultural" behavior may be confounded with speaker-specific gesturing style. We introduce SICAGE, a modular framework for culture-aware co-speech gesture generation that conditions motion synthesis models on speaker-independent cultural representations. SICAGE learns these representations from audio and text by treating each speaker as a separate domain while imposing invariance across speakers. This encourages representations to remain culture-discriminative while reducing dependence on speaker identity. The resulting cultural embeddings condition a multimodal generator to produce culturally appropriate gestures. We instantiate this idea with two domain generalization approaches: adversarial learning and Fishr regularization. We further introduce ALaDiT, a real-time diffusion-based gesture generator designed to efficiently incorporate the learned cultural embeddings. To validate our method, we built TED4C-L, a 106-hour multimodal dataset of 764 TED speakers from four cultural groups. Experiments show that SICAGE improves motion realism, diversity, beat synchronization, semantic relevance, and cultural consistency.

6.5 | RO | Jun 5

A Multi-Operator Mixed-Reality Interface for Multi-Robot Control and Coordination: Co-Located and Private Workspace Collaboration

Omotoye Shamsudeen Adekoya, Antonio Sgorbissa, Carmine Tommaso Recchiuto

Multi-operator control of robot teams requires not only access to the same mission information, but also mechanisms for maintaining shared awareness and preventing conflicting interventions. Building on our previous HORUS interface (Holistic Operational Reality for Unified Systems), we present a mixed-reality interface that extends single-operator multi-robot supervision to collaborative multi-operator use. The system supports two complementary modes: a co-located shared workspace, in which operators observe and manipulate the same mini-map in the same physical location, and a private-workspace mode, in which operators work on the same mission through independently placed local workspaces. The architecture combines registration-driven scene construction, lightweight shared-session synchronization, and per-robot control leases to support collaborative monitoring, tasking, and teleoperation while preventing conflicting commands. We evaluated the approach in a human-subject study with 36 participants (18 pairs) controlling three Nova Carter mobile robots in two search environments. Objective task performance was comparable across the two modes, indicating that both modes supported effective mission execution. However, the co-located shared workspace significantly improved perceived collaboration, shared understanding, and handoff clarity, and was the preferred collaborative mode. These results indicate that physically co-locating the MR workspace improves how operators coordinate even when the underlying robot-control tools remain unchanged.
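The abstract does not say how the per-robot control leases are implemented. As a minimal sketch of the general idea, assuming a lease is simply exclusive ownership of one robot by one operator until released, the hypothetical `LeaseTable` class below rejects commands from anyone who does not hold the lease. All names are illustrative and do not come from the HORUS codebase.

```python
class LeaseTable:
    """Tracks which operator currently holds control of each robot (illustrative)."""

    def __init__(self):
        self._owner = {}  # robot_id -> operator_id

    def acquire(self, robot_id, operator_id):
        """Grant the lease if the robot is free or already held by this operator."""
        owner = self._owner.setdefault(robot_id, operator_id)
        return owner == operator_id

    def release(self, robot_id, operator_id):
        """Release the lease; only the current holder may do so."""
        if self._owner.get(robot_id) == operator_id:
            del self._owner[robot_id]
            return True
        return False

    def may_command(self, robot_id, operator_id):
        """A command is accepted only from the operator holding the lease."""
        return self._owner.get(robot_id) == operator_id


if __name__ == "__main__":
    leases = LeaseTable()
    print(leases.acquire("robot_1", "operator_A"))      # first request is granted
    print(leases.acquire("robot_1", "operator_B"))      # conflicting request is refused
    print(leases.may_command("robot_1", "operator_B"))  # B cannot command robot_1
```

A real system would likely add lease expiry and synchronization across headsets; both are omitted here to keep the sketch short.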

7.1 | RO | Jun 25, 2024

Enhancing LLM-Based Human-Robot Interaction with Nuances for Diversity Awareness

Lucrezia Grassi, Carmine Tommaso Recchiuto, Antonio Sgorbissa

This paper presents a system for diversity-aware autonomous conversation leveraging the capabilities of large language models (LLMs). The system adapts to diverse populations and individuals, considering factors like background, personality, age, gender, and culture. The conversation flow is guided by the structure of the system's pre-established knowledge base, while LLMs are tasked with various functions, including generating diversity-aware sentences. Achieving diversity-awareness involves providing carefully crafted prompts to the models, incorporating comprehensive information about users, conversation history, contextual details, and specific guidelines. To assess the system's performance, we conducted both controlled and real-world experiments, measuring a wide range of performance indicators.

6.9 | RO | Feb 2, 2022

Thermal and Visual Tracking of Photovoltaic Plants for Autonomous UAV inspection

Luca Morando, Carmine Tommaso Recchiuto, Jacopo Callà et al.

Since photovoltaic (PV) plants require periodic maintenance, using Unmanned Aerial Vehicles (UAV) for inspections can help reduce costs. The thermal and visual inspection of PV installations is currently based on UAV photogrammetry. A UAV equipped with a Global Positioning System (GPS) receiver is assigned a flight zone: the UAV will cover it back and forth to collect images to be later composed into an orthomosaic. The UAV typically flies at a height above the ground that is appropriate to ensure that images overlap even in the presence of GPS positioning errors. However, this approach has two limitations. Firstly, it requires covering the whole flight zone, including "empty" areas between PV module rows. Secondly, flying high above the ground limits the resolution of the images to be later inspected. The article proposes a novel approach using an autonomous UAV equipped with an RGB and a thermal camera for PV module tracking. The UAV moves along PV module rows at a lower height than usual and inspects them back and forth in a boustrophedon pattern, ignoring "empty" areas with no PV modules. Experimental tests performed in simulation and at an actual PV plant are reported.
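To make the boustrophedon traversal concrete, here is a minimal sketch that orders waypoints so that consecutive rows are flown in opposite directions. It assumes the endpoints of each PV row are already known and listed in order across the field. The paper instead tracks the rows with the onboard cameras, so this only illustrates the visiting order, and the function name is hypothetical.

```python
def boustrophedon_waypoints(rows):
    """Return waypoints that traverse each row, alternating direction.

    rows: list of (start, end) endpoint pairs, one per PV module row,
          ordered from the first row to the last across the field.
    """
    waypoints = []
    for index, (start, end) in enumerate(rows):
        if index % 2 == 0:
            # Even rows are flown start -> end.
            waypoints.extend([start, end])
        else:
            # Odd rows are flown end -> start, so the UAV turns around
            # at the row boundary instead of flying back over empty ground.
            waypoints.extend([end, start])
    return waypoints


if __name__ == "__main__":
    rows = [((0, 0), (10, 0)), ((0, 2), (10, 2)), ((0, 4), (10, 4))]
    for point in boustrophedon_waypoints(rows):
        print(point)
```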

2.6 | CV | Jan 5, 2022

Culture-to-Culture Image Translation and User Evaluation

Giulia Zaino, Carmine Tommaso Recchiuto, Antonio Sgorbissa

The article introduces the concept of image "culturization," which we define as the process of altering the "brushstroke of cultural features" that make objects perceived as belonging to a given culture while preserving their functionalities. First, we defined a pipeline for translating objects' images from a source to a target cultural domain based on state-of-the-art Generative Adversarial Networks. Then, we gathered data through an online questionnaire to test four hypotheses concerning the impact of images belonging to different cultural domains on Italian participants. As expected, results depend on individual tastes and preferences; however, they align with our conjecture that some people, during the interaction with an intelligent system, will prefer to be shown images modified to match their cultural background. The study has two main limitations. First, we focussed on the culturization of individual objects instead of complete scenes. However, objects play a crucial role in conveying cultural meanings and can strongly influence how an image is perceived within a specific cultural context. Understanding and addressing object-level translation is a vital step toward achieving more comprehensive scene-level translation in future research. Second, we performed experiments with Italian participants only. We think that there are unique aspects of Italian culture that make it an interesting and relevant case study for exploring the impact of image culturization. Italy is a very culturally conservative society, and Italians have specific sensitivities and expectations regarding the accurate representation of their cultural identity and traditions, which can shape individuals' preferences and inclinations toward certain visual styles, aesthetics, and design choices. As a consequence, we think they are an ideal candidate for a preliminary investigation of image culturization.

12.8 | RO | Aug 4, 2021

Knowledge-Grounded Dialogue Flow Management for Social Robots and Conversational Agents

Lucrezia Grassi, Carmine Tommaso Recchiuto, Antonio Sgorbissa

The article proposes a system for knowledge-based conversation designed for Social Robots and other conversational agents. The proposed system relies on an Ontology for the description of all concepts that may be relevant conversation topics, as well as their mutual relationships. The article focuses on the algorithm for Dialogue Management that selects the most appropriate conversation topic depending on the user's input. Moreover, it discusses strategies to ensure a conversation flow that captures, as coherently as possible, the user's intention to drive the conversation in specific directions while avoiding purely reactive responses to what the user says. To measure the quality of the conversation, the article reports the tests performed with 100 recruited participants, comparing five conversational agents: (i) an agent addressing dialogue flow management based only on the detection of keywords in the speech, (ii) an agent based both on the detection of keywords and the Content Classification feature of Google Cloud Natural Language, (iii) an agent that picks conversation topics randomly, (iv) a human pretending to be a chatbot, and (v) one of the most famous chatbots worldwide: Replika. The subjective perception of the participants is measured both with the SASSI (Subjective Assessment of Speech System Interfaces) tool and with a custom survey for measuring the subjective perception of coherence.

5.3 | RO | Apr 22, 2021

Knowledge Triggering, Extraction and Storage via Human-Robot Verbal Interaction

Lucrezia Grassi, Carmine Tommaso Recchiuto, Antonio Sgorbissa

This article describes a novel approach to expand at run time the knowledge base of an Artificial Conversational Agent. A technique for automatic knowledge extraction from the user's sentence and four methods to insert the newly acquired concepts into the knowledge base have been developed and integrated into a system that has already been tested for knowledge-based conversation between a social humanoid robot and residents of care homes. The run-time addition of new knowledge helps overcome a limitation that affects most robots and chatbots: the inability to engage the user for a long time due to the restricted number of conversation topics. The insertion in the knowledge base of new concepts recognized in the user's sentence is expected to result in a wider range of topics that can be covered during an interaction, making the conversation less repetitive. Two experiments are presented to assess the performance of the knowledge extraction technique, and the efficiency of the developed insertion methods when adding several concepts to the Ontology.

5.1 | CY | Mar 22, 2018

Paving the Way for Culturally Competent Robots: a Position Paper

Barbara Bruno, Nak Young Chong, Hiroko Kamide et al.

Cultural competence is a well-known requirement for effective healthcare, widely investigated in the nursing literature. We claim that personal assistive robots should likewise be culturally competent, aware of general cultural characteristics and of the different forms they take in different individuals, and sensitive to cultural differences while perceiving, reasoning, and acting. Drawing inspiration from existing guidelines for culturally competent healthcare and the state-of-the-art in culturally competent robotics, we identify the key robot capabilities which enable culturally competent behaviours and discuss methodologies for their development and evaluation.

1.7 | CV | Mar 21, 2018

Modelling the Influence of Cultural Information on Vision-Based Human Home Activity Recognition

Roberto Menicatti, Barbara Bruno, Antonio Sgorbissa

Daily life activities, such as eating and sleeping, are deeply influenced by a person's culture, hence generating differences in the way the same activity is performed by individuals belonging to different cultures. We argue that taking cultural information into account can improve the performance of systems for the automated recognition of human activities. We propose four different solutions to the problem and present a system which uses a Naive Bayes model to associate cultural information with semantic information extracted from still images. Preliminary experiments with a dataset of images of individuals lying on the floor, sleeping on a futon and sleeping on a bed suggest that: i) solutions explicitly taking cultural information into account are more accurate than culture-unaware solutions; and ii) the proposed system is a promising starting point for the development of culture-aware Human Activity Recognition methods.

4.5 | RO | Aug 21, 2017

The CARESSES EU-Japan project: making assistive robots culturally competent

Barbara Bruno, Nak Young Chong, Hiroko Kamide et al.

The nursing literature shows that cultural competence is an important requirement for effective healthcare. We claim that personal assistive robots should likewise be culturally competent, that is, they should be aware of general cultural characteristics and of the different forms they take in different individuals, and take these into account while perceiving, reasoning, and acting. The CARESSES project is a Europe-Japan collaborative effort that aims at designing, developing and evaluating culturally competent assistive robots. These robots will be able to adapt the way they behave, speak and interact to the cultural identity of the person they assist. This paper describes the approach taken in the CARESSES project, its initial steps, and its future plans.