LGMar 12, 2024
Federated Learning of Socially Appropriate Agent Behaviours in Simulated Home EnvironmentsSaksham Checker, Nikhil Churamani, Hatice Gunes
As social robots become increasingly integrated into daily life, ensuring their behaviours align with social norms is crucial. For their widespread open-world application, it is important to explore Federated Learning (FL) settings where individual robots can learn about their unique environments while also learning from each others' experiences. In this paper, we present a novel FL benchmark that evaluates different strategies, using multi-label regression objectives, where each client individually learns to predict the social appropriateness of different robot actions while also sharing their learning with others. Furthermore, splitting the training data by different contexts such that each client incrementally learns across contexts, we present a novel Federated Continual Learning (FCL) benchmark that adapts FL-based methods to use state-of-the-art Continual Learning (CL) methods to continually learn socially appropriate agent behaviours under different contextual settings. Federated Averaging (FedAvg) of weights emerges as a robust FL strategy while rehearsal-based FCL enables incrementally learning the social appropriateness of robot actions, across contextual splits.
ROMar 16, 2024
Feature Aggregation with Latent Generative Replay for Federated Continual Learning of Socially Appropriate Robot BehavioursNikhil Churamani, Saksham Checker, Fethiye Irmak Dogan et al.
It is critical for robots to explore Federated Learning (FL) settings where several robots, deployed in parallel, can learn independently while also sharing their learning with each other. This collaborative learning in real-world environments requires social robots to adapt dynamically to changing and unpredictable situations and varying task settings. Our work contributes to addressing these challenges by exploring a simulated living room environment where robots need to learn the social appropriateness of their actions. First, we propose Federated Root (FedRoot) averaging, a novel weight aggregation strategy which disentangles feature learning across clients from individual task-based learning. Second, to adapt to challenging environments, we extend FedRoot to Federated Latent Generative Replay (FedLGR), a novel Federated Continual Learning (FCL) strategy that uses FedRoot-based weight aggregation and embeds each client with a generator model for pseudo-rehearsal of learnt feature embeddings to mitigate forgetting in a resource-efficient manner. Our results show that FedRoot-based methods offer competitive performance while also resulting in a sizeable reduction in resource consumption (up to 86% for CPU usage and up to 72% for GPU usage). Additionally, our results demonstrate that FedRoot-based FCL methods outperform other methods while also offering an efficient solution (up to 84% CPU and 92% GPU usage reduction), with FedLGR providing the best results across evaluations.
LGFeb 17, 2025
Continual Learning Should Move Beyond Incremental ClassificationRupert Mitchell, Antonio Alliegro, Raffaello Camoriano et al.
Continual learning (CL) is the sub-field of machine learning concerned with accumulating knowledge in dynamic environments. So far, CL research has mainly focused on incremental classification tasks, where models learn to classify new categories while retaining knowledge of previously learned ones. Here, we argue that maintaining such a focus limits both theoretical development and practical applicability of CL methods. Through a detailed analysis of concrete examples - including multi-target classification, robotics with constrained output spaces, learning in continuous task domains, and higher-level concept memorization - we demonstrate how current CL approaches often fail when applied beyond standard classification. We identify three fundamental challenges: (C1) the nature of continuity in learning problems, (C2) the choice of appropriate spaces and metrics for measuring similarity, and (C3) the role of learning objectives beyond classification. For each challenge, we provide specific recommendations to help move the field forward, including formalizing temporal dynamics through distribution processes, developing principled approaches for continuous task spaces, and incorporating density estimation and generative objectives. In so doing, this position paper aims to broaden the scope of CL research while strengthening its theoretical foundations, making it more applicable to real-world problems.
CVNov 20, 2025
Graph Neural Networks for Surgical Scene SegmentationYihan Li, Nikhil Churamani, Maria Robu et al.
Purpose: Accurate identification of hepatocystic anatomy is critical to preventing surgical complications during laparoscopic cholecystectomy. Deep learning models often struggle with occlusions, long-range dependencies, and capturing the fine-scale geometry of rare structures. This work addresses these challenges by introducing graph-based segmentation approaches that enhance spatial and semantic understanding in surgical scene analyses. Methods: We propose two segmentation models integrating Vision Transformer (ViT) feature encoders with Graph Neural Networks (GNNs) to explicitly model spatial relationships between anatomical regions. (1) A static k Nearest Neighbours (k-NN) graph with a Graph Convolutional Network with Initial Residual and Identity Mapping (GCNII) enables stable long-range information propagation. (2) A dynamic Differentiable Graph Generator (DGG) with a Graph Attention Network (GAT) supports adaptive topology learning. Both models are evaluated on the Endoscapes-Seg50 and CholecSeg8k benchmarks. Results: The proposed approaches achieve up to 7-8% improvement in Mean Intersection over Union (mIoU) and 6% improvement in Mean Dice (mDice) scores over state-of-the-art baselines. It produces anatomically coherent predictions, particularly on thin, rare and safety-critical structures. Conclusion: The proposed graph-based segmentation methods enhance both performance and anatomical consistency in surgical scene segmentation. By combining ViT-based global context with graph-based relational reasoning, the models improve interpretability and reliability, paving the way for safer laparoscopic and robot-assisted surgery through a precise identification of critical anatomical features.
CVMay 10, 2023
Continual Facial Expression Recognition: A BenchmarkNikhil Churamani, Tolga Dimlioglu, German I. Parisi et al.
Understanding human affective behaviour, especially in the dynamics of real-world settings, requires Facial Expression Recognition (FER) models to continuously adapt to individual differences in user expression, contextual attributions, and the environment. Current (deep) Machine Learning (ML)-based FER approaches pre-trained in isolation on benchmark datasets fail to capture the nuances of real-world interactions where data is available only incrementally, acquired by the agent or robot during interactions. New learning comes at the cost of previous knowledge, resulting in catastrophic forgetting. Lifelong or Continual Learning (CL), on the other hand, enables adaptability in agents by being sensitive to changing data distributions, integrating new information without interfering with previously learnt knowledge. Positing CL as an effective learning paradigm for FER, this work presents the Continual Facial Expression Recognition (ConFER) benchmark that evaluates popular CL techniques on FER tasks. It presents a comparative analysis of several CL-based approaches on popular FER datasets such as CK+, RAF-DB, and AffectNet and present strategies for a successful implementation of ConFER for Affective Computing (AC) research. CL techniques, under different learning settings, are shown to achieve state-of-the-art (SOTA) performance across several datasets, thus motivating a discussion on the benefits of applying CL principles towards human behaviour understanding, particularly from facial expressions, as well the challenges entailed.
CVMar 15, 2021
Towards Fair Affective Robotics: Continual Learning for Mitigating Bias in Facial Expression and Action Unit RecognitionOzgur Kara, Nikhil Churamani, Hatice Gunes
As affective robots become integral in human life, these agents must be able to fairly evaluate human affective expressions without discriminating against specific demographic groups. Identifying bias in Machine Learning (ML) systems as a critical problem, different approaches have been proposed to mitigate such biases in the models both at data and algorithmic levels. In this work, we propose Continual Learning (CL) as an effective strategy to enhance fairness in Facial Expression Recognition (FER) systems, guarding against biases arising from imbalances in data distributions. We compare different state-of-the-art bias mitigation approaches with CL-based strategies for fairness on expression recognition and Action Unit (AU) detection tasks using popular benchmarks for each; RAF-DB and BP4D. Our experiments show that CL-based methods, on average, outperform popular bias mitigation techniques, strengthening the need for further investigation into CL for the development of fairer FER algorithms.
CVMar 15, 2021
Domain-Incremental Continual Learning for Mitigating Bias in Facial Expression and Action Unit RecognitionNikhil Churamani, Ozgur Kara, Hatice Gunes
As Facial Expression Recognition (FER) systems become integrated into our daily lives, these systems need to prioritise making fair decisions instead of aiming at higher individual accuracy scores. Ranging from surveillance systems to diagnosing mental and emotional health conditions of individuals, these systems need to balance the accuracy vs fairness trade-off to make decisions that do not unjustly discriminate against specific under-represented demographic groups. Identifying bias as a critical problem in facial analysis systems, different methods have been proposed that aim to mitigate bias both at data and algorithmic levels. In this work, we propose the novel usage of Continual Learning (CL), in particular, using Domain-Incremental Learning (Domain-IL) settings, as a potent bias mitigation method to enhance the fairness of FER systems while guarding against biases arising from skewed data distributions. We compare different non-CL-based and CL-based methods for their classification accuracy and fairness scores on expression recognition and Action Unit (AU) detection tasks using two popular benchmarks, the RAF-DB and BP4D datasets, respectively. Our experimental results show that CL-based methods, on average, outperform other popular bias mitigation techniques on both accuracy and fairness metrics.
CVNov 17, 2020
Spatio-Temporal Analysis of Facial Actions using Lifecycle-Aware Capsule NetworksNikhil Churamani, Sinan Kalkan, Hatice Gunes
Most state-of-the-art approaches for Facial Action Unit (AU) detection rely upon evaluating facial expressions from static frames, encoding a snapshot of heightened facial activity. In real-world interactions, however, facial expressions are usually more subtle and evolve in a temporal manner requiring AU detection models to learn spatial as well as temporal information. In this paper, we focus on both spatial and spatio-temporal features encoding the temporal evolution of facial AU activation. For this purpose, we propose the Action Unit Lifecycle-Aware Capsule Network (AULA-Caps) that performs AU detection using both frame and sequence-level features. While at the frame-level the capsule layers of AULA-Caps learn spatial feature primitives to determine AU activations, at the sequence-level, it learns temporal dependencies between contiguous frames by focusing on relevant spatio-temporal segments in the sequence. The learnt feature capsules are routed together such that the model learns to selectively focus more on spatial or spatio-temporal information depending upon the AU lifecycle. The proposed model is evaluated on the commonly used BP4D and GFT benchmark datasets obtaining state-of-the-art results on both the datasets.
ROOct 14, 2020
Affect-Driven Modelling of Robot Personality for Collaborative Human-Robot InteractionsNikhil Churamani, Pablo Barros, Hatice Gunes et al.
Collaborative interactions require social robots to adapt to the dynamics of human affective behaviour. Yet, current approaches for affective behaviour generation in robots focus on instantaneous perception to generate a one-to-one mapping between observed human expressions and static robot actions. In this paper, we propose a novel framework for personality-driven behaviour generation in social robots. The framework consists of (i) a hybrid neural model for evaluating facial expressions and speech, forming intrinsic affective representations in the robot, (ii) an Affective Core, that employs self-organising neural models to embed robot personality traits like patience and emotional actuation, and (iii) a Reinforcement Learning model that uses the robot's affective appraisal to learn interaction behaviour. For evaluation, we conduct a user study (n = 31) where the NICO robot acts as a proposer in the Ultimatum Game. The effect of robot personality on its negotiation strategy is witnessed by participants, who rank a patient robot with high emotional actuation higher on persistence, while an inert and impatient robot higher on its generosity and altruistic behaviour.
CVSep 15, 2020
The FaceChannel: A Fast & Furious Deep Neural Network for Facial Expression RecognitionPablo Barros, Nikhil Churamani, Alessandra Sciutti
Current state-of-the-art models for automatic Facial Expression Recognition (FER) are based on very deep neural networks that are effective but rather expensive to train. Given the dynamic conditions of FER, this characteristic hinders such models of been used as a general affect recognition. In this paper, we address this problem by formalizing the FaceChannel, a light-weight neural network that has much fewer parameters than common deep neural networks. We introduce an inhibitory layer that helps to shape the learning of facial features in the last layer of the network and thus improving performance while reducing the number of trainable parameters. To evaluate our model, we perform a series of experiments on different benchmark datasets and demonstrate how the FaceChannel achieves a comparable, if not better, performance to the current state-of-the-art in FER. Our experiments include cross-dataset analysis, to estimate how our model behaves on different affective recognition conditions. We conclude our paper with an analysis of how FaceChannel learns and adapt the learned facial features towards the different datasets.
CVSep 14, 2020
CVPR 2020 Continual Learning in Computer Vision Competition: Approaches, Results, Current Challenges and Future DirectionsVincenzo Lomonaco, Lorenzo Pellegrini, Pau Rodriguez et al.
In the last few years, we have witnessed a renewed and fast-growing interest in continual learning with deep neural networks with the shared objective of making current AI systems more adaptive, efficient and autonomous. However, despite the significant and undoubted progress of the field in addressing the issue of catastrophic forgetting, benchmarking different continual learning approaches is a difficult task by itself. In fact, given the proliferation of different settings, training and evaluation protocols, metrics and nomenclature, it is often tricky to properly characterize a continual learning algorithm, relate it to other solutions and gauge its real-world applicability. The first Continual Learning in Computer Vision challenge held at CVPR in 2020 has been one of the first opportunities to evaluate different continual learning algorithms on a common hardware with a large set of shared evaluation metrics and 3 different settings based on the realistic CORe50 video benchmark. In this paper, we report the main results of the competition, which counted more than 79 teams registered, 11 finalists and 2300$ in prizes. We also summarize the winning approaches, current challenges and future research directions.
CVJun 10, 2020
Continual Learning for Affective ComputingNikhil Churamani
Real-world application requires affect perception models to be sensitive to individual differences in expression. As each user is different and expresses differently, these models need to personalise towards each individual to adequately capture their expressions and thus, model their affective state. Despite high performance on benchmarks, current approaches fall short in such adaptation. In this work, we propose the use of Continual Learning (CL) for affective computing as a paradigm for developing personalised affect perception.
HCJun 9, 2020
Creating a Robot Coach for Mindfulness and Wellbeing: A Longitudinal StudyIndu P. Bodala, Nikhil Churamani, Hatice Gunes
Social robots are starting to become incorporated into daily lives by assisting in the promotion of physical and mental wellbeing. This paper investigates the use of social robots for delivering mindfulness sessions. We created a teleoperated robotic platform that enables an experienced human coach to conduct the sessions in a virtual manner by replicating upper body and head pose in real time. The coach is also able to view the world from the robot's perspective and make a conversation with participants by talking and listening through the robot. We studied how participants interacted with a teleoperated robot mindfulness coach over a period of 5 weeks and compared with the interactions another group of participants had with a human coach. The mindfulness sessions delivered by both types of coaching invoked positive responses from the participants for all the sessions. We found that the participants rated the interactions with human coach consistently high in all aspects. However, there is a longitudinal change in the ratings for the interaction with the teleoperated robot for the aspects of motion and conversation. We also found that the participants' personality traits -- conscientiousness and neuroticism influenced the perceptions of the robot coach.
CVApr 17, 2020
The FaceChannel: A Light-weight Deep Neural Network for Facial Expression RecognitionPablo Barros, Nikhil Churamani, Alessandra Sciutti
Current state-of-the-art models for automatic FER are based on very deep neural networks that are difficult to train. This makes it challenging to adapt these models to changing conditions, a requirement from FER models given the subjective nature of affect perception and understanding. In this paper, we address this problem by formalizing the FaceChannel, a light-weight neural network that has much fewer parameters than common deep neural networks. We perform a series of experiments on different benchmark datasets to demonstrate how the FaceChannel achieves a comparable, if not better, performance, as compared to the current state-of-the-art in FER.
ROMar 30, 2020
iCub: Learning Emotion Expressions using Human RewardNikhil Churamani, Francisco Cruz, Sascha Griffiths et al.
The purpose of the present study is to learn emotion expression representations for artificial agents using reward shaping mechanisms. The approach takes inspiration from the TAMER framework for training a Multilayer Perceptron (MLP) to learn to express different emotions on the iCub robot in a human-robot interaction scenario. The robot uses a combination of a Convolutional Neural Network (CNN) and a Self-Organising Map (SOM) to recognise an emotion and then learns to express the same using the MLP. The objective is to teach a robot to respond adequately to the user's perception of emotions and learn how to express different emotions.
HCAug 30, 2019
The OMG-Empathy Dataset: Evaluating the Impact of Affective Behavior in StorytellingPablo Barros, Nikhil Churamani, Angelica Lim et al.
Processing human affective behavior is important for developing intelligent agents that interact with humans in complex interaction scenarios. A large number of current approaches that address this problem focus on classifying emotion expressions by grouping them into known categories. Such strategies neglect, among other aspects, the impact of the affective responses from an individual on their interaction partner thus ignoring how people empathize towards each other. This is also reflected in the datasets used to train models for affective processing tasks. Most of the recent datasets, in particular, the ones which capture natural interactions ("in-the-wild" datasets), are designed, collected, and annotated based on the recognition of displayed affective reactions, ignoring how these displayed or expressed emotions are perceived. In this paper, we propose a novel dataset composed of dyadic interactions designed, collected and annotated with a focus on measuring the affective impact that eight different stories have on the listener. Each video of the dataset contains around 5 minutes of interaction where a speaker tells a story to a listener. After each interaction, the listener annotated, using a valence scale, how the story impacted their affective state, reflecting how they empathized with the speaker as well as the story. We also propose different evaluation protocols and a baseline that encourages participation in the advancement of the field of artificial empathy and emotion contagion.
HCJul 12, 2018
An Affective Robot Companion for Assisting the Elderly in a Cognitive Game ScenarioNikhil Churamani, Alexander Sutherland, Pablo Barros
Being able to recognize emotions in human users is considered a highly desirable trait in Human-Robot Interaction (HRI) scenarios. However, most contemporary approaches rarely attempt to apply recognized emotional features in an active manner to modulate robot decision-making and dialogue for the benefit of the user. In this position paper, we propose a method of incorporating recognized emotions into a Reinforcement Learning (RL) based dialogue management module that adapts its dialogue responses in order to attempt to make cognitive training tasks, like the 2048 Puzzle Game, more enjoyable for the users.
HCMar 14, 2018
The OMG-Emotion Behavior DatasetPablo Barros, Nikhil Churamani, Egor Lakomkin et al.
This paper is the basis paper for the accepted IJCNN challenge One-Minute Gradual-Emotion Recognition (OMG-Emotion) by which we hope to foster long-emotion classification using neural models for the benefit of the IJCNN community. The proposed corpus has as the novelty the data collection and annotation strategy based on emotion expressions which evolve over time into a specific context. Different from other corpora, we propose a novel multimodal corpus for emotion expression recognition, which uses gradual annotations with a focus on contextual emotion expressions. Our dataset was collected from Youtube videos using a specific search strategy based on restricted keywords and filtering which guaranteed that the data follow a gradual emotion expression transition, i.e. emotion expressions evolve over time in a natural and continuous fashion. We also provide an experimental protocol and a series of unimodal baseline experiments which can be used to evaluate deep and recurrent neural models in a fair and standard manner.