24.4LGMar 18
QuantFL: Sustainable Federated Learning for Edge IoT via Pre-Trained Model QuantisationCharuka Herath, Yogachandran Rahulamathavan, Varuna De Silva et al.
Federated Learning (FL) enables privacy-preserving intelligence on Internet of Things (IoT) devices but incurs a significant carbon footprint due to the high energy cost of frequent uplink transmission. While pre-trained models are increasingly available on edge devices, their potential to reduce the energy overhead of fine-tuning remains underexplored. In this work, we propose QuantFL, a sustainable FL framework that leverages pre-trained initialisation to enable aggressive, computationally lightweight quantisation. We demonstrate that pre-training naturally concentrates update statistics, allowing us to use memory-efficient bucket quantisation without the energy-intensive overhead of complex error-feedback mechanisms. On MNIST and CIFAR-100, QuantFL reduces total communication by 40\% ($\simeq40\%$ total-bit reduction with full-precision downlink; $\geq80\%$ on uplink or when downlink is quantised) while matching or exceeding uncompressed baselines under strict bandwidth budgets; BU attains 89.00\% (MNIST) and 66.89\% (CIFAR-100) test accuracy with orders of magnitude fewer bits. We also account for uplink and downlink costs and provide ablations on quantisation levels and initialisation. QuantFL delivers a practical, "green" recipe for scalable training on battery-constrained IoT networks.
LGMay 30, 2022
Residual Q-Networks for Value Function Factorizing in Multi-Agent Reinforcement LearningRafael Pina, Varuna De Silva, Joosep Hook et al.
Multi-Agent Reinforcement Learning (MARL) is useful in many problems that require the cooperation and coordination of multiple agents. Learning optimal policies using reinforcement learning in a multi-agent setting can be very difficult as the number of agents increases. Recent solutions such as Value Decomposition Networks (VDN), QMIX, QTRAN and QPLEX adhere to the centralized training and decentralized execution scheme and perform factorization of the joint action-value functions. However, these methods still suffer from increased environmental complexity, and at times fail to converge in a stable manner. We propose a novel concept of Residual Q-Networks (RQNs) for MARL, which learns to transform the individual Q-value trajectories in a way that preserves the Individual-Global-Max criteria (IGM), but is more robust in factorizing action-value functions. The RQN acts as an auxiliary network that accelerates convergence and will become obsolete as the agents reach the training objectives. The performance of the proposed method is compared against several state-of-the-art techniques such as QPLEX, QMIX, QTRAN and VDN, in a range of multi-agent cooperative tasks. The results illustrate that the proposed method, in general, converges faster, with increased stability and shows robust performance in a wider family of environments. The improvements in results are more prominent in environments with severe punishments for non-cooperative behaviours and especially in the absence of complete state information during training time.
CVJul 6, 2022
Text to Image Synthesis using Stacked Conditional Variational Autoencoders and Conditional Generative Adversarial NetworksHaileleol Tibebu, Aadil Malik, Varuna De Silva
Synthesizing a realistic image from textual description is a major challenge in computer vision. Current text to image synthesis approaches falls short of producing a highresolution image that represent a text descriptor. Most existing studies rely either on Generative Adversarial Networks (GANs) or Variational Auto Encoders (VAEs). GANs has the capability to produce sharper images but lacks the diversity of outputs, whereas VAEs are good at producing a diverse range of outputs, but the images generated are often blurred. Taking into account the relative advantages of both GANs and VAEs, we proposed a new stacked Conditional VAE (CVAE) and Conditional GAN (CGAN) network architecture for synthesizing images conditioned on a text description. This study uses Conditional VAEs as an initial generator to produce a high-level sketch of the text descriptor. This high-level sketch output from first stage and a text descriptor is used as an input to the conditional GAN network. The second stage GAN produces a 256x256 high resolution image. The proposed architecture benefits from a conditioning augmentation and a residual block on the Conditional GAN network to achieve the results. Multiple experiments were conducted using CUB and Oxford-102 dataset and the result of the proposed approach is compared against state-ofthe-art techniques such as StackGAN. The experiments illustrate that the proposed method generates a high-resolution image conditioned on text descriptions and yield competitive results based on Inception and Frechet Inception Score using both datasets
AIJun 20, 2023
Discovering Causality for Efficient Cooperation in Multi-Agent EnvironmentsRafael Pina, Varuna De Silva, Corentin Artaud
In cooperative Multi-Agent Reinforcement Learning (MARL) agents are required to learn behaviours as a team to achieve a common goal. However, while learning a task, some agents may end up learning sub-optimal policies, not contributing to the objective of the team. Such agents are called lazy agents due to their non-cooperative behaviours that may arise from failing to understand whether they caused the rewards. As a consequence, we observe that the emergence of cooperative behaviours is not necessarily a byproduct of being able to solve a task as a team. In this paper, we investigate the applications of causality in MARL and how it can be applied in MARL to penalise these lazy agents. We observe that causality estimations can be used to improve the credit assignment to the agents and show how it can be leveraged to improve independent learning in MARL. Furthermore, we investigate how Amortized Causal Discovery can be used to automate causality detection within MARL environments. The results demonstrate that causality relations between individual observations and the team reward can be used to detect and punish lazy agents, making them develop more intelligent behaviours. This results in improvements not only in the overall performances of the team but also in their individual capabilities. In addition, results show that Amortized Causal Discovery can be used efficiently to find causal relations in MARL.
LGMar 25, 2023
Embedding Contextual Information through Reward Shaping in Multi-Agent Learning: A Case Study from Google FootballChaoyi Gu, Varuna De Silva, Corentin Artaud et al.
Artificial Intelligence has been used to help human complete difficult tasks in complicated environments by providing optimized strategies for decision-making or replacing the manual labour. In environments including multiple agents, such as football, the most common methods to train agents are Imitation Learning and Multi-Agent Reinforcement Learning (MARL). However, the agents trained by Imitation Learning cannot outperform the expert demonstrator, which makes humans hardly get new insights from the learnt policy. Besides, MARL is prone to the credit assignment problem. In environments with sparse reward signal, this method can be inefficient. The objective of our research is to create a novel reward shaping method by embedding contextual information in reward function to solve the aforementioned challenges. We demonstrate this in the Google Research Football (GRF) environment. We quantify the contextual information extracted from game state observation and use this quantification together with original sparse reward to create the shaped reward. The experiment results in the GRF environment prove that our reward shaping method is a useful addition to state-of-the-art MARL algorithms for training agents in environments with sparse reward signal.
LGNov 5, 2023
Staged Reinforcement Learning for Complex Tasks through Decomposed EnvironmentsRafael Pina, Corentin Artaud, Xiaolan Liu et al.
Reinforcement Learning (RL) is an area of growing interest in the field of artificial intelligence due to its many notable applications in diverse fields. Particularly within the context of intelligent vehicle control, RL has made impressive progress. However, currently it is still in simulated controlled environments where RL can achieve its full super-human potential. Although how to apply simulation experience in real scenarios has been studied, how to approximate simulated problems to the real dynamic problems is still a challenge. In this paper, we discuss two methods that approximate RL problems to real problems. In the context of traffic junction simulations, we demonstrate that, if we can decompose a complex task into multiple sub-tasks, solving these tasks first can be advantageous to help minimising possible occurrences of catastrophic events in the complex task. From a multi-agent perspective, we introduce a training structuring mechanism that exploits the use of experience learned under the popular paradigm called Centralised Training Decentralised Execution (CTDE). This experience can then be leveraged in fully decentralised settings that are conceptually closer to real settings, where agents often do not have access to a central oracle and must be treated as isolated independent units. The results show that the proposed approaches improve agents performance in complex tasks related to traffic junctions, minimising potential safety-critical problems that might happen in these scenarios. Although still in simulation, the investigated situations are conceptually closer to real scenarios and thus, with these results, we intend to motivate further research in the subject.
LGNov 5, 2023
Learning Independently from Causality in Multi-Agent EnvironmentsRafael Pina, Varuna De Silva, Corentin Artaud
Multi-Agent Reinforcement Learning (MARL) comprises an area of growing interest in the field of machine learning. Despite notable advances, there are still problems that require investigation. The lazy agent pathology is a famous problem in MARL that denotes the event when some of the agents in a MARL team do not contribute to the common goal, letting the teammates do all the work. In this work, we aim to investigate this problem from a causality-based perspective. We intend to create the bridge between the fields of MARL and causality and argue about the usefulness of this link. We study a fully decentralised MARL setup where agents need to learn cooperation strategies and show that there is a causal relation between individual observations and the team reward. The experiments carried show how this relation can be used to improve independent agents in MARL, resulting not only on better performances as a team but also on the rise of more intelligent behaviours on individual agents.
AIMar 24, 2023
Causality Detection for Efficient Multi-Agent Reinforcement LearningRafael Pina, Varuna De Silva, Corentin Artaud
When learning a task as a team, some agents in Multi-Agent Reinforcement Learning (MARL) may fail to understand their true impact in the performance of the team. Such agents end up learning sub-optimal policies, demonstrating undesired lazy behaviours. To investigate this problem, we start by formalising the use of temporal causality applied to MARL problems. We then show how causality can be used to penalise such lazy agents and improve their behaviours. By understanding how their local observations are causally related to the team reward, each agent in the team can adjust their individual credit based on whether they helped to cause the reward or not. We show empirically that using causality estimations in MARL improves not only the holistic performance of the team, but also the individual capabilities of each agent. We observe that the improvements are consistent in a set of different environments.
MLMar 23, 2023
Deep Generative Multi-Agent Imitation Model as a Computational Benchmark for Evaluating Human Performance in Complex Interactive Tasks: A Case Study in FootballChaoyi Gu, Varuna De Silva
Evaluating the performance of human is a common need across many applications, such as in engineering and sports. When evaluating human performance in completing complex and interactive tasks, the most common way is to use a metric having been proved efficient for that context, or to use subjective measurement techniques. However, this can be an error prone and unreliable process since static metrics cannot capture all the complex contexts associated with such tasks and biases exist in subjective measurement. The objective of our research is to create data-driven AI agents as computational benchmarks to evaluate human performance in solving difficult tasks involving multiple humans and contextual factors. We demonstrate this within the context of football performance analysis. We train a generative model based on Conditional Variational Recurrent Neural Network (VRNN) Model on a large player and ball tracking dataset. The trained model is used to imitate the interactions between two teams and predict the performance from each team. Then the trained Conditional VRNN Model is used as a benchmark to evaluate team performance. The experimental results on Premier League football dataset demonstrates the usefulness of our method to existing state-of-the-art static metric used in football analytics.
60.9HCMay 16
ATRACT: A Trustworthy Robotic Autonomous system to support Casualty TriageTasweer Ahmad, Rafael Pina, Sandip Pradhan et al.
At a time when drones are increasingly associated with hostile operations, we re-purpose them for humanitarian and life-saving applications. However, adapting search and rescue drones for battlefield triage remains extremely challenging; the technology must perform reliably to support frontline medics who are forced to operate under extreme uncertainty, restricted access, and significant personal risk. Due to growing vulnerabilities of casualty evacuation in conflicting zones, this paper presents ATRACT (A Trustworthy Robotic Autonomous system to support Casualty Triage), a novel human-in-the-loop decision support system to enable early battlefield triage during the critical post-trauma period. ATRACT integrates drone-captured video with wearable sensor input for multi-modal learning to support casualty-state assessment, thereby addressing the limitations of existing systems. Drone video captures fine-grained behavioural cues, such as pose, posture, while body-worn sensors provide complementary physiological signals, including heart rate, breathing rate, and movement. By combining two modalities, ATRACT provides evidence to support the early judgement of medics when direct access to the casualty is delayed, risky, or restricted. To mitigate the data realism gap pertaining to injured actions, a conditional variational autoencoder is devised for data augmentation. Experimental results on our drone captured dataset show that proposed pipeline achieves 85.7% accuracy for action classification; while our lightweight CNN visual encoder remains competitive with stronger pre-trained video backbones. Overall, the results support ATRACT as a practically meaningful step towards remote triage in contested environments, where multi-modal sensing, human oversight and trustworthy decision support can improve casualty prioritisation, and lessen the exposure of frontline medics.
LGJan 29, 2024
Player Pressure Map -- A Novel Representation of Pressure in Soccer for Evaluating Player Performance in Different Game ContextsChaoyi Gu, Jiaming Na, Yisheng Pei et al.
In soccer, contextual player performance metrics are invaluable to coaches. For example, the ability to perform under pressure during matches distinguishes the elite from the average. Appropriate pressure metric enables teams to assess players' performance accurately under pressure and design targeted training scenarios to address their weaknesses. The primary objective of this paper is to leverage both tracking and event data and game footage to capture the pressure experienced by the possession team in a soccer game scene. We propose a player pressure map to represent a given game scene, which lowers the dimension of raw data and still contains rich contextual information. Not only does it serve as an effective tool for visualizing and evaluating the pressure on the team and each individual, but it can also be utilized as a backbone for accessing players' performance. Overall, our model provides coaches and analysts with a deeper understanding of players' performance under pressure so that they make data-oriented tactical decisions.
CRSep 10, 2025
DSFL: A Dual-Server Byzantine-Resilient Federated Learning Framework via Group-Based Secure AggregationCharuka Herath, Yogachandran Rahulamathavan, Varuna De Silva et al.
Federated Learning (FL) enables decentralized model training without sharing raw data, offering strong privacy guarantees. However, existing FL protocols struggle to defend against Byzantine participants, maintain model utility under non-independent and identically distributed (non-IID) data, and remain lightweight for edge devices. Prior work either assumes trusted hardware, uses expensive cryptographic tools, or fails to address privacy and robustness simultaneously. We propose DSFL, a Dual-Server Byzantine-Resilient Federated Learning framework that addresses these limitations using a group-based secure aggregation approach. Unlike LSFL, which assumes non-colluding semi-honest servers, DSFL removes this dependency by revealing a key vulnerability: privacy leakage through client-server collusion. DSFL introduces three key innovations: (1) a dual-server secure aggregation protocol that protects updates without encryption or key exchange, (2) a group-wise credit-based filtering mechanism to isolate Byzantine clients based on deviation scores, and (3) a dynamic reward-penalty system for enforcing fair participation. DSFL is evaluated on MNIST, CIFAR-10, and CIFAR-100 under up to 30 percent Byzantine participants in both IID and non-IID settings. It consistently outperforms existing baselines, including LSFL, homomorphic encryption methods, and differential privacy approaches. For example, DSFL achieves 97.15 percent accuracy on CIFAR-10 and 68.60 percent on CIFAR-100, while FedAvg drops to 9.39 percent under similar threats. DSFL remains lightweight, requiring only 55.9 ms runtime and 1088 KB communication per round.
CLJul 12, 2025
PLEX: Perturbation-free Local Explanations for LLM-Based Text ClassificationYogachandran Rahulamathavan, Misbah Farooq, Varuna De Silva
Large Language Models (LLMs) excel in text classification, but their complexity hinders interpretability, making it difficult to understand the reasoning behind their predictions. Explainable AI (XAI) methods like LIME and SHAP offer local explanations by identifying influential words, but they rely on computationally expensive perturbations. These methods typically generate thousands of perturbed sentences and perform inferences on each, incurring a substantial computational burden, especially with LLMs. To address this, we propose \underline{P}erturbation-free \underline{L}ocal \underline{Ex}planation (PLEX), a novel method that leverages the contextual embeddings extracted from the LLM and a ``Siamese network" style neural network trained to align with feature importance scores. This one-off training eliminates the need for subsequent perturbations, enabling efficient explanations for any new sentence. We demonstrate PLEX's effectiveness on four different classification tasks (sentiment, fake news, fake COVID-19 news and depression), showing more than 92\% agreement with LIME and SHAP. Our evaluation using a ``stress test" reveals that PLEX accurately identifies influential words, leading to a similar decline in classification accuracy as observed with LIME and SHAP when these words are removed. Notably, in some cases, PLEX demonstrates superior performance in capturing the impact of key features. PLEX dramatically accelerates explanation, reducing time and computational overhead by two and four orders of magnitude, respectively. This work offers a promising solution for explainable LLM-based text classification.
LGJan 26, 2024
Fully Independent Communication in Multi-Agent Reinforcement LearningRafael Pina, Varuna De Silva, Corentin Artaud et al.
Multi-Agent Reinforcement Learning (MARL) comprises a broad area of research within the field of multi-agent systems. Several recent works have focused specifically on the study of communication approaches in MARL. While multiple communication methods have been proposed, these might still be too complex and not easily transferable to more practical contexts. One of the reasons for that is due to the use of the famous parameter sharing trick. In this paper, we investigate how independent learners in MARL that do not share parameters can communicate. We demonstrate that this setting might incur into some problems, to which we propose a new learning scheme as a solution. Our results show that, despite the challenges, independent agents can still learn communication strategies following our method. Additionally, we use this method to investigate how communication in MARL is affected by different network capacities, both for sharing and not sharing parameters. We observe that communication may not always be needed and that the chosen agent network sizes need to be considered when used together with communication in order to achieve efficient learning.
LGSep 4, 2023
Passing Heatmap Prediction Based on Transformer Model and Tracking DataYisheng Pei, Varuna De Silva, Mike Caine
Although the data-driven analysis of football players' performance has been developed for years, most research only focuses on the on-ball event including shots and passes, while the off-ball movement remains a little-explored area in this domain. Players' contributions to the whole match are evaluated unfairly, those who have more chances to score goals earn more credit than others, while the indirect and unnoticeable impact that comes from continuous movement has been ignored. This research presents a novel deep-learning network architecture which is capable to predict the potential end location of passes and how players' movement before the pass affects the final outcome. Once analysed more than 28,000 pass events, a robust prediction can be achieved with more than 0.7 Top-1 accuracy. And based on the prediction, a better understanding of the pitch control and pass option could be reached to measure players' off-ball movement contribution to defensive performance. Moreover, this model could provide football analysts a better tool and metric to understand how players' movement over time contributes to the game strategy and final victory.
CVAug 25, 2020
A Critical Analysis of Patch Similarity Based Image Denoising AlgorithmsVaruna De Silva
Image denoising is a classical signal processing problem that has received significant interest within the image processing community during the past two decades. Most of the algorithms for image denoising has focused on the paradigm of non-local similarity, where image blocks in the neighborhood that are similar, are collected to build a basis for reconstruction. Through rigorous experimentation, this paper reviews multiple aspects of image denoising algorithm development based on non-local similarity. Firstly, the concept of non-local similarity as a foundational quality that exists in natural images has not received adequate attention. Secondly, the image denoising algorithms that are developed are a combination of multiple building blocks, making comparison among them a tedious task. Finally, most of the work surrounding image denoising presents performance results based on Peak-Signal-to-Noise Ratio (PSNR) between a denoised image and a reference image (which is perturbed with Additive White Gaussian Noise). This paper starts with a statistical analysis on non-local similarity and its effectiveness under various noise levels, followed by a theoretical comparison of different state-of-the-art image denoising algorithms. Finally, we argue for a methodological overhaul to incorporate no-reference image quality measures and unprocessed images (raw) during performance evaluation of image denoising algorithms.
CVJan 9, 2019
Adaptive Feature Processing for Robust Human Activity Recognition on a Novel Multi-Modal DatasetMirco Moencks, Varuna De Silva, Jamie Roche et al.
Human Activity Recognition (HAR) is a key building block of many emerging applications such as intelligent mobility, sports analytics, ambient-assisted living and human-robot interaction. With robust HAR, systems will become more human-aware, leading towards much safer and empathetic autonomous systems. While human pose detection has made significant progress with the dawn of deep convolutional neural networks (CNNs), the state-of-the-art research has almost exclusively focused on a single sensing modality, especially video. However, in safety critical applications it is imperative to utilize multiple sensor modalities for robust operation. To exploit the benefits of state-of-the-art machine learning techniques for HAR, it is extremely important to have multimodal datasets. In this paper, we present a novel, multi-modal sensor dataset that encompasses nine indoor activities, performed by 16 participants, and captured by four types of sensors that are commonly used in indoor applications and autonomous vehicles. This multimodal dataset is the first of its kind to be made openly available and can be exploited for many applications that require HAR, including sports analytics, healthcare assistance and indoor intelligent mobility. We propose a novel data preprocessing algorithm to enable adaptive feature extraction from the dataset to be utilized by different machine learning algorithms. Through rigorous experimental evaluations, this paper reviews the performance of machine learning approaches to posture recognition, and analyses the robustness of the algorithms. When performing HAR with the RGB-Depth data from our new dataset, machine learning algorithms such as a deep neural network reached a mean accuracy of up to 96.8% for classification across all stationary and dynamic activities
CVOct 17, 2017
Robust Fusion of LiDAR and Wide-Angle Camera Data for Autonomous Mobile RobotsVaruna De Silva, Jamie Roche, Ahmet Kondoz
Autonomous robots that assist humans in day to day living tasks are becoming increasingly popular. Autonomous mobile robots operate by sensing and perceiving their surrounding environment to make accurate driving decisions. A combination of several different sensors such as LiDAR, radar, ultrasound sensors and cameras are utilized to sense the surrounding environment of autonomous vehicles. These heterogeneous sensors simultaneously capture various physical attributes of the environment. Such multimodality and redundancy of sensing need to be positively utilized for reliable and consistent perception of the environment through sensor data fusion. However, these multimodal sensor data streams are different from each other in many ways, such as temporal and spatial resolution, data format, and geometric alignment. For the subsequent perception algorithms to utilize the diversity offered by multimodal sensing, the data streams need to be spatially, geometrically and temporally aligned with each other. In this paper, we address the problem of fusing the outputs of a Light Detection and Ranging (LiDAR) scanner and a wide-angle monocular image sensor for free space detection. The outputs of LiDAR scanner and the image sensor are of different spatial resolutions and need to be aligned with each other. A geometrical model is used to spatially align the two sensor outputs, followed by a Gaussian Process (GP) regression-based resolution matching algorithm to interpolate the missing data with quantifiable uncertainty. The results indicate that the proposed sensor data fusion framework significantly aids the subsequent perception steps, as illustrated by the performance improvement of a uncertainty aware free space detection algorithm
MASep 14, 2017
An Agent-based Modelling Framework for Driving Policy Learning in Connected and Autonomous VehiclesVaruna De Silva, Xiongzhao Wang, Deniz Aladagli et al.
Due to the complexity of the natural world, a programmer cannot foresee all possible situations, a connected and autonomous vehicle (CAV) will face during its operation, and hence, CAVs will need to learn to make decisions autonomously. Due to the sensing of its surroundings and information exchanged with other vehicles and road infrastructure, a CAV will have access to large amounts of useful data. While different control algorithms have been proposed for CAVs, the benefits brought about by connectedness of autonomous vehicles to other vehicles and to the infrastructure, and its implications on policy learning has not been investigated in literature. This paper investigates a data driven driving policy learning framework through an agent-based modelling approaches. The contributions of the paper are two-fold. A dynamic programming framework is proposed for in-vehicle policy learning with and without connectivity to neighboring vehicles. The simulation results indicate that while a CAV can learn to make autonomous decisions, vehicle-to-vehicle (V2V) communication of information improves this capability. Furthermore, to overcome the limitations of sensing in a CAV, the paper proposes a novel concept for infrastructure-led policy learning and communication with autonomous vehicles. In infrastructure-led policy learning, road-side infrastructure senses and captures successful vehicle maneuvers and learns an optimal policy from those temporal sequences, and when a vehicle approaches the road-side unit, the policy is communicated to the CAV. Deep-imitation learning methodology is proposed to develop such an infrastructure-led policy learning framework.