Ryszard Kowalczyk

AI
h-index: 20
19 papers
106 citations
Novelty: 40%
AI Score: 53

19 Papers

LG · Jun 26, 2023 · Code
Few-Shot Continual Learning via Flat-to-Wide Approaches

Muhammad Anwar Ma'sum, Mahardhika Pratama, Edwin Lughofer et al.

Existing continual learning approaches call for a lot of samples in their training processes. Such approaches are impractical for many real-world problems with limited samples because of the overfitting problem. This paper proposes a few-shot continual learning approach, termed FLat-tO-WidE AppRoach (FLOWER), in which a flat-to-wide learning process that finds flat-wide minima addresses the catastrophic forgetting problem. The issue of data scarcity is overcome with a data augmentation approach that uses a ball generator concept to restrict the sampling space to the smallest enclosing ball. Our numerical studies demonstrate that FLOWER achieves significantly improved performance over prior art, notably on small base tasks. For further study, the source code of FLOWER, competitor algorithms, and experimental logs are shared publicly at https://github.com/anwarmaxsum/FLOWER.

LG · Jul 30, 2024 · Code
PIP: Prototypes-Injected Prompt for Federated Class Incremental Learning

Muhammad Anwar Ma'sum, Mahardhika Pratama, Savitha Ramasamy et al.

Federated Class Incremental Learning (FCIL) is a new direction in continual learning (CL) for addressing catastrophic forgetting and non-IID data distribution simultaneously. Existing FCIL methods call for high communication costs and exemplars from previous classes. We propose a novel rehearsal-free method for FCIL named prototypes-injected prompt (PIP) that involves 3 main ideas: a) prototype injection on prompt learning, b) prototype augmentation, and c) weighted Gaussian aggregation on the server side. Our experimental results show that the proposed method outperforms the current state-of-the-art methods (SOTAs) with a significant improvement (up to 33%) on the CIFAR100, MiniImageNet and TinyImageNet datasets. Our extensive analysis demonstrates the robustness of PIP across different task sizes, and the advantage of requiring fewer participating local clients and fewer global rounds. For further study, the source code of PIP, baselines, and experimental logs are shared publicly at https://github.com/anwarmaxsum/PIP.

64.6 · AI · May 29
HADT: A Heterogeneous Multi-Agent Differential Transformer for Autonomous Earth Observation Satellite Cluster

Mohamad A. Hady, Muhammad Anwar Masum, Siyi Hu et al.

This work addresses the problem of autonomous resource management in a heterogeneous satellite cluster conducting Earth Observation (EO) missions, including optical and Synthetic Aperture Radar (SAR) satellites. In autonomous operation mode, satellites are equipped with intelligent capabilities enabling real-time decision-making based on the latest conditions, while requiring minimal interaction with ground operators. Traditional scheduling approaches typically rely on mathematical models to represent satellite mission and resource management, and the resulting problem is then solved using optimization algorithms. However, such solutions become less effective when the underlying models are unavailable, overly complex, or inaccurate due to dynamic changes and uncertainties inherent in the space mission environment. A promising alternative is to reformulate the problem as a sequential decision-making process and apply model-free reinforcement learning techniques to enable adaptive and real-time resource management. To this end, we propose a novel transformer-based architecture tailored for autonomous EO missions of heterogeneous satellite clusters, with relational observation-action tokenization and a differential attention mechanism. Our experimental results demonstrate significant performance improvements compared to the available baselines. Moreover, the proposed architecture exhibits strong adaptability and transferability with respect to varying numbers of satellite clusters.

37.7 · AI · Apr 8
KD-MARL: Resource-Aware Knowledge Distillation in Multi-Agent Reinforcement Learning

Monirul Islam Pavel, Siyi Hu, Muhammad Anwar Masum et al.

Real-world deployment of multi-agent reinforcement learning (MARL) systems is fundamentally constrained by limited compute, memory, and inference time. While expert policies achieve high performance, they rely on costly decision cycles and large-scale models that are impractical for edge devices or embedded platforms. Knowledge distillation (KD) offers a promising path toward resource-aware execution, but existing KD methods in MARL focus narrowly on action imitation, often neglecting coordination structure and assuming uniform agent capabilities. We propose resource-aware Knowledge Distillation for Multi-Agent Reinforcement Learning (KD-MARL), a two-stage framework that transfers coordinated behavior from a centralized expert to lightweight decentralized student agents. The student policies are trained without a critic, relying instead on distilled advantage signals and structured policy supervision to preserve coordination under heterogeneous and limited observations. Our approach transfers both action-level behavior and structural coordination patterns from expert policies while supporting heterogeneous student architectures, allowing each agent's model capacity to match its observation complexity, which is crucial for efficient execution under partial or limited observability and limited onboard resources. Extensive experiments on SMAC and MPE benchmarks demonstrate that KD-MARL achieves high performance retention while substantially reducing computational cost. Across standard multi-agent benchmarks, KD-MARL retains over 90% of expert performance while reducing computational cost by up to 28.6 times in FLOPs. The proposed approach achieves expert-level coordination and preserves it through structured distillation, enabling practical MARL deployment across resource-constrained onboard platforms.

NI · Sep 17, 2024
Trends, Advancements and Challenges in Intelligent Optimization in Satellite Communication

Philippe Krajsic, Viola Suess, Zehong Cao et al.

Efficient satellite communications play an enormously important role in all of our daily lives. This includes the transmission of data for communication purposes, the operation of IoT applications, and the provision of data for ground stations. Increasingly, AI-based methods are finding their way into these areas. This paper gives an overview of current research in the field of intelligent optimization of satellite communication. For this purpose, a text-mining based literature review was conducted and the identified papers were thematically clustered and analyzed. The identified clusters cover the main topics of routing, resource allocation, and load balancing. Clustering the literature into overarching topics enabled a structured analysis of the research papers, allowing the identification of the latest technologies and approaches as well as research needs for intelligent optimization of satellite communication.

LG · Jul 16, 2025 · Code
PROL : Rehearsal Free Continual Learning in Streaming Data via Prompt Online Learning

M. Anwar Ma'sum, Mahardhika Pratama, Savitha Ramasamy et al.

The data privacy constraint in online continual learning (OCL), where the data can be seen only once, complicates the catastrophic forgetting problem in streaming data. A common approach applied by the current SOTAs in OCL is to use a memory that stores exemplars or features from previous classes to be replayed in the current task. On the other hand, the prompt-based approach performs excellently in continual learning but at the cost of a growing number of trainable parameters. The first approach may not be applicable in practice due to data openness policy, while the second approach has the issue of throughput associated with the streaming data. In this study, we propose a novel prompt-based method for online continual learning that includes 4 main components: (1) a single lightweight prompt generator as general knowledge, (2) a trainable scaler-and-shifter as specific knowledge, (3) pre-trained model (PTM) generalization preserving, and (4) a hard-soft updates mechanism. Our proposed method achieves significantly higher performance than the current SOTAs on the CIFAR100, ImageNet-R, ImageNet-A, and CUB datasets. Our complexity analysis shows that our method requires a relatively small number of parameters and achieves moderate training time, inference time, and throughput. For further study, the source code of our method is available at https://github.com/anwarmaxsum/PROL.

CV · Jun 4, 2024 · Code
Unsupervised Few-Shot Continual Learning for Remote Sensing Image Scene Classification

Muhammad Anwar Ma'sum, Mahardhika Pratama, Ramasamy Savitha et al.

A continual learning (CL) model is desired for remote sensing image analysis because of varying camera parameters, spectral ranges, resolutions, etc. There are some recent initiatives to develop CL techniques in this domain, but they still depend on massive labelled samples, which does not fully fit remote sensing applications because ground truths are often obtained via field-based surveys. This paper addresses this problem by proposing an unsupervised flat-wide learning approach (UNISA) for unsupervised few-shot continual learning of remote sensing image scene classification, which does not depend on any labelled samples for its model updates. UNISA is developed from the idea of prototype scattering and positive sampling for learning representations, while the catastrophic forgetting problem is tackled with the flat-wide learning approach combined with a ball generator to address the data scarcity problem. Our numerical study with remote sensing image scene datasets and a hyperspectral dataset confirms the advantages of our solution. The source code of UNISA is shared publicly at https://github.com/anwarmaxsum/UNISA to allow convenient future studies and reproduction of our numerical results.

LG · Jan 25, 2024 · Code
Dynamic Long-Term Time-Series Forecasting via Meta Transformer Networks

Muhammad Anwar Ma'sum, MD Rasel Sarkar, Mahardhika Pratama et al.

A reliable long-term time-series forecaster is in high demand in practice but faces many challenges, such as the need for low computational and memory footprints as well as robustness against dynamic learning environments. This paper proposes Meta-Transformer Networks (MANTRA) to deal with dynamic long-term time-series forecasting tasks. MANTRA relies on the concept of fast and slow learners, where a collection of fast learners learns different aspects of data distributions while adapting quickly to changes. A slow learner tailors suitable representations to the fast learners. Fast adaptations to dynamic environments are achieved using universal representation transformer layers that produce task-adapted representations with a small number of parameters. Our experiments using four datasets with different prediction lengths demonstrate the advantage of our approach, with at least 3% improvement over the baseline algorithms in both multivariate and univariate settings. The source code of MANTRA is publicly available at https://github.com/anwarmaxsum/MANTRA.

MA · Apr 29, 2025
Multi-Agent Reinforcement Learning for Resources Allocation Optimization: A Survey

Mohamad A. Hady, Siyi Hu, Mahardhika Pratama et al.

Multi-Agent Reinforcement Learning (MARL) has become a powerful framework for numerous real-world applications, modeling distributed decision-making and learning from interactions with complex environments. Resource Allocation Optimization (RAO) benefits significantly from MARL's ability to tackle dynamic and decentralized contexts. MARL-based approaches are increasingly applied to RAO challenges across sectors that play pivotal roles in Industry 4.0 developments. This survey provides a comprehensive review of recent MARL algorithms for RAO, encompassing core concepts, classifications, and a structured taxonomy. By outlining the current research landscape and identifying primary challenges and future directions, this survey aims to support researchers and practitioners in leveraging MARL's potential to advance resource allocation solutions.

LG · May 8, 2024
Few-Shot Class Incremental Learning via Robust Transformer Approach

Naeem Paeedeh, Mahardhika Pratama, Sunu Wibirama et al.

Few-Shot Class-Incremental Learning extends the Class Incremental Learning problem to a setting where a model faces data scarcity while addressing the catastrophic forgetting (CF) problem. This remains an open problem because all recent works are built upon convolutional neural networks, which perform sub-optimally compared to transformer approaches. Our paper presents the Robust Transformer Approach (ROBUSTA), built upon the Compact Convolution Transformer. The issue of overfitting due to few samples is overcome with the notion of the stochastic classifier, where the classifier's weights are sampled from a distribution with mean and variance vectors, thus increasing the likelihood of correct classifications, and with the batch-norm layer to stabilize the training process. The issue of CF is dealt with using the idea of delta parameters, small task-specific trainable parameters, while keeping the backbone networks frozen. A non-parametric approach is developed to infer the delta parameters for the model's predictions. The prototype rectification approach is applied to avoid biased prototype calculations due to the issue of data scarcity. The advantage of ROBUSTA is demonstrated through a series of experiments on benchmark problems, where it outperforms prior art by big margins without any data augmentation protocols.
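
The stochastic classifier described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the log-variance parameterization, the cosine scoring, and the names `sample_weights` and `predict` are assumptions made only for illustration. The sketch shows the one idea the abstract states, namely that each class weight vector is drawn from a distribution defined by a mean vector and a variance vector.

```python
import math
import random


def sample_weights(mean, log_var, rng):
    """Draw one weight vector w = mean + std * eps, with eps ~ N(0, 1).

    Storing the log-variance (an assumption of this sketch) keeps the
    standard deviation positive.
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mean, log_var)]


def cosine(u, v):
    """Cosine similarity, returning 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def predict(feature, class_means, class_log_vars, rng, n_samples=1):
    """Average class scores over sampled weights; return the best class index."""
    n_classes = len(class_means)
    scores = [0.0] * n_classes
    for _ in range(n_samples):
        for c in range(n_classes):
            w = sample_weights(class_means[c], class_log_vars[c], rng)
            scores[c] += cosine(feature, w) / n_samples
    return max(range(n_classes), key=scores.__getitem__)
```

With a very small variance the sampled weights collapse onto the means and the classifier behaves like a deterministic prototype classifier; a larger variance makes each forward pass see a slightly different decision boundary.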

RO · Jul 11, 2025
Intelligent Control of Spacecraft Reaction Wheel Attitude Using Deep Reinforcement Learning

Ghaith El-Dalahmeh, Mohammad Reza Jabbarpour, Bao Quoc Vo et al.

Reliable satellite attitude control is essential for the success of space missions, particularly as satellites increasingly operate autonomously in dynamic and uncertain environments. Reaction wheels (RWs) play a pivotal role in attitude control, and maintaining control resilience during RW faults is critical to preserving mission objectives and system stability. However, traditional Proportional Derivative (PD) controllers and existing deep reinforcement learning (DRL) algorithms such as TD3, PPO, and A2C often fall short in providing the real-time adaptability and fault tolerance required for autonomous satellite operations. This study introduces a DRL-based control strategy designed to improve satellite resilience and adaptability under fault conditions. Specifically, the proposed method integrates Twin Delayed Deep Deterministic Policy Gradient (TD3) with Hindsight Experience Replay (HER) and Dimension Wise Clipping (DWC), referred to as TD3-HD, to enhance learning in sparse reward environments and maintain satellite stability during RW failures. The proposed approach is benchmarked against PD control and leading DRL algorithms. Experimental results show that TD3-HD achieves significantly lower attitude error, improved angular velocity regulation, and enhanced stability under fault conditions. These findings underscore the proposed method's potential as a powerful, fault-tolerant, onboard AI solution for autonomous satellite attitude control.

LG · May 7, 2025
Onboard Optimization and Learning: A Survey

Monirul Islam Pavel, Siyi Hu, Mahardhika Pratama et al.

Onboard learning is a transformative approach in edge AI, enabling real-time data processing, decision-making, and adaptive model training directly on resource-constrained devices without relying on centralized servers. This paradigm is crucial for applications demanding low latency, enhanced privacy, and energy efficiency. However, onboard learning faces challenges such as limited computational resources, high inference costs, and security vulnerabilities. This survey explores a comprehensive range of methodologies that address these challenges, focusing on techniques that optimize model efficiency, accelerate inference, and support collaborative learning across distributed devices. Approaches for reducing model complexity, improving inference speed, and ensuring privacy-preserving computation are examined alongside emerging strategies that enhance scalability and adaptability in dynamic environments. By bridging advancements in hardware-software co-design, model compression, and decentralized learning, this survey provides insights into the current state of onboard learning to enable robust, efficient, and secure AI deployment at the edge.

AI · Nov 16, 2025
Multi-Agent Reinforcement Learning for Heterogeneous Satellite Cluster Resources Optimization

Mohamad A. Hady, Siyi Hu, Mahardhika Pratama et al.

This work investigates resource optimization in heterogeneous satellite clusters performing autonomous Earth Observation (EO) missions using Reinforcement Learning (RL). In the proposed setting, two optical satellites and one Synthetic Aperture Radar (SAR) satellite operate cooperatively in low Earth orbit to capture ground targets and manage their limited onboard resources efficiently. Traditional optimization methods struggle to handle the real-time, uncertain, and decentralized nature of EO operations, motivating the use of RL and Multi-Agent Reinforcement Learning (MARL) for adaptive decision-making. This study systematically formulates the optimization problem from single-satellite to multi-satellite scenarios, addressing key challenges including energy and memory constraints, partial observability, and agent heterogeneity arising from diverse payload capabilities. Using a near-realistic simulation environment built on the Basilisk and BSK-RL frameworks, we evaluate the performance and stability of state-of-the-art MARL algorithms such as MAPPO, HAPPO, and HATRPO. Results show that MARL enables effective coordination across heterogeneous satellites, balancing imaging performance and resource utilization while mitigating non-stationarity and inter-agent reward coupling. The findings provide practical insights into scalable, autonomous satellite operations and contribute a foundation for future research on intelligent EO mission planning under heterogeneous and dynamic conditions.

LG · Oct 17, 2025
Continual Knowledge Consolidation LORA for Domain Incremental Learning

Naeem Paeedeh, Mahardhika Pratama, Weiping Ding et al.

Domain Incremental Learning (DIL) is a continual learning sub-branch that aims to handle never-ending arrivals of new domains without catastrophic forgetting. Despite the advent of parameter-efficient fine-tuning (PEFT) approaches, existing works create task-specific LoRAs, overlooking shared knowledge across tasks. Inaccurate selection of task-specific LoRAs during inference results in significant drops in accuracy, while existing works rely on linear or prototype-based classifiers, which have suboptimal generalization power. Our paper proposes continual knowledge consolidation low rank adaptation (CONEC-LoRA) to address the DIL problem. CONEC-LoRA is developed from consolidation between a task-shared LoRA that extracts common knowledge and a task-specific LoRA that captures domain-specific knowledge. Unlike existing approaches, CONEC-LoRA integrates the concept of a stochastic classifier whose parameters are sampled from a distribution, thus enhancing the likelihood of correct classifications. Last but not least, an auxiliary network is deployed to optimally predict the task-specific LoRAs for inference; it implements a different-depth network structure in which every layer is connected to a local classifier to take advantage of intermediate representations. This module integrates the ball-generator loss and a transformation module to address the synthetic sample bias problem. Our rigorous experiments demonstrate the advantage of CONEC-LoRA over prior art on 4 popular benchmark problems, with margins of over 5%.

AI · Jul 14, 2025
Adaptability in Multi-Agent Reinforcement Learning: A Framework and Unified Review

Siyi Hu, Mohamad A Hady, Jianglin Qiao et al.

Multi-Agent Reinforcement Learning (MARL) has shown clear effectiveness in coordinating multiple agents across simulated benchmarks and constrained scenarios. However, its deployment in real-world multi-agent systems (MAS) remains limited, primarily due to the complex and dynamic nature of such environments. These challenges arise from multiple interacting sources of variability, including fluctuating agent populations, evolving task goals, and inconsistent execution conditions. Together, these factors demand that MARL algorithms remain effective under continuously changing system configurations and operational demands. To better capture and assess this capacity for adjustment, we introduce the concept of adaptability as a unified and practically grounded lens through which to evaluate the reliability of MARL algorithms under shifting conditions, broadly referring to any changes in the environment dynamics that may occur during learning or execution. Centred on the notion of adaptability, we propose a structured framework comprising three key dimensions: learning adaptability, policy adaptability, and scenario-driven adaptability. By adopting this adaptability perspective, we aim to support more principled assessments of MARL performance beyond narrowly defined benchmarks. Ultimately, this survey contributes to the development of algorithms that are better suited for deployment in dynamic, real-world multi-agent systems.

AI · Jun 18, 2025
Multi-Agent Reinforcement Learning for Autonomous Multi-Satellite Earth Observation: A Realistic Case Study

Mohamad A. Hady, Siyi Hu, Mahardhika Pratama et al.

The exponential growth of Low Earth Orbit (LEO) satellites has revolutionised Earth Observation (EO) missions, addressing challenges in climate monitoring, disaster management, and more. However, autonomous coordination in multi-satellite systems remains a fundamental challenge. Traditional optimisation approaches struggle to handle the real-time decision-making demands of dynamic EO missions, necessitating the use of Reinforcement Learning (RL) and Multi-Agent Reinforcement Learning (MARL). In this paper, we investigate RL-based autonomous EO mission planning by modelling single-satellite operations and extending to multi-satellite constellations using MARL frameworks. We address key challenges, including energy and data storage limitations, uncertainties in satellite observations, and the complexities of decentralised coordination under partial observability. By leveraging a near-realistic satellite simulation environment, we evaluate the training stability and performance of state-of-the-art MARL algorithms, including PPO, IPPO, MAPPO, and HAPPO. Our results demonstrate that MARL can effectively balance imaging and resource management while addressing non-stationarity and reward interdependency in multi-satellite coordination. The insights gained from this study provide a foundation for autonomous satellite operations, offering practical guidelines for improving policy learning in decentralised EO missions.

AI · Jan 23, 2013
On Quantified Linguistic Approximation

Ryszard Kowalczyk

Most fuzzy systems, including fuzzy decision support and fuzzy control systems, provide outputs in the form of fuzzy sets that represent the inferred conclusions. Linguistic interpretation of such outputs often involves the use of linguistic approximation that assigns a linguistic label to a fuzzy set based on the predefined primary terms, linguistic modifiers and linguistic connectives. More generally, linguistic approximation can be formalized in terms of the re-translation rules that correspond to the translation rules in explicitation (e.g. simple, modifier, composite, quantification and qualification rules) in computing with words [Zadeh 1996]. However, most existing methods of linguistic approximation use the simple, modifier and composite re-translation rules only. Although these methods can provide a sufficient approximation of simple fuzzy sets, the approximation of more complex ones that are typical in many practical applications of fuzzy systems may be less satisfactory. Therefore the question arises why not also use to advantage in linguistic approximation the other re-translation rules corresponding to the translation rules in explicitation. In particular, linguistic quantification may be desirable in situations where the conclusions interpreted as quantified linguistic propositions can be more informative and natural. This paper presents some aspects of linguistic approximation in the context of the re-translation rules and proposes an approach to linguistic approximation with the use of quantification rules, i.e. quantified linguistic approximation. Two methods of quantified linguistic approximation are considered, with the use of linguistic quantifiers based on the concepts of the non-fuzzy and fuzzy cardinalities of fuzzy sets. A number of examples are provided to illustrate the proposed approach.
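
As a rough illustration of quantification based on a non-fuzzy cardinality, the sketch below uses the standard sigma-count (the sum of membership degrees) and picks the linguistic quantifier whose truth degree is highest for a given fuzzy set. The quantifier shapes (`most`, `few`), their breakpoints, and the function names are hypothetical choices made for this example; the paper's own methods may differ.

```python
def sigma_count(memberships):
    """Non-fuzzy (scalar) cardinality of a fuzzy set: sum of membership degrees."""
    return sum(memberships)


def most(proportion):
    """Hypothetical piecewise-linear quantifier 'most' on [0, 1]."""
    if proportion <= 0.3:
        return 0.0
    if proportion >= 0.8:
        return 1.0
    return (proportion - 0.3) / 0.5


def few(proportion):
    """Hypothetical piecewise-linear quantifier 'few' on [0, 1]."""
    if proportion <= 0.2:
        return 1.0
    if proportion >= 0.5:
        return 0.0
    return (0.5 - proportion) / 0.3


def truth_of_quantified(memberships, quantifier):
    """Truth degree of 'Q elements are A' via the relative sigma-count."""
    if not memberships:
        raise ValueError("the universe must contain at least one element")
    return quantifier(sigma_count(memberships) / len(memberships))


def best_quantifier(memberships, quantifiers):
    """Return the quantifier label with the highest truth degree."""
    return max(quantifiers,
               key=lambda name: truth_of_quantified(memberships, quantifiers[name]))
```

For example, `best_quantifier([1.0, 0.5, 1.0, 1.0], {"most": most, "few": few})` selects `"most"`, since the relative sigma-count is 3.5 / 4 = 0.875.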

AI · Jul 4, 2012
Efficient algorithm for estimation of qualitative expected utility in possibilistic case-based reasoning

Jakub Brzostowski, Ryszard Kowalczyk

We propose an efficient algorithm for estimation of possibility-based qualitative expected utility. It is useful for decision-making mechanisms where each possible decision is assigned a multi-attribute possibility distribution. The computational complexity of ordinary methods that calculate the expected utility based on discretization grows exponentially with the number of attributes, and may become infeasible with a high number of attributes. We present a series of theorems and lemmas proving the correctness of our algorithm, which exhibits linear computational complexity. Our algorithm has been applied in the context of selecting the most prospective partners in multi-party multi-attribute negotiation, and can also be used in making decisions about potential offers during the negotiation, as well as in other similar problems.
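
To make the exponential cost of the discretization baseline concrete, the sketch below enumerates a full grid of attribute values and computes the standard optimistic and pessimistic qualitative utilities from possibility theory. It is the brute-force baseline, not the paper's linear-time algorithm. Taking the joint possibility as the minimum of per-attribute possibilities is an assumption of this sketch, and `brute_force_qeu` is a hypothetical name.

```python
from itertools import product


def brute_force_qeu(marginals, utility):
    """Optimistic and pessimistic qualitative utility over a full grid.

    marginals: one dict {value: possibility in [0, 1]} per attribute.
    utility:   function mapping a tuple of attribute values to [0, 1].
    Returns (optimistic, pessimistic, number_of_grid_points).
    """
    optimistic, pessimistic, visited = 0.0, 1.0, 0
    for outcome in product(*(m.keys() for m in marginals)):
        # Assumption: joint possibility is the minimum of the marginals.
        poss = min(m[v] for m, v in zip(marginals, outcome))
        u = utility(outcome)
        # Optimistic: max over outcomes of min(possibility, utility).
        optimistic = max(optimistic, min(poss, u))
        # Pessimistic: min over outcomes of max(1 - possibility, utility).
        pessimistic = min(pessimistic, max(1.0 - poss, u))
        visited += 1
    return optimistic, pessimistic, visited
```

With k discretization levels per attribute and n attributes, the loop visits k**n grid points, which is the growth the abstract refers to.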

AI · Feb 14, 2012
An Efficient Protocol for Negotiation over Combinatorial Domains with Incomplete Information

Minyi Li, Quoc Bao Vo, Ryszard Kowalczyk

We study the problem of agent-based negotiation in combinatorial domains. It is difficult to reach optimal agreements in bilateral or multi-lateral negotiations when the agents' preferences for the possible alternatives are not common knowledge. Self-interested agents often end up negotiating inefficient agreements in such situations. In this paper, we present a protocol for negotiation in combinatorial domains which can lead rational agents to reach optimal agreements in an incomplete-information setting. Our proposed protocol enables the negotiating agents to identify efficient solutions using distributed search that visits only a small subspace of the whole outcome space. Moreover, the proposed protocol is sufficiently general that it is applicable to most preference representation models in combinatorial domains. We also present results of experiments that demonstrate the feasibility and computational efficiency of our approach.