55.0DCJun 2
Brief Announcement: Generative Markov Model for Distributed Computing SystemsAlfreds Lapkovskis, Ali Beikmohammadi, Sindri Magnússon et al.
Emerging distributed computing paradigms, such as the computing continuum, are inherently heterogeneous, stochastic, and complex. Efficiently and effectively utilizing all available resources across the continuum demands a unified formal model of the system. To address this gap, we propose a general framework for modeling distributed computing systems as a generative Markov model, factorized over a structured system state. In our model, the state decomposes into high-dimensional variables, each further factorized over its elements, reflecting the sparse dependency structure inherent to distributed systems. This yields a tractable model enabling simulation, inference, and policy learning over otherwise intractable system states, bridging distributed computing with Markov chain theory and reinforcement learning (RL). We demonstrate our framework through a case study of collaborative AI inference, in which a dedicated server combines resources with those volunteered by service users. Our results show that centralized scheduling becomes a bottleneck at scale, while distributing computation across user devices reduces both latency and server resource consumption. These findings highlight the value of adaptive decision-making in distributed computing systems and demonstrate the framework's utility for modeling, simulation, and optimization.
CYDec 26, 2025
Socio-technical aspects of Agentic AIPraveen Kumar Donta, Alaa Saleh, Ying Li et al.
Agentic Artificial Intelligence (AI) represents a fundamental shift in the design of intelligent systems, characterized by interconnected components that collectively enable autonomous perception, reasoning, planning, action, and learning. Recent research on agentic AI has largely focused on technical foundations, including system architectures, reasoning and planning mechanisms, coordination strategies, and application-level performance across domains. However, the societal, ethical, economic, environmental, and governance implications of agentic AI remain weakly integrated into these technical treatments. This paper addresses this gap by presenting a socio-technical analysis of agentic AI that explicitly connects core technical components with societal context. We examine how architectural choices in perception, cognition, planning, execution, and memory introduce dependencies related to data governance, accountability, transparency, safety, and sustainability. To structure this analysis, we adopt the MAD-BAD-SAD construct as an analytical lens, capturing motivations, applications, and moral dilemmas (MAD); biases, accountability, and dangers (BAD); and societal impact, adoption, and design considerations (SAD). Using this lens, we analyze ethical considerations, implications, and challenges arising from contemporary agentic AI systems and assess their manifestation across emerging applications, including healthcare, education, industry, smart and sustainable cities, social services, communications and networking, and earth observation and satellite communications. The paper further identifies open challenges and suggests future research directions, framing agentic AI as an integrated socio-technical system whose behavior and impact are co-produced by algorithms, data, organizational practices, regulatory frameworks, and social norms.
LGDec 23, 2022Code
NARS vs. Reinforcement learning: ONA vs. Q-LearningAli Beikmohammadi
One of the realistic scenarios is taking a sequence of optimal actions to do a task. Reinforcement learning is the most well-known approach to deal with this kind of task in the machine learning community. Finding a suitable alternative could always be an interesting and out-of-the-box matter. Therefore, in this project, we are looking to investigate the capability of NARS and answer the question of whether NARS has the potential to be a substitute for RL or not. Particularly, we are making a comparison between $Q$-Learning and ONA on some environments developed by an Open AI gym. The source code for the experiments is publicly available in the following link: \url{https://github.com/AliBeikmohammadi/OpenNARS-for-Applications/tree/master/misc/Python}.
LGDec 23, 2022Code
Using MM principles to deal with incomplete data in K-means clusteringAli Beikmohammadi
Among many clustering algorithms, the K-means clustering algorithm is widely used because of its simple algorithm and fast convergence. However, this algorithm suffers from incomplete data, where some samples have missed some of their attributes. To solve this problem, we mainly apply MM principles to restore the symmetry of the data, so that K-means could work well. We give the pseudo-code of the algorithm and use the standard datasets for experimental verification. The source code for the experiments is publicly available in the following link: \url{https://github.com/AliBeikmohammadi/MM-Optimization/blob/main/mini-project/MM%20K-means.ipynb}.
LGFeb 28, 2023
Human-Inspired Framework to Accelerate Reinforcement LearningAli Beikmohammadi, Sindri Magnússon
Reinforcement learning (RL) is crucial for data science decision-making but suffers from sample inefficiency, particularly in real-world scenarios with costly physical interactions. This paper introduces a novel human-inspired framework to enhance RL algorithm sample efficiency. It achieves this by initially exposing the learning agent to simpler tasks that progressively increase in complexity, ultimately leading to the main task. This method requires no pre-training and involves learning simpler tasks for just one iteration. The resulting knowledge can facilitate various transfer learning approaches, such as value and policy transfer, without increasing computational complexity. It can be applied across different goals, environments, and RL algorithms, including value-based, policy-based, tabular, and deep RL methods. Experimental evaluations demonstrate the framework's effectiveness in enhancing sample efficiency, especially in challenging main tasks, demonstrated through both a simple Random Walk and more complex optimal control problems with constraints.
LGMar 17, 2023
Comparing NARS and Reinforcement Learning: An Analysis of ONA and $Q$-Learning AlgorithmsAli Beikmohammadi, Sindri Magnússon
In recent years, reinforcement learning (RL) has emerged as a popular approach for solving sequence-based tasks in machine learning. However, finding suitable alternatives to RL remains an exciting and innovative research area. One such alternative that has garnered attention is the Non-Axiomatic Reasoning System (NARS), which is a general-purpose cognitive reasoning framework. In this paper, we delve into the potential of NARS as a substitute for RL in solving sequence-based tasks. To investigate this, we conduct a comparative analysis of the performance of ONA as an implementation of NARS and $Q$-Learning in various environments that were created using the Open AI gym. The environments have different difficulty levels, ranging from simple to complex. Our results demonstrate that NARS is a promising alternative to RL, with competitive performance in diverse environments, particularly in non-deterministic ones.
27.2LGApr 9
PriPG-RL: Privileged Planner-Guided Reinforcement Learning for Partially Observable Systems with Anytime-Feasible MPCMohsen Amiri, Mohsen Amiri, Ali Beikmohammadi et al.
This paper addresses the problem of training a reinforcement learning (RL) policy under partial observability by exploiting a privileged, anytime-feasible planner agent available exclusively during training. We formalize this as a Partially Observable Markov Decision Process (POMDP) in which a planner agent with access to an approximate dynamical model and privileged state information guides a learning agent that observes only a lossy projection of the true state. To realize this framework, we introduce an anytime-feasible Model Predictive Control (MPC) algorithm that serves as the planner agent. For the learning agent, we propose Planner-to-Policy Soft Actor-Critic (P2P-SAC), a method that distills the planner agent's privileged knowledge to mitigate partial observability and thereby improve both sample efficiency and final policy performance. We support this framework with rigorous theoretical analysis. Finally, we validate our approach in simulation using NVIDIA Isaac Lab and successfully deploy it on a real-world Unitree Go2 quadruped navigating complex, obstacle-rich environments.
DCMar 26, 2024
Compressed Federated Reinforcement Learning with a Generative ModelAli Beikmohammadi, Sarit Khirirat, Sindri Magnússon
Reinforcement learning has recently gained unprecedented popularity, yet it still grapples with sample inefficiency. Addressing this challenge, federated reinforcement learning (FedRL) has emerged, wherein agents collaboratively learn a single policy by aggregating local estimations. However, this aggregation step incurs significant communication costs. In this paper, we propose CompFedRL, a communication-efficient FedRL approach incorporating both \textit{periodic aggregation} and (direct/error-feedback) compression mechanisms. Specifically, we consider compressed federated $Q$-learning with a generative model setup, where a central server learns an optimal $Q$-function by periodically aggregating compressed $Q$-estimates from local agents. For the first time, we characterize the impact of these two mechanisms (which have remained elusive) by providing a finite-time analysis of our algorithm, demonstrating strong convergence behaviors when utilizing either direct or error-feedback compression. Our bounds indicate improved solution accuracy concerning the number of agents and other federated hyperparameters while simultaneously reducing communication costs. To corroborate our theory, we also conduct in-depth numerical experiments to verify our findings, considering Top-$K$ and Sparsified-$K$ sparsification operators.
LGFeb 29, 2024
On the Convergence of Federated Learning Algorithms without Data SimilarityAli Beikmohammadi, Sarit Khirirat, Sindri Magnússon
Data similarity assumptions have traditionally been relied upon to understand the convergence behaviors of federated learning methods. Unfortunately, this approach often demands fine-tuning step sizes based on the level of data similarity. When data similarity is low, these small step sizes result in an unacceptably slow convergence speed for federated methods. In this paper, we present a novel and unified framework for analyzing the convergence of federated learning algorithms without the need for data similarity conditions. Our analysis centers on an inequality that captures the influence of step sizes on algorithmic convergence performance. By applying our theorems to well-known federated algorithms, we derive precise expressions for three widely used step size schedules: fixed, diminishing, and step-decay step sizes, which are independent of data similarity conditions. Finally, we conduct comprehensive evaluations of the performance of these federated learning algorithms, employing the proposed step size strategies to train deep neural network models on benchmark datasets under varying data similarity conditions. Our findings demonstrate significant improvements in convergence speed and overall performance, marking a substantial advancement in federated learning research.
LGFeb 29, 2024
Parallel Momentum Methods Under Biased Gradient EstimationsAli Beikmohammadi, Sarit Khirirat, Sindri Magnússon
Parallel stochastic gradient methods are gaining prominence in solving large-scale machine learning problems that involve data distributed across multiple nodes. However, obtaining unbiased stochastic gradients, which have been the focus of most theoretical research, is challenging in many distributed machine learning applications. The gradient estimations easily become biased, for example, when gradients are compressed or clipped, when data is shuffled, and in meta-learning and reinforcement learning. In this work, we establish worst-case bounds on parallel momentum methods under biased gradient estimation on both general non-convex and $μ$-PL problems. Our analysis covers general distributed optimization problems, and we work out the implications for special cases where gradient estimates are biased, i.e. in meta-learning and when the gradients are compressed or clipped. Our numerical experiments verify our theoretical findings and show faster convergence performance of momentum methods than traditional biased gradient descent.
LGMar 21, 2025
Collaborative Value Function Estimation Under Model Mismatch: A Federated Temporal Difference AnalysisAli Beikmohammadi, Sarit Khirirat, Peter Richtárik et al.
Federated reinforcement learning (FedRL) enables collaborative learning while preserving data privacy by preventing direct data exchange between agents. However, many existing FedRL algorithms assume that all agents operate in identical environments, which is often unrealistic. In real-world applications, such as multi-robot teams, crowdsourced systems, and large-scale sensor networks, each agent may experience slightly different transition dynamics, leading to inherent model mismatches. In this paper, we first establish linear convergence guarantees for single-agent temporal difference learning (TD(0)) in policy evaluation and demonstrate that under a perturbed environment, the agent suffers a systematic bias that prevents accurate estimation of the true value function. This result holds under both i.i.d. and Markovian sampling regimes. We then extend our analysis to the federated TD(0) (FedTD(0)) setting, where multiple agents, each interacting with its own perturbed environment, periodically share value estimates to collaboratively approximate the true value function of a common underlying model. Our theoretical results indicate the impact of model mismatch, network connectivity, and mixing behavior on the convergence of FedTD(0). Empirical experiments corroborate our theoretical gains, highlighting that even moderate levels of information sharing significantly mitigate environment-specific errors.
CVJun 3, 2024
Automatic Fused Multimodal Deep Learning for Plant IdentificationAlfreds Lapkovskis, Natalia Nefedova, Ali Beikmohammadi
Plant classification is vital for ecological conservation and agricultural productivity, enhancing our understanding of plant growth dynamics and aiding species preservation. The advent of deep learning (DL) techniques has revolutionized this field by enabling autonomous feature extraction, significantly reducing the dependence on manual expertise. However, conventional DL models often rely solely on single data sources, failing to capture the full biological diversity of plant species comprehensively. Recent research has turned to multimodal learning to overcome this limitation by integrating multiple data types, which enriches the representation of plant characteristics. This shift introduces the challenge of determining the optimal point for modality fusion. In this paper, we introduce a pioneering multimodal DL-based approach for plant classification with automatic modality fusion. Utilizing the multimodal fusion architecture search, our method integrates images from multiple plant organs -- flowers, leaves, fruits, and stems -- into a cohesive model. To address the lack of multimodal datasets, we contributed Multimodal-PlantCLEF, a restructured version of the PlantCLEF2015 dataset tailored for multimodal tasks. Our method achieves 82.61% accuracy on 979 classes of Multimodal-PlantCLEF, surpassing state-of-the-art methods and outperforming late fusion by 10.33%. Through the incorporation of multimodal dropout, our approach demonstrates strong robustness to missing modalities. We validate our model against established benchmarks using standard performance metrics and McNemar's test, further underscoring its superiority.
LGJan 16, 2024
A Cost-Sensitive Transformer Model for Prognostics Under Highly Imbalanced Industrial DataAli Beikmohammadi, Mohammad Hosein Hamian, Neda Khoeyniha et al.
The rapid influx of data-driven models into the industrial sector has been facilitated by the proliferation of sensor technology, enabling the collection of vast quantities of data. However, leveraging these models for failure detection and prognosis poses significant challenges, including issues like missing values and class imbalances. Moreover, the cost sensitivity associated with industrial operations further complicates the application of conventional models in this context. This paper introduces a novel cost-sensitive transformer model developed as part of a systematic workflow, which also integrates a hybrid resampler and a regression-based imputer. After subjecting our approach to rigorous testing using the APS failure dataset from Scania trucks and the SECOM dataset, we observed a substantial enhancement in performance compared to state-of-the-art methods. Moreover, we conduct an ablation study to analyze the contributions of different components in our proposed method. Our findings highlight the potential of our method in addressing the unique challenges of failure prediction in industrial settings, thereby contributing to enhanced reliability and efficiency in industrial operations.
CVSep 10, 2020
SWP-LeafNET: A novel multistage approach for plant leaf identification based on deep CNNAli Beikmohammadi, Karim Faez, Ali Motallebi
Modern scientific and technological advances allow botanists to use computer vision-based approaches for plant identification tasks. These approaches have their own challenges. Leaf classification is a computer-vision task performed for the automated identification of plant species, a serious challenge due to variations in leaf morphology, including its size, texture, shape, and venation. Researchers have recently become more inclined toward deep learning-based methods rather than conventional feature-based methods due to the popularity and successful implementation of deep learning methods in image analysis, object recognition, and speech recognition. In this paper, to have an interpretable and reliable system, a botanist's behavior is modeled in leaf identification by proposing a highly-efficient method of maximum behavioral resemblance developed through three deep learning-based models. Different layers of the three models are visualized to ensure that the botanist's behavior is modeled accurately. The first and second models are designed from scratch. Regarding the third model, the pre-trained architecture MobileNetV2 is employed along with the transfer-learning technique. The proposed method is evaluated on two well-known datasets: Flavia and MalayaKew. According to a comparative analysis, the suggested approach is more accurate than hand-crafted feature extraction methods and other deep learning techniques in terms of 99.67% and 99.81% accuracy. Unlike conventional techniques that have their own specific complexities and depend on datasets, the proposed method requires no hand-crafted feature extraction. Also, it increases accuracy as compared with other deep learning techniques. Moreover, SWP-LeafNET is distributable and considerably faster than other methods because of using shallower models with fewer parameters asynchronously.