Nagi Gebraeel

LG
h-index38
13papers
63citations
Novelty56%
AI Score53

13 Papers

LGSep 12, 2024
Wasserstein Distributionally Robust Multiclass Support Vector Machine

Michael Ibrahim, Heraldo Rozas, Nagi Gebraeel

We study the problem of multiclass classification for settings where data features $\mathbf{x}$ and their labels $\mathbf{y}$ are uncertain. We identify that distributionally robust one-vs-all (OVA) classifiers often struggle in settings with imbalanced data. To address this issue, we use Wasserstein distributionally robust optimization to develop a robust version of the multiclass support vector machine (SVM) characterized by the Crammer-Singer (CS) loss. First, we prove that the CS loss is bounded from above by a Lipschitz continuous function for all $\mathbf{x} \in \mathcal{X}$ and $\mathbf{y} \in \mathcal{Y}$, then we exploit strong duality results to express the dual of the worst-case risk problem, and we show that the worst-case risk minimization problem admits a tractable convex reformulation due to the regularity of the CS loss. Moreover, we develop a kernel version of our proposed model to account for nonlinear class separation, and we show that it admits a tractable convex upper bound. We also propose a projected subgradient method algorithm for a special case of our proposed linear model to improve scalability. Our numerical experiments demonstrate that our model outperforms state-of-the art OVA models in settings where the training data is highly imbalanced. We also show through experiments on popular real-world datasets that our proposed model often outperforms its regularized counterpart as the first accounts for uncertain labels unlike the latter.

LGFeb 13
Uncertainty in Federated Granger Causality: From Origins to Systemic Consequences

Ayush Mohanty, Nazal Mohamed, Nagi Gebraeel

Granger Causality (GC) provides a rigorous framework for learning causal structures from time-series data. Recent federated variants of GC have targeted distributed infrastructure applications (e.g., smart grids) with distributed clients that generate high-dimensional data bound by data-sovereignty constraints. However, Federated GC algorithms only yield deterministic point estimates of causality and neglect uncertainty. This paper establishes the first methodology for rigorously quantifying uncertainty and its propagation within federated GC frameworks. We systematically classify sources of uncertainty, explicitly differentiating aleatoric (data noise) from epistemic (model variability) effects. We derive closed-form recursions that model the evolution of uncertainty through client-server interactions and identify four novel cross-covariance components that couple data uncertainties with model parameter uncertainties across the federated architecture. We also define rigorous convergence conditions for these uncertainty recursions and obtain explicit steady-state variances for both server and client model parameters. Our convergence analysis demonstrates that steady-state variances depend exclusively on client data statistics, thus eliminating dependence on initial epistemic priors and enhancing robustness. Empirical evaluations on synthetic benchmarks and real-world industrial datasets demonstrate that explicitly characterizing uncertainty significantly improves the reliability and interpretability of federated causal inference.

LGJan 29
A Federated Generalized Expectation-Maximization Algorithm for Mixture Models with an Unknown Number of Components

Michael Ibrahim, Nagi Gebraeel, Weijun Xie

We study the problem of federated clustering when the total number of clusters $K$ across clients is unknown, and the clients have heterogeneous but potentially overlapping cluster sets in their local data. To that end, we develop FedGEM: a federated generalized expectation-maximization algorithm for the training of mixture models with an unknown number of components. Our proposed algorithm relies on each of the clients performing EM steps locally, and constructing an uncertainty set around the maximizer associated with each local component. The central server utilizes the uncertainty sets to learn potential cluster overlaps between clients, and infer the global number of clusters via closed-form computations. We perform a thorough theoretical study of our algorithm, presenting probabilistic convergence guarantees under common assumptions. Subsequently, we study the specific setting of isotropic GMMs, providing tractable, low-complexity computations to be performed by each client during each iteration of the algorithm, as well as rigorously verifying assumptions required for algorithm convergence. We perform various numerical experiments, where we empirically demonstrate that our proposed method achieves comparable performance to centralized EM, and that it outperforms various existing federated clustering methods.

LGFeb 23
Federated Causal Representation Learning in State-Space Systems for Decentralized Counterfactual Reasoning

Nazal Mohamed, Ayush Mohanty, Nagi Gebraeel

Networks of interdependent industrial assets (clients) are tightly coupled through physical processes and control inputs, raising a key question: how would the output of one client change if another client were operated differently? This is difficult to answer because client-specific data are high-dimensional and private, making centralization of raw data infeasible. Each client also maintains proprietary local models that cannot be modified. We propose a federated framework for causal representation learning in state-space systems that captures interdependencies among clients under these constraints. Each client maps high-dimensional observations into low-dimensional latent states that disentangle intrinsic dynamics from control-driven influences. A central server estimates the global state-transition and control structure. This enables decentralized counterfactual reasoning where clients predict how outputs would change under alternative control inputs at others while only exchanging compact latent states. We prove convergence to a centralized oracle and provide privacy guarantees. Our experiments demonstrate scalability, and accurate cross-client counterfactual inference on synthetic and real-world industrial control system datasets.

LGFeb 13
Federated Learning of Nonlinear Temporal Dynamics with Graph Attention-based Cross-Client Interpretability

Ayse Tursucular, Ayush Mohanty, Nazal Mohamed et al.

Networks of modern industrial systems are increasingly monitored by distributed sensors, where each system comprises multiple subsystems generating high dimensional time series data. These subsystems are often interdependent, making it important to understand how temporal patterns at one subsystem relate to others. This is challenging in decentralized settings where raw measurements cannot be shared and client observations are heterogeneous. In practical deployments each subsystem (client) operates a fixed proprietary model that cannot be modified or retrained, limiting existing approaches. Nonlinear dynamics further make cross client temporal interdependencies difficult to interpret because they are embedded in nonlinear state transition functions. We present a federated framework for learning temporal interdependencies across clients under these constraints. Each client maps high dimensional local observations to low dimensional latent states using a nonlinear state space model. A central server learns a graph structured neural state transition model over the communicated latent states using a Graph Attention Network. For interpretability we relate the Jacobian of the learned server side transition model to attention coefficients, providing the first interpretable characterization of cross client temporal interdependencies in decentralized nonlinear systems. We establish theoretical convergence guarantees to a centralized oracle and validate the framework through synthetic experiments demonstrating convergence, interpretability, scalability and privacy. Additional real world experiments show performance comparable to decentralized baselines.

LGFeb 25
Learning Unknown Interdependencies for Decentralized Root Cause Analysis in Nonlinear Dynamical Systems

Ayush Mohanty, Paritosh Ramanan, Nagi Gebraeel

Root cause analysis (RCA) in networked industrial systems, such as supply chains and power networks, is notoriously difficult due to unknown and dynamically evolving interdependencies among geographically distributed clients. These clients represent heterogeneous physical processes and industrial assets equipped with sensors that generate large volumes of nonlinear, high-dimensional, and heterogeneous IoT data. Classical RCA methods require partial or full knowledge of the system's dependency graph, which is rarely available in these complex networks. While federated learning (FL) offers a natural framework for decentralized settings, most existing FL methods assume homogeneous feature spaces and retrainable client models. These assumptions are not compatible with our problem setting. Different clients have different data features and often run fixed, proprietary models that cannot be modified. This paper presents a federated cross-client interdependency learning methodology for feature-partitioned, nonlinear time-series data, without requiring access to raw sensor streams or modifying proprietary client models. Each proprietary local client model is augmented with a Machine Learning (ML) model that encodes cross-client interdependencies. These ML models are coordinated via a global server that enforces representation consistency while preserving privacy through calibrated differential privacy noise. RCA is performed using model residuals and anomaly flags. We establish theoretical convergence guarantees and validate our approach on extensive simulations and a real-world industrial cybersecurity dataset.

LGJan 23, 2025
Federated Granger Causality Learning for Interdependent Clients with State Space Representation

Ayush Mohanty, Nazal Mohamed, Paritosh Ramanan et al.

Advanced sensors and IoT devices have improved the monitoring and control of complex industrial enterprises. They have also created an interdependent fabric of geographically distributed process operations (clients) across these enterprises. Granger causality is an effective approach to detect and quantify interdependencies by examining how one client's state affects others over time. Understanding these interdependencies captures how localized events, such as faults and disruptions, can propagate throughout the system, possibly causing widespread operational impacts. However, the large volume and complexity of industrial data pose challenges in modeling these interdependencies. This paper develops a federated approach to learning Granger causality. We utilize a linear state space system framework that leverages low-dimensional state estimates to analyze interdependencies. This addresses bandwidth limitations and the computational burden commonly associated with centralized data processing. We propose augmenting the client models with the Granger causality information learned by the server through a Machine Learning (ML) function. We examine the co-dependence between the augmented client and server models and reformulate the framework as a standalone ML algorithm providing conditions for its sublinear and linear convergence rates. We also study the convergence of the framework to a centralized oracle model. Moreover, we include a differential privacy analysis to ensure data security while preserving causal insights. Using synthetic data, we conduct comprehensive experiments to demonstrate the robustness of our approach to perturbations in causality, the scalability to the size of communication, number of clients, and the dimensions of raw data. We also evaluate the performance on two real-world industrial control system datasets by reporting the volume of data saved by decentralization.

RONov 30, 2024
Prognostic Framework for Robotic Manipulators Operating Under Dynamic Task Severities

Ayush Mohanty, Jason Dekarske, Stephen K. Robinson et al.

Robotic manipulators are critical in many applications but are known to degrade over time. This degradation is influenced by the nature of the tasks performed by the robot. Tasks with higher severity, such as handling heavy payloads, can accelerate the degradation process. One way this degradation is reflected is in the position accuracy of the robot's end-effector. In this paper, we present a prognostic modeling framework that predicts a robotic manipulator's Remaining Useful Life (RUL) while accounting for the effects of task severity. Our framework represents the robot's position accuracy as a Brownian motion process with a random drift parameter that is influenced by task severity. The dynamic nature of task severity is modeled using a continuous-time Markov chain (CTMC). To evaluate RUL, we discuss two approaches -- (1) a novel closed-form expression for Remaining Lifetime Distribution (RLD), and (2) Monte Carlo simulations, commonly used in prognostics literature. Theoretical results establish the equivalence between these RUL computation approaches. We validate our framework through experiments using two distinct physics-based simulators for planar and spatial robot fleets. Our findings show that robots in both fleets experience shorter RUL when handling a higher proportion of high-severity tasks.

MLNov 19, 2024
Sensor-fusion based Prognostics for Deep-space Habitats Exhibiting Multiple Unlabeled Failure Modes

Benjamin Peters, Ayush Mohanty, Xiaolei Fang et al.

Deep-space habitats are complex systems that must operate autonomously over extended durations without ground-based maintenance. These systems are vulnerable to multiple, often unknown, failure modes that affect different subsystems and sensors in mode-specific ways. Developing accurate remaining useful life (RUL) prognostics is challenging, especially when failure labels are unavailable and sensor relevance varies by failure mode. In this paper, we propose an unsupervised prognostics framework that jointly identifies latent failure modes and selects informative sensors using only unlabeled training data. The methodology consists of two phases. In the offline phase, we model system failure times using a mixture of Gaussian regressions and apply a novel Expectation-Maximization algorithm to cluster degradation trajectories and select mode-specific sensors. In the online phase, we extract low-dimensional features from the selected sensors to diagnose the active failure mode and predict RUL using a weighted regression model. We demonstrate the effectiveness of our approach on a simulated dataset that reflects deep-space telemetry characteristics and on a real-world engine degradation dataset, showing improved accuracy and interpretability over existing methods.

APFeb 22, 2021
An Online Approach to Cyberattack Detection and Localization in Smart Grid

Dan Li, Nagi Gebraeel, Kamran Paynabar et al.

Complex interconnections between information technology and digital control systems have significantly increased cybersecurity vulnerabilities in smart grids. Cyberattacks involving data integrity can be very disruptive because of their potential to compromise physical control by manipulating measurement data. This is especially true in large and complex electric networks that often rely on traditional intrusion detection systems focused on monitoring network traffic. In this paper, we develop an online detection algorithm to detect and localize covert attacks on smart grids. Using a network system model, we develop a theoretical framework by characterizing a covert attack on a generator bus in the network as sparse features in the state-estimation residuals. We leverage such sparsity via a regularized linear regression method to detect and localize covert attacks based on the regression coefficients. We conduct a comprehensive numerical study on both linear and nonlinear system models to validate our proposed method. The results show that our method outperforms conventional methods in both detection delay and localization accuracy.

DCOct 18, 2020
Decentralized and Secure Generation Maintenance with Differential Privacy

Paritosh Ramanan, Murat Yildirim, Nagi Gebraeel et al.

Decentralized methods are gaining popularity for data-driven models in power systems as they offer significant computational scalability while guaranteeing full data ownership by utility stakeholders. However, decentralized methods still require sharing information about network flow estimates over public facing communication channels, which raises privacy concerns. In this paper we propose a differential privacy driven approach geared towards decentralized formulations of mixed integer operations and maintenance optimization problems that protects network flow estimates. We prove strong privacy guarantees by leveraging the linear relationship between the phase angles and the flow. To address the challenges associated with the mixed integer and dynamic nature of the problem, we introduce an exponential moving average based consensus mechanism to enhance convergence, coupled with a control chart based convergence criteria to improve stability. Our experimental results obtained on the IEEE 118 bus case demonstrate that our privacy preserving approach yields solution qualities on par with benchmark methods without differential privacy. To demonstrate the computational robustness of our method, we conduct experiments using a wide range of noise levels and operational scenarios.

LGSep 25, 2020
Deep Learning based Covert Attack Identification for Industrial Control Systems

Dan Li, Paritosh Ramanan, Nagi Gebraeel et al.

Cybersecurity of Industrial Control Systems (ICS) is drawing significant concerns as data communication increasingly leverages wireless networks. A lot of data-driven methods were developed for detecting cyberattacks, but few are focused on distinguishing them from equipment faults. In this paper, we develop a data-driven framework that can be used to detect, diagnose, and localize a type of cyberattack called covert attacks on smart grids. The framework has a hybrid design that combines an autoencoder, a recurrent neural network (RNN) with a Long-Short-Term-Memory (LSTM) layer, and a Deep Neural Network (DNN). This data-driven framework considers the temporal behavior of a generic physical system that extracts features from the time series of the sensor measurements that can be used for detecting covert attacks, distinguishing them from equipment faults, as well as localize the attack/fault. We evaluate the performance of the proposed method through a realistic simulation study on the IEEE 14-bus model as a typical example of ICS. We compare the performance of the proposed method with the traditional model-based method to show its applicability and efficacy.

MLSep 1, 2015
Multi-Sensor Slope Change Detection

Yang Cao, Yao Xie, Nagi Gebraeel

We develop a mixture procedure for multi-sensor systems to monitor data streams for a change-point that causes a gradual degradation to a subset of the streams. Observations are assumed to be initially normal random variables with known constant means and variances. After the change-point, observations in the subset will have increasing or decreasing means. The subset and the rate-of-changes are unknown. Our procedure uses a mixture statistics, which assumes that each sensor is affected by the change-point with probability $p_0$. Analytic expressions are obtained for the average run length (ARL) and the expected detection delay (EDD) of the mixture procedure, which are demonstrated to be quite accurate numerically. We establish the asymptotic optimality of the mixture procedure. Numerical examples demonstrate the good performance of the proposed procedure. We also discuss an adaptive mixture procedure using empirical Bayes. This paper extends our earlier work on detecting an abrupt change-point that causes a mean-shift, by tackling the challenges posed by the non-stationarity of the slope-change problem.