LGJun 30, 2023Code
BuildingsBench: A Large-Scale Dataset of 900K Buildings and Benchmark for Short-Term Load ForecastingPatrick Emami, Abhijeet Sahu, Peter Graf
Short-term forecasting of residential and commercial building energy consumption is widely used in power systems and continues to grow in importance. Data-driven short-term load forecasting (STLF), although promising, has suffered from a lack of open, large-scale datasets with high building diversity. This has hindered exploring the pretrain-then-fine-tune paradigm for STLF. To help address this, we present BuildingsBench, which consists of: 1) Buildings-900K, a large-scale dataset of 900K simulated buildings representing the U.S. building stock; and 2) an evaluation platform with over 1,900 real residential and commercial buildings from 7 open datasets. BuildingsBench benchmarks two under-explored tasks: zero-shot STLF, where a pretrained model is evaluated on unseen buildings without fine-tuning, and transfer learning, where a pretrained model is fine-tuned on a target building. The main finding of our benchmark analysis is that synthetically pretrained models generalize surprisingly well to real commercial buildings. An exploration of the effect of increasing dataset size and diversity on zero-shot commercial building performance reveals a power-law with diminishing returns. We also show that fine-tuning pretrained models on real commercial and residential buildings improves performance for a majority of target buildings. We hope that BuildingsBench encourages and facilitates future research on generalizable STLF. All datasets and code can be accessed from https://github.com/NREL/BuildingsBench.
CRMar 19
Network and Device Level Cyber Deception for Contested Environments Using RL and LLMsAbhijeet Sahu, Shuva Paul, Richard Macwan
Cyber deception assists in increasing the attacker's budget in reconnaissance or any early phases of threat intrusions. In the past, numerous methods of cyber deception have been adopted, such as IP address randomization, the creation of honeypots and honeynets mimicking an actual set of services, and networks deployed within an enterprise or operational technology(OT) network. These types of strategies follow naive approaches of recreating services that are expensive and that need a lot of human intervention. The advent of cloud services and other automations of containerized applications, such as Kubernetes, makes cyber defense easier. Yet, there remains a lot of potential to improve the accuracy of these deception strategies and to make them cost-effective using artificial intelligence (AI)-based solutions by making the deception more dynamic. Hence, in this work, we review various AI-based solutions in building network- and device-level cyber deception methods in contested environments. Specifically, we focus on leveraging the fusion of large language models (LLMs) and reinforcement learning(RL) in optimally learning these cyber deception strategies and validating the efficacy of such strategies in some stealthy attacks against OT systems in the literature.
CRSep 6, 2024
Detection of False Data Injection Attacks (FDIA) on Power Dynamical Systems With a State Prediction MethodAbhijeet Sahu, Truc Nguyen, Kejun Chen et al.
With the deeper penetration of inverter-based resources in power systems, false data injection attacks (FDIA) are a growing cyber-security concern. They have the potential to disrupt the system's stability like frequency stability, thereby leading to catastrophic failures. Therefore, an FDIA detection method would be valuable to protect power systems. FDIAs typically induce a discrepancy between the desired and the effective behavior of the power system dynamics. A suitable detection method can leverage power dynamics predictions to identify whether such a discrepancy was induced by an FDIA. This work investigates the efficacy of temporal and spatio-temporal state prediction models, such as Long Short-Term Memory (LSTM) and a combination of Graph Neural Networks (GNN) with LSTM, for predicting frequency dynamics in the absence of an FDIA but with noisy measurements, and thereby identify FDIA events. For demonstration purposes, the IEEE 39 New England Kron-reduced model simulated with a swing equation is considered. It is shown that the proposed state prediction models can be used as a building block for developing an effective FDIA detection method that can maintain high detection accuracy across various attack and deployment settings. It is also shown how the FDIA detection should be deployed to limit its exposure to detection inaccuracies and mitigate its computational burden.
LGAug 30, 2024
Discovery of False Data Injection Schemes on Frequency Controllers with Reinforcement LearningRomesh Prasad, Malik Hassanaly, Xiangyu Zhang et al.
While inverter-based distributed energy resources (DERs) play a crucial role in integrating renewable energy into the power system, they concurrently diminish the grid's system inertia, elevating the risk of frequency instabilities. Furthermore, smart inverters, interfaced via communication networks, pose a potential vulnerability to cyber threats if not diligently managed. To proactively fortify the power grid against sophisticated cyber attacks, we propose to employ reinforcement learning (RL) to identify potential threats and system vulnerabilities. This study concentrates on analyzing adversarial strategies for false data injection, specifically targeting smart inverters involved in primary frequency control. Our findings demonstrate that an RL agent can adeptly discern optimal false data injection methods to manipulate inverter settings, potentially causing catastrophic consequences.
DCMar 12
Beyond BFS: A Comparative Study of Rooted Spanning Tree Algorithms on GPUsAbhijeet Sahu, Srikar Vilas Donur
Rooted spanning trees (RSTs) are a core primitive in parallel graph analytics, underpinning algorithms such as biconnected components and planarity testing. On GPUs, RST construction has traditionally relied on breadth-first search (BFS) due to its simplicity and work efficiency. However, BFS incurs an O(D) step complexity, which severely limits parallelism on high-diameter and power-law graphs. We present a comparative study of alternative RST construction strategies on modern GPUs. We introduce a GPU adaptation of the Path Reversal RST (PR-RST) algorithm, optimizing its pointer-jumping and broadcast operations for modern GPU architecture. In addition, we evaluate an integrated approach that combines a state-of-the-art connectivity framework (GConn) with Eulerian tour-based rooting. Across more than 10 real-world graphs, our results show that the GConn-based approach achieves up to 300x speedup over optimized BFS on high-diameter graphs. These findings indicate that the O(log n) step complexity of connectivity-based methods can outweigh their structural overhead on modern hardware, motivating a rethinking of RST construction in GPU graph analytics.
LGMay 25, 2023
Bayesian Reinforcement Learning for Automatic Voltage Control under Cyber-Induced UncertaintyAbhijeet Sahu, Katherine Davis
Voltage control is crucial to large-scale power system reliable operation, as timely reactive power support can help prevent widespread outages. However, there is currently no built in mechanism for power systems to ensure that the voltage control objective to maintain reliable operation will survive or sustain the uncertainty caused under adversary presence. Hence, this work introduces a Bayesian Reinforcement Learning (BRL) approach for power system control problems, with focus on sustained voltage control under uncertainty in a cyber-adversarial environment. This work proposes a data-driven BRL-based approach for automatic voltage control by formulating and solving a Partially-Observable Markov Decision Problem (POMDP), where the states are partially observable due to cyber intrusions. The techniques are evaluated on the WSCC and IEEE 14 bus systems. Additionally, BRL techniques assist in automatically finding a threshold for exploration and exploitation in various RL techniques.
AINov 20, 2021
Inter-Domain Fusion for Enhanced Intrusion Detection in Power Systems: An Evidence Theoretic and Meta-Heuristic ApproachAbhijeet Sahu, Katherine Davis
False alerts due to misconfigured/ compromised IDS in ICS networks can lead to severe economic and operational damage. To solve this problem, research has focused on leveraging deep learning techniques that help reduce false alerts. However, a shortcoming is that these works often require or implicitly assume the physical and cyber sensors to be trustworthy. Implicit trust of data is a major problem with using artificial intelligence or machine learning for CPS security, because during critical attack detection time they are more at risk, with greater likelihood and impact, of also being compromised. To address this shortcoming, the problem is reframed on how to make good decisions given uncertainty. Then, the decision is detection, and the uncertainty includes whether the data used for ML-based IDS is compromised. Thus, this work presents an approach for reducing false alerts in CPS power systems by dealing uncertainty without the knowledge of prior distribution of alerts. Specifically, an evidence theoretic based approach leveraging Dempster Shafer combination rules are proposed for reducing false alerts. A multi-hypothesis mass function model is designed that leverages probability scores obtained from various supervised-learning classifiers. Using this model, a location-cum-domain based fusion framework is proposed and evaluated with different combination rules, that fuse multiple evidence from inter-domain and intra-domain sensors. The approach is demonstrated in a cyber-physical power system testbed with Man-In-The-Middle attack emulation in a large-scale synthetic electric grid. For evaluating the performance, plausibility, belief, pignistic, etc. metrics as decision functions are considered. To improve the performance, a multi-objective based genetic algorithm is proposed for feature selection considering the decision metrics as the fitness function.
SPApr 5, 2021
Graph Neural Networks Based Detection of Stealth False Data Injection Attacks in Smart GridsOsman Boyaci, Amarachi Umunnakwe, Abhijeet Sahu et al.
False data injection attacks (FDIAs) represent a major class of attacks that aim to break the integrity of measurements by injecting false data into the smart metering devices in power grids. To the best of authors' knowledge, no study has attempted to design a detector that automatically models the underlying graph topology and spatially correlated measurement data of the smart grids to better detect cyber attacks. The contributions of this paper to detect and mitigate FDIAs are twofold. First, we present a generic, localized, and stealth (unobservable) attack generation methodology and publicly accessible datasets for researchers to develop and test their algorithms. Second, we propose a Graph Neural Network (GNN) based, scalable and real-time detector of FDIAs that efficiently combines model-driven and data-driven approaches by incorporating the inherent physical connections of modern AC power grids and exploiting the spatial correlations of the measurement. It is experimentally verified by comparing the proposed GNN based detector with the currently available FDIA detectors in the literature that our algorithm outperforms the best available solutions by 3.14%, 4.25%, and 4.41% in F1 score for standard IEEE testbeds with 14, 118, and 300 buses, respectively.
CRFeb 23, 2021
Man-in-The-Middle Attacks and Defense in a Power System Cyber-Physical TestbedPatrick Wlazlo, Abhijeet Sahu, Zeyu Mao et al.
Man-in-The-Middle (MiTM) attacks present numerous threats to a smart grid. In a MiTM attack, an intruder embeds itself within a conversation between two devices to either eavesdrop or impersonate one of the devices, making it appear to be a normal exchange of information. Thus, the intruder can perform false data injection (FDI) and false command injection (FCI) attacks that can compromise power system operations, such as state estimation, economic dispatch, and automatic generation control (AGC). Very few researchers have focused on MiTM methods that are difficult to detect within a smart grid. To address this, we are designing and implementing multi-stage MiTM intrusions in an emulation-based cyber-physical power system testbed against a large-scale synthetic grid model to demonstrate how such attacks can cause physical contingencies such as misguided operation and false measurements. MiTM intrusions create FCI, FDI, and replay attacks in this synthetic power grid. This work enables stakeholders to defend against these stealthy attacks, and we present detection mechanisms that are developed using multiple alerts from intrusion detection systems and network monitoring tools. Our contribution will enable other smart grid security researchers and industry to develop further detection mechanisms for inconspicuous MiTM attacks.
LGJan 18, 2021
Multi-Source Data Fusion for Cyberattack Detection in Power SystemsAbhijeet Sahu, Zeyu Mao, Patrick Wlazlo et al.
Cyberattacks can cause a severe impact on power systems unless detected early. However, accurate and timely detection in critical infrastructure systems presents challenges, e.g., due to zero-day vulnerability exploitations and the cyber-physical nature of the system coupled with the need for high reliability and resilience of the physical system. Conventional rule-based and anomaly-based intrusion detection system (IDS) tools are insufficient for detecting zero-day cyber intrusions in the industrial control system (ICS) networks. Hence, in this work, we show that fusing information from multiple data sources can help identify cyber-induced incidents and reduce false positives. Specifically, we present how to recognize and address the barriers that can prevent the accurate use of multiple data sources for fusion-based detection. We perform multi-source data fusion for training IDS in a cyber-physical power system testbed where we collect cyber and physical side data from multiple sensors emulating real-world data sources that would be found in a utility and synthesizes these into features for algorithms to detect intrusions. Results are presented using the proposed data fusion application to infer False Data and Command injection-based Man-in- The-Middle (MiTM) attacks. Post collection, the data fusion application uses time-synchronized merge and extracts features followed by pre-processing such as imputation and encoding before training supervised, semi-supervised, and unsupervised learning models to evaluate the performance of the IDS. A major finding is the improvement of detection accuracy by fusion of features from cyber, security, and physical domains. Additionally, we observed the co-training technique performs at par with supervised learning methods when fed with our features.