Aaron Tuor

SY
h-index: 27
Papers: 23
Citations: 897
Novelty: 52%
AI Score: 49

23 Papers

SY May 22, 2022
Neural Lyapunov Differentiable Predictive Control

Sayak Mukherjee, Ján Drgoňa, Aaron Tuor et al.

We present a learning-based predictive control methodology using the differentiable programming framework with probabilistic Lyapunov-based stability guarantees. The neural Lyapunov differentiable predictive control (NLDPC) learns the policy by constructing a computational graph encompassing the system dynamics, state and input constraints, and the necessary Lyapunov certification constraints, and thereafter using automatic differentiation to update the neural policy parameters. In conjunction, our approach jointly learns a Lyapunov function that certifies the regions of state-space with stable dynamics. We also provide a sampling-based statistical guarantee for the training of NLDPC from the distribution of initial conditions. Our offline training approach provides a computationally efficient and scalable alternative to classical explicit model predictive control solutions. We substantiate the advantages of the proposed approach with simulations that stabilize the double integrator model and with an example of controlling an aircraft model.

DS Jul 11, 2022
Structural Inference of Networked Dynamical Systems with Universal Differential Equations

James Koch, Zhao Chen, Aaron Tuor et al.

Networked dynamical systems are common throughout science and engineering; e.g., biological networks, reaction networks, power systems, and the like. For many such systems, nonlinearity drives populations of identical (or near-identical) units to exhibit a wide range of nontrivial behaviors, such as the emergence of coherent structures (e.g., waves and patterns) or otherwise notable dynamics (e.g., synchrony and chaos). In this work, we seek to infer (i) the intrinsic physics of a base unit of a population, (ii) the underlying graphical structure shared between units, and (iii) the coupling physics of a given networked dynamical system given observations of nodal states. These tasks are formulated around the notion of the Universal Differential Equation, whereby unknown dynamical systems can be approximated with neural networks, mathematical terms known a priori (albeit with unknown parameterizations), or combinations of the two. We demonstrate the value of these inference tasks by investigating not only future state predictions but also the inference of system behavior on varied network topologies. The effectiveness and utility of these methods are shown through their application to canonical networked nonlinear coupled oscillators.

LG Mar 2, 2022
Learning Stochastic Parametric Differentiable Predictive Control Policies

Ján Drgoňa, Sayak Mukherjee, Aaron Tuor et al.

The problem of synthesizing stochastic explicit model predictive control policies is known to quickly become intractable even for systems of modest complexity when using classical control-theoretic methods. To address this challenge, we present a scalable alternative called stochastic parametric differentiable predictive control (SP-DPC) for unsupervised learning of neural control policies governing stochastic linear systems subject to nonlinear chance constraints. SP-DPC is formulated as a deterministic approximation to the stochastic parametric constrained optimal control problem. This formulation allows us to directly compute the policy gradients via automatic differentiation of the problem's value function, evaluated over sampled parameters and uncertainties. In particular, the computed expectation of the SP-DPC problem's value function is backpropagated through the closed-loop system rollouts parametrized by a known nominal system dynamics model and neural control policy, which allows for direct model-based policy optimization. We provide theoretical probabilistic guarantees on closed-loop stability and chance-constraint satisfaction for policies learned via the SP-DPC method. Furthermore, we demonstrate the computational efficiency and scalability of the proposed policy optimization algorithm in three numerical examples, including systems with a large number of states or subject to nonlinear constraints.

SY Aug 3, 2022
Differentiable Predictive Control with Safety Guarantees: A Control Barrier Function Approach

Wenceslao Shaw Cortez, Jan Drgona, Aaron Tuor et al.

We develop a novel form of differentiable predictive control (DPC) with safety and robustness guarantees based on control barrier functions. DPC is an unsupervised learning-based method for obtaining approximate solutions to explicit model predictive control (MPC) problems. In DPC, the predictive control policy parametrized by a neural network is optimized offline via direct policy gradients obtained by automatic differentiation of the MPC problem. The proposed approach exploits a new form of sampled-data barrier function to enforce offline and online safety requirements in DPC settings while only interrupting the neural network-based controller near the boundary of the safe set. The effectiveness of the proposed approach is demonstrated in simulation.

DS Aug 15, 2022
Domain-aware Control-oriented Neural Models for Autonomous Underwater Vehicles

Wenceslao Shaw Cortez, Soumya Vasisht, Aaron Tuor et al.

Conventional physics-based modeling is a time-consuming bottleneck in control design for complex nonlinear systems like autonomous underwater vehicles (AUVs). In contrast, purely data-driven models, though convenient and quick to obtain, require a large number of observations and lack operational guarantees for safety-critical systems. Data-driven models leveraging available partially characterized dynamics have the potential to provide reliable system models in a typical data-limited scenario for high-value complex systems, thereby avoiding months of expensive expert modeling time. In this work we explore this middle ground between expert-modeled and purely data-driven modeling. We present control-oriented parametric models with varying levels of domain-awareness that exploit known system structure and prior physics knowledge to create constrained deep neural dynamical system models. We employ universal differential equations to construct data-driven blackbox and graybox representations of the AUV dynamics. In addition, we explore a hybrid formulation that explicitly models the residual error related to imperfect graybox models. We compare the prediction performance of the learned models for different distributions of initial conditions and control inputs to assess their accuracy, generalization, and suitability for control.

SY Mar 20, 2022
Neuro-physical dynamic load modeling using differentiable parametric optimization

Shrirang Abhyankar, Jan Drgona, Andrew August et al.

In this work, we investigate a data-driven approach for obtaining a reduced equivalent load model of distribution systems for electromechanical transient stability analysis. The proposed reduced equivalent is a neuro-physical model comprising a traditional ZIP load model augmented with a neural network. This neuro-physical model is trained through differentiable programming. We discuss the formulation, modeling details, and training of the proposed model set up as a differentiable parametric program. The performance and accuracy of this neuro-physical ZIP load model are presented on a medium-scale 350-bus transmission-distribution network.

LG Nov 11, 2025
Homotopy-Guided Self-Supervised Learning of Parametric Solutions for AC Optimal Power Flow

Shimiao Li, Aaron Tuor, Draguna Vrabie et al.

Learning to optimize (L2O) parametric approximations of AC optimal power flow (AC-OPF) solutions offers the potential for fast, reusable decision-making in real-time power system operations. However, the inherent nonconvexity of AC-OPF results in challenging optimization landscapes, and standard learning approaches often fail to converge to feasible, high-quality solutions. This work introduces a homotopy-guided self-supervised L2O method for parametric AC-OPF problems. The key idea is to construct a continuous deformation of the objective and constraints during training, beginning from a relaxed problem with a broad basin of attraction and gradually transforming it toward the original problem. The resulting learning process improves convergence stability and promotes feasibility without requiring labeled optimal solutions or external solvers. We evaluate the proposed method on standard IEEE AC-OPF benchmarks and show that homotopy-guided L2O significantly increases feasibility rates compared to non-homotopy baselines, while achieving objective values comparable to full OPF solvers. These findings demonstrate the promise of homotopy-based heuristics for scalable, constraint-aware L2O in power system optimization.

SY Mar 30
Data Center Chiller Plant Optimization via Mixed-Integer Nonlinear Differentiable Predictive Control

Ján Boldocký, Cary Faulkner, Elad Michael et al.

We present a computationally tractable framework for real-time predictive control of multi-chiller plants that involve both discrete and continuous control decisions coupled through nonlinear dynamics, resulting in a mixed-integer optimal control problem. To address this challenge, we extend Differentiable Predictive Control (DPC) -- a self-supervised, model-based learning methodology for approximately solving parametric optimal control problems -- to accommodate mixed-integer control policies. We benchmark the proposed framework against a state-of-the-art Model Predictive Control (MPC) solver and a fast heuristic Rule-Based Controller (RBC). Simulation results demonstrate that our approach achieves significant energy savings over the RBC while maintaining orders-of-magnitude faster computation times than MPC, offering a scalable and practical alternative to conventional combinatorial mixed-integer control formulations.

AI Apr 14
Dead Cognitions: A Census of Misattributed Insights

Aaron Tuor, claude.ai

This essay identifies a failure mode of AI chat systems that we term attribution laundering: the model performs substantive cognitive work and then rhetorically credits the user for having generated the resulting insights. Unlike transparent versions of glad-handing sycophancy, attribution laundering is systematically occluded from the person it affects and self-reinforcing -- eroding users' ability to accurately assess their own cognitive contributions over time. We trace the mechanisms at both individual and societal scales, from the chat interface that discourages scrutiny to the institutional pressures that reward adoption over accountability. The document itself is an artifact of the process it describes, and is color-coded accordingly -- though the views expressed are the authors' own, not those of any affiliated institution, and the boundary between the human author's views and Claude's is, as the essay argues, difficult to draw.

LG Feb 28, 2022
Neural Ordinary Differential Equations for Nonlinear System Identification

Aowabin Rahman, Ján Drgoňa, Aaron Tuor et al.

Neural ordinary differential equations (NODEs) have recently been proposed as a promising approach for nonlinear system identification tasks. In this work, we systematically compare their predictive performance with current state-of-the-art nonlinear and classical linear methods. In particular, we present a quantitative study comparing NODEs' performance against neural state-space models and classical linear system identification methods. We evaluate the inference speed and prediction performance of each method on open-loop errors across eight different dynamical systems. The experiments show that NODEs can consistently improve the prediction accuracy by an order of magnitude compared to benchmark methods. Besides improved accuracy, we also observed that NODEs are less sensitive to hyperparameters compared to neural state-space models. On the other hand, these performance gains come with a slight increase in computation at inference time.

SY Jul 25, 2021
Deep Learning Explicit Differentiable Predictive Control Laws for Buildings

Jan Drgona, Aaron Tuor, Soumya Vasisht et al.

We present a differentiable predictive control (DPC) methodology for learning constrained control laws for unknown nonlinear systems. DPC poses an approximate solution to multiparametric programming problems emerging from explicit nonlinear model predictive control (MPC). Contrary to approximate MPC, DPC does not require supervision by an expert controller. Instead, a system dynamics model is learned from the observed system's dynamics, and the neural control law is optimized offline by leveraging the differentiable closed-loop system model. The combination of a differentiable closed-loop system and penalty methods for constraint handling of system outputs and inputs allows us to optimize the control law's parameters directly by backpropagating the economic MPC loss through the learned system model. The control performance of the proposed DPC method is demonstrated in simulation using a learned model of multi-zone building thermal dynamics.

CV Apr 8, 2021
Prototypical Region Proposal Networks for Few-Shot Localization and Classification

Elliott Skomski, Aaron Tuor, Andrew Avila et al.

Recently proposed few-shot image classification methods have generally focused on use cases where the objects to be classified are the central subject of images. Despite success on benchmark vision datasets aligned with this use case, these methods typically fail on use cases involving densely-annotated, busy images: images common in the wild where objects of relevance are not the central subject, instead appearing potentially occluded, small, or among other incidental objects belonging to other classes of potential interest. To localize relevant objects, we employ a prototype-based few-shot segmentation model which compares the encoded features of unlabeled query images with support class centroids to produce region proposals indicating the presence and location of support set classes in a query image. These region proposals are then used as additional conditioning input to few-shot image classifiers. We develop a framework to unify the two stages (segmentation and classification) into an end-to-end classification model -- PRoPnet -- and empirically demonstrate that our methods improve accuracy on image datasets with natural scenes containing multiple object classes.

DS Jan 6, 2021
Constrained Block Nonlinear Neural Dynamical Models

Elliott Skomski, Soumya Vasisht, Colby Wight et al.

Neural network modules conditioned by known priors can be effectively trained and combined to represent systems with nonlinear dynamics. This work explores a novel formulation for data-efficient learning of deep control-oriented nonlinear dynamical models by embedding local model structure and constraints. The proposed method consists of neural network blocks that represent input, state, and output dynamics with constraints placed on the network weights and system variables. For handling partially observable dynamical systems, we utilize a state observer neural network to estimate the states of the system's latent dynamics. We evaluate the performance of the proposed architecture and training methods on system identification tasks for three nonlinear systems: a continuous stirred tank reactor, a two tank interacting system, and an aerodynamics body. Models optimized with a few thousand system state observations accurately represent system dynamics in open loop simulation over thousands of time steps from a single set of initial conditions. Experimental results demonstrate an order of magnitude reduction in open-loop simulation mean squared error for our constrained, block-structured neural models when compared to traditional unstructured and unconstrained neural network models.

NE Nov 26, 2020
Physics-Informed Neural State Space Models via Learning and Evolution

Elliott Skomski, Jan Drgona, Aaron Tuor

Recent works exploring deep learning application to dynamical systems modeling have demonstrated that embedding physical priors into neural networks can yield more effective, physically-realistic, and data-efficient models. However, in the absence of complete prior knowledge of a dynamical system's physical characteristics, determining the optimal structure and optimization strategy for these models can be difficult. In this work, we explore methods for discovering neural state space dynamics models for system identification. Starting with a design space of block-oriented state space models and structured linear maps with strong physical priors, we encode these components into a model genome alongside network structure, penalty constraints, and optimization hyperparameters. Demonstrating the overall utility of the design space, we employ an asynchronous genetic search algorithm that alternates between model selection and optimization and obtains accurate physically consistent models of three physical systems: an aerodynamics body, a continuous stirred tank reactor, and a two tank interacting system.

LG Nov 26, 2020
Dissipative Deep Neural Dynamical Systems

Jan Drgona, Soumya Vasisht, Aaron Tuor et al.

In this paper, we provide sufficient conditions for dissipativity and local asymptotic stability of discrete-time dynamical systems parametrized by deep neural networks. We leverage the representation of neural networks as pointwise affine maps, thus exposing their local linear operators and making them accessible to classical system analytic and design methods. This allows us to "crack open the black box" of the neural dynamical system's behavior by evaluating their dissipativity, and estimating their stationary points and state-space partitioning. We relate the norms of these local linear operators to the energy stored in the dissipative system with supply rates represented by their aggregate bias terms. Empirically, we analyze the variance in dynamical behavior and eigenvalue spectra of these local linear operators with varying weight factorizations, activation functions, bias terms, and depths.

LG Sep 23, 2020
Fuzzy Simplicial Networks: A Topology-Inspired Model to Improve Task Generalization in Few-shot Learning

Henry Kvinge, Zachary New, Nico Courts et al.

Deep learning has shown great success in settings with massive amounts of data but has struggled when data is limited. Few-shot learning algorithms, which seek to address this limitation, are designed to generalize well to new tasks with limited data. Typically, models are evaluated on unseen classes and datasets that are defined by the same fundamental task as they are trained for (e.g. category membership). One can also ask how well a model can generalize to fundamentally different tasks within a fixed dataset (for example: moving from category membership to tasks that involve detecting object orientation or quantity). To formalize this kind of shift we define a notion of "independence of tasks" and identify three new sets of labels for established computer vision datasets that test a model's ability to generalize to tasks which draw on orthogonal attributes in the data. We use these datasets to investigate the failure modes of metric-based few-shot models. Based on our findings, we introduce a new few-shot model called Fuzzy Simplicial Networks (FSN) which leverages a construction from topology to more flexibly represent each class from limited data. In particular, FSN models can not only form multiple representations for a given class but can also begin to capture the low-dimensional structure which characterizes class manifolds in the encoded space of deep networks. We show that FSN outperforms state-of-the-art models on the challenging tasks we introduce in this paper while remaining competitive on standard few-shot benchmarks.

CV Apr 24, 2020
Systematic Evaluation of Backdoor Data Poisoning Attacks on Image Classifiers

Loc Truong, Chace Jones, Brian Hutchinson et al.

Backdoor data poisoning attacks have recently been demonstrated in computer vision research as a potential safety risk for machine learning (ML) systems. Traditional data poisoning attacks manipulate training data to induce unreliability of an ML model, whereas backdoor data poisoning attacks maintain system performance unless the ML model is presented with an input containing an embedded "trigger" that provides a predetermined response advantageous to the adversary. Our work builds upon prior backdoor data-poisoning research for ML image classifiers and systematically assesses different experimental conditions including types of trigger patterns, persistence of trigger patterns during retraining, poisoning strategies, architectures (ResNet-50, NasNet, NasNet-Mobile), datasets (Flowers, CIFAR-10), and potential defensive regularization techniques (Contrastive Loss, Logit Squeezing, Manifold Mixup, Soft-Nearest-Neighbors Loss). Experiments yield four key findings. First, the success rate of backdoor poisoning attacks varies widely, depending on several factors, including model architecture, trigger pattern and regularization technique. Second, we find that poisoned models are hard to detect through performance inspection alone. Third, regularization typically reduces backdoor success rate, although it can have no effect or even slightly increase it, depending on the form of regularization. Finally, backdoors inserted through data poisoning can be rendered ineffective after just a few epochs of additional training on a small set of clean data without affecting the model's performance.

SY Apr 23, 2020
Learning Constrained Adaptive Differentiable Predictive Control Policies With Guarantees

Jan Drgona, Aaron Tuor, Draguna Vrabie

We present differentiable predictive control (DPC), a method for learning constrained neural control policies for linear systems with probabilistic performance guarantees. We employ automatic differentiation to obtain direct policy gradients by backpropagating the model predictive control (MPC) loss function and constraints penalties through a differentiable closed-loop system dynamics model. We demonstrate that the proposed method can learn parametric constrained control policies to stabilize systems with unstable dynamics, track time-varying references, and satisfy nonlinear state and input constraints. In contrast with imitation learning-based approaches, our method does not depend on a supervisory controller. Most importantly, we demonstrate that, without losing performance, our method is scalable and computationally more efficient than implicit, explicit, and approximate MPC. Under review at IEEE Transactions on Automatic Control.
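
As a rough illustration of the mechanism this abstract describes (backpropagating an MPC-style loss and constraint penalties through a closed-loop rollout), here is a minimal sketch in plain Python. It is not the authors' implementation: the scalar system, the single-gain policy u = -k*x, the penalty weight, and the small forward-mode `Dual` class standing in for a real automatic differentiation library are all assumptions chosen to keep the example self-contained.

```python
from dataclasses import dataclass


@dataclass
class Dual:
    """Forward-mode AD number: a value and its derivative w.r.t. the policy gain."""
    val: float
    der: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __sub__(self, other):
        other = _lift(other)
        return Dual(self.val - other.val, self.der - other.der)

    def __neg__(self):
        return Dual(-self.val, -self.der)

    def __mul__(self, other):
        other = _lift(other)
        return Dual(self.val * other.val,
                    self.val * other.der + self.der * other.val)

    __rmul__ = __mul__


def _lift(x):
    """Wrap plain numbers as constants (zero derivative)."""
    return x if isinstance(x, Dual) else Dual(float(x))


def relu(x):
    """max(0, x) with its derivative; used for the constraint penalty."""
    return x if x.val > 0.0 else Dual(0.0, 0.0)


# Hypothetical problem data: unstable scalar system x[t+1] = A*x[t] + B*u[t].
A, B = 1.2, 1.0
U_MAX, HORIZON = 1.5, 10
X0_SAMPLES = [-1.0, -0.5, 0.5, 1.0]  # sampled initial conditions


def rollout_loss(k):
    """Mean loss of the policy u = -k*x over closed-loop rollouts."""
    total = Dual(0.0)
    for x0 in X0_SAMPLES:
        x = Dual(x0)
        for _ in range(HORIZON):
            u = -(k * x)                      # policy
            x = A * x + B * u                 # differentiable dynamics model
            violation = relu(u - U_MAX) + relu(-u - U_MAX)
            total = total + x * x + 0.1 * (u * u) + 10.0 * (violation * violation)
    return total * (1.0 / len(X0_SAMPLES))


def train(k0=0.0, lr=1e-3, steps=200):
    """Plain gradient descent on the gain, using the AD derivative of the loss."""
    k = k0
    for _ in range(steps):
        loss = rollout_loss(Dual(k, 1.0))     # seed dk/dk = 1
        k -= lr * loss.der
    return k
```

Calling `train()` returns a gain for which the closed-loop factor `A - B*k` should have magnitude below one. A neural policy with many parameters would use reverse-mode differentiation instead, but the structure of the loss (stage cost plus input-constraint penalty, summed over a rollout from sampled initial states) is the same.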

SY Apr 22, 2020
Constrained Neural Ordinary Differential Equations with Stability Guarantees

Aaron Tuor, Jan Drgona, Draguna Vrabie

Differential equations are frequently used in engineering domains, such as modeling and control of industrial systems, where safety and performance guarantees are of paramount importance. Traditional physics-based modeling approaches require domain expertise and are often difficult to tune or adapt to new systems. In this paper, we show how to model discrete ordinary differential equations (ODEs) with algebraic nonlinearities as deep neural networks with varying degrees of prior knowledge. We derive the stability guarantees of the network layers based on the implicit constraints imposed on the weights' eigenvalues. Moreover, we show how to use barrier methods to generically handle additional inequality constraints. We demonstrate the prediction accuracy of learned neural ODEs evaluated on open-loop simulations compared to ground truth dynamics with bi-linear terms.
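
One way an implicit eigenvalue constraint of this kind could be realized is sketched below. The abstract does not spell out the parametrization, so the factorization here is an assumption: each row is a softmax (nonnegative, summing to one) scaled elementwise by a factor squashed into [lam_min, lam_max]. For a nonnegative matrix, the dominant eigenvalue lies between the smallest and largest row sums (Perron-Frobenius), so every eigenvalue magnitude is at most lam_max. The function names are hypothetical.

```python
import math
import random


def constrained_weight(scores, gates, lam_min, lam_max):
    """Build a positive matrix whose row sums lie in [lam_min, lam_max]."""
    weight = []
    for score_row, gate_row in zip(scores, gates):
        peak = max(score_row)
        exps = [math.exp(s - peak) for s in score_row]   # stable softmax
        z = sum(exps)
        row = []
        for e, g in zip(exps, gate_row):
            sigmoid = 1.0 / (1.0 + math.exp(-g))
            scale = lam_min + (lam_max - lam_min) * sigmoid
            row.append((e / z) * scale)
        weight.append(row)
    return weight


def dominant_eigenvalue(weight, iters=200):
    """Power iteration; applicable here because all entries are positive."""
    x = [1.0] * len(weight)
    lam = 0.0
    for _ in range(iters):
        y = [sum(w * xi for w, xi in zip(row, x)) for row in weight]
        lam = sum(y) / sum(x)
        total = sum(y)
        x = [yi / total for yi in y]
    return lam


def random_matrix(n, rng):
    """Unconstrained free parameters, as a trainable layer would hold."""
    return [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(n)]
```

Because the bound holds for any values of the free parameters, gradient-based training can update `scores` and `gates` without a projection step and the resulting linear map stays within the prescribed spectral range.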

IR Feb 17, 2019
Multiple Document Representations from News Alerts for Automated Bio-surveillance Event Detection

Aaron Tuor, Fnu Anubhav, Lauren Charles

Due to globalization, geographic boundaries no longer serve as effective shields against the spread of infectious diseases. In order to aid bio-surveillance analysts in disease tracking, recent research has been devoted to developing information retrieval and analysis methods utilizing the vast corpora of publicly available documents on the internet. In this work, we present methods for the automated retrieval and classification of documents related to active public health events. We demonstrate classification performance on an auto-generated corpus, using recurrent neural network, TF-IDF, and Naive Bayes log count ratio document representations. By jointly modeling the title and description of a document, we achieve 97% recall and 93.3% accuracy with our best performing bio-surveillance event classification model: logistic regression on the combined output from a pair of bidirectional recurrent neural networks.
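
For readers unfamiliar with the Naive Bayes log count ratio mentioned above, the following sketch computes it for a toy corpus. The whitespace tokenization, the smoothing constant `alpha`, and the function name are assumptions for illustration; the abstract does not state the exact variant used.

```python
import math
from collections import Counter


def log_count_ratio(docs, labels, alpha=1.0):
    """Per-token log ratio of smoothed, normalized counts in positive vs. negative docs."""
    vocab = sorted({tok for doc in docs for tok in doc.split()})
    pos, neg = Counter(), Counter()
    for doc, label in zip(docs, labels):
        (pos if label == 1 else neg).update(doc.split())
    p = {t: alpha + pos[t] for t in vocab}   # smoothed positive-class counts
    q = {t: alpha + neg[t] for t in vocab}   # smoothed negative-class counts
    p_norm, q_norm = sum(p.values()), sum(q.values())
    return {t: math.log((p[t] / p_norm) / (q[t] / q_norm)) for t in vocab}
```

Tokens that occur mostly in positive documents get a positive ratio and tokens that occur mostly in negative documents get a negative one. A document representation can then be formed by scaling each token's count (or presence indicator) by its ratio before passing the vector to a linear classifier.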

LG Mar 13, 2018
Recurrent Neural Network Attention Mechanisms for Interpretable System Log Anomaly Detection

Andy Brown, Aaron Tuor, Brian Hutchinson et al.

Deep learning has recently demonstrated state-of-the-art performance on key tasks related to the maintenance of computer systems, such as intrusion detection, denial of service attack detection, hardware and software system failures, and malware detection. In these contexts, model interpretability is vital for administrators and analysts to trust and act on the automated analysis of machine learning models. Deep learning methods have been criticized as black box oracles which allow limited insight into decision factors. In this work we seek to "bridge the gap" between the impressive performance of deep learning models and the need for interpretable model introspection. To this end we present recurrent neural network (RNN) language models augmented with attention for anomaly detection in system logs. Our methods are generally applicable to any computer system and logging source. By incorporating attention variants into our RNN language models we create opportunities for model introspection and analysis without sacrificing state-of-the-art performance. We demonstrate model performance and illustrate model interpretability on an intrusion detection task using the Los Alamos National Laboratory (LANL) cyber security dataset, reporting upward of 0.99 area under the receiver operator characteristic curve despite being trained only on a single day's worth of data.

NE Dec 2, 2017
Recurrent Neural Network Language Models for Open Vocabulary Event-Level Cyber Anomaly Detection

Aaron Tuor, Ryan Baerwolf, Nicolas Knowles et al.

Automated analysis methods are crucial aids for monitoring and defending a network to protect the sensitive or confidential data it hosts. This work introduces a flexible, powerful, and unsupervised approach to detecting anomalous behavior in computer and network logs, one that largely eliminates domain-dependent feature engineering employed by existing methods. By treating system logs as threads of interleaved "sentences" (event log lines) to train online unsupervised neural network language models, our approach provides an adaptive model of normal network behavior. We compare the effectiveness of both standard and bidirectional recurrent neural network language models at detecting malicious activity within network log data. Extending these models, we introduce a tiered recurrent architecture, which provides context by modeling sequences of users' actions over time. Compared to Isolation Forest and Principal Components Analysis, two popular anomaly detection algorithms, we observe superior performance on the Los Alamos National Laboratory Cyber Security dataset. For log-line-level red team detection, our best performing character-based model provides test set area under the receiver operator characteristic curve of 0.98, demonstrating the strong fine-grained anomaly detection performance of this approach on open vocabulary logging sources.

NE Oct 2, 2017
Deep Learning for Unsupervised Insider Threat Detection in Structured Cybersecurity Data Streams

Aaron Tuor, Samuel Kaplan, Brian Hutchinson et al.

Analysis of an organization's computer network activity is a key component of early detection and mitigation of insider threat, a growing concern for many organizations. Raw system logs are a prototypical example of streaming data that can quickly scale beyond the cognitive power of a human analyst. As a prospective filter for the human analyst, we present an online unsupervised deep learning approach to detect anomalous network activity from system logs in real time. Our models decompose anomaly scores into the contributions of individual user behavior features for increased interpretability to aid analysts reviewing potential cases of insider threat. Using the CERT Insider Threat Dataset v6.2 and threat detection recall as our performance metric, our novel deep and recurrent neural network models outperform Principal Component Analysis, Support Vector Machine and Isolation Forest based anomaly detection baselines. For our best model, the events labeled as insider threat activity in our dataset had an average anomaly score in the 95.53rd percentile, demonstrating our approach's potential to greatly reduce analyst workloads.