Brandon Paulsen

h-index7

7papers

114citations

Novelty66%

AI Score45

Ranked #42,136 of 194,257 authors (top 22%)#9,802 in LG (top 24%)

7 Papers

8.0SEOct 23, 2025

Automated Cloud Infrastructure-as-Code Reconciliation with AI Agents

Zhenning Yang, Hui Guan, Victor Nicolet et al.

Cloud infrastructure is managed through a mix of interfaces -- traditionally, cloud consoles, command-line interfaces (CLI), and SDKs are the tools of choice. Recently, Infrastructure-as-Code/IaC frameworks (e.g., Terraform) have quickly gained popularity. Unlike conventional tools, IaC~frameworks encode the infrastructure in a "source-of-truth" configuration. They are capable of automatically carrying out modifications to the cloud -- deploying, updating, or destroying resources -- to bring the actual infrastructure into alignment with the IaC configuration. However, when IaC is used alongside consoles, CLIs, or SDKs, it loses visibility into external changes, causing infrastructure drift, where the configuration becomes outdated, and later IaC operations may undo valid updates or trigger errors. We present NSync, an automated system for IaC reconciliation that propagates out-of-band changes back into the IaC program. Our key insight is that infrastructure changes eventually all occur via cloud API invocations -- the lowest layer for cloud management operations. NSync gleans insights from API traces to detect drift (i.e., non-IaC changes) and reconcile it (i.e., update the IaC configuration to capture the changes). It employs an agentic architecture that leverages LLMs to infer high-level intents from noisy API sequences, synthesize targeted IaC updates using specialized tools, and continually improve through a self-evolving knowledge base of past reconciliations. We further introduce a novel evaluation pipeline for injecting realistic drifts into cloud infrastructure and assessing reconciliation performance. Experiments across five real-world Terraform projects and 372 drift scenarios show that NSync outperforms the baseline both in terms of accuracy (from 0.71 to 0.97 pass@3) and token efficiency (1.47$\times$ improvement).

11.3SESep 19, 2025

MatchFixAgent: Language-Agnostic Autonomous Repository-Level Code Translation Validation and Repair

Ali Reza Ibrahimzada, Brandon Paulsen, Reyhaneh Jabbarvand et al.

Code translation transforms source code from one programming language (PL) to another. Validating the functional equivalence of translation and repairing, if necessary, are critical steps in code translation. Existing automated validation and repair approaches struggle to generalize to many PLs due to high engineering overhead, and they rely on existing and often inadequate test suites, which results in false claims of equivalence and ineffective translation repair. We develop MatchFixAgent, a large language model (LLM)-based, PL-agnostic framework for equivalence validation and repair of translations. MatchFixAgent features a multi-agent architecture that divides equivalence validation into several sub-tasks to ensure thorough and consistent semantic analysis of the translation. Then it feeds this analysis to test agent to write and execute tests. Upon observing a test failure, the repair agent attempts to fix the translation bug. The final (in)equivalence decision is made by the verdict agent, considering semantic analyses and test execution results. We compare MatchFixAgent's validation and repair results with four repository-level code translation techniques. We use 2,219 translation pairs from their artifacts, which cover 6 PL pairs, and are collected from 24 GitHub projects totaling over 900K lines of code. Our results demonstrate that MatchFixAgent produces (in)equivalence verdicts for 99.2% of translation pairs, with the same equivalence validation result as prior work on 72.8% of them. When MatchFixAgent's result disagrees with prior work, we find that 60.7% of the time MatchFixAgent's result is actually correct. In addition, we show that MatchFixAgent can repair 50.6% of inequivalent translation, compared to prior work's 18.5%. This demonstrates that MatchFixAgent is far more adaptable to many PL pairs than prior work, while producing highly accurate validation results.

9.6LGJan 31, 2022

LinSyn: Synthesizing Tight Linear Bounds for Arbitrary Neural Network Activation Functions

Brandon Paulsen, Chao Wang

The most scalable approaches to certifying neural network robustness depend on computing sound linear lower and upper bounds for the network's activation functions. Current approaches are limited in that the linear bounds must be handcrafted by an expert, and can be sub-optimal, especially when the network's architecture composes operations using, for example, multiplication such as in LSTMs and the recently popular Swish activation. The dependence on an expert prevents the application of robustness certification to developments in the state-of-the-art of activation functions, and furthermore the lack of tightness guarantees may give a false sense of insecurity about a particular model. To the best of our knowledge, we are the first to consider the problem of automatically computing tight linear bounds for arbitrary n-dimensional activation functions. We propose LinSyn, the first approach that achieves tight bounds for any arbitrary activation function, while only leveraging the mathematical definition of the activation function itself. Our approach leverages an efficient heuristic approach to synthesize bounds that are tight and usually sound, and then verifies the soundness (and adjusts the bounds if necessary) using the highly optimized branch-and-bound SMT solver, dReal. Even though our approach depends on an SMT solver, we show that the runtime is reasonable in practice, and, compared with state of the art, our approach often achieves 2-5X tighter final output bounds and more than quadruple certified robustness.

14.3LGSep 21, 2020

NeuroDiff: Scalable Differential Verification of Neural Networks using Fine-Grained Approximation

Brandon Paulsen, Jingbo Wang, Jiawei Wang et al.

As neural networks make their way into safety-critical systems, where misbehavior can lead to catastrophes, there is a growing interest in certifying the equivalence of two structurally similar neural networks. For example, compression techniques are often used in practice for deploying trained neural networks on computationally- and energy-constrained devices, which raises the question of how faithfully the compressed network mimics the original network. Unfortunately, existing methods either focus on verifying a single network or rely on loose approximations to prove the equivalence of two networks. Due to overly conservative approximation, differential verification lacks scalability in terms of both accuracy and computational cost. To overcome these problems, we propose NeuroDiff, a symbolic and fine-grained approximation technique that drastically increases the accuracy of differential verification while achieving many orders-of-magnitude speedup. NeuroDiff has two key contributions. The first one is new convex approximations that more accurately bound the difference neurons of two networks under all possible inputs. The second one is judicious use of symbolic variables to represent neurons whose difference bounds have accumulated significant error. We also find that these two techniques are complementary, i.e., when combined, the benefit is greater than the sum of their individual benefits. We have evaluated NeuroDiff on a variety of differential verification tasks. Our results show that NeuroDiff is up to 1000X faster and 5X more accurate than the state-of-the-art tool.

7.9LGJul 20, 2020

DiffRNN: Differential Verification of Recurrent Neural Networks

Sara Mohammadinejad, Brandon Paulsen, Chao Wang et al.

Recurrent neural networks (RNNs) such as Long Short Term Memory (LSTM) networks have become popular in a variety of applications such as image processing, data classification, speech recognition, and as controllers in autonomous systems. In practical settings, there is often a need to deploy such RNNs on resource-constrained platforms such as mobile phones or embedded devices. As the memory footprint and energy consumption of such components become a bottleneck, there is interest in compressing and optimizing such networks using a range of heuristic techniques. However, these techniques do not guarantee the safety of the optimized network, e.g., against adversarial inputs, or equivalence of the optimized and original networks. To address this problem, we propose DIFFRNN, the first differential verification method for RNNs to certify the equivalence of two structurally similar neural networks. Existing work on differential verification for ReLUbased feed-forward neural networks does not apply to RNNs where nonlinear activation functions such as Sigmoid and Tanh cannot be avoided. RNNs also pose unique challenges such as handling sequential inputs, complex feedback structures, and interactions between the gates and states. In DIFFRNN, we overcome these challenges by bounding nonlinear activation functions with linear constraints and then solving constrained optimization problems to compute tight bounding boxes on nonlinear surfaces in a high-dimensional space. The soundness of these bounding boxes is then proved using the dReal SMT solver. We demonstrate the practical efficacy of our technique on a variety of benchmarks and show that DIFFRNN outperforms state-of-the-art RNN verification tools such as POPQORN.

4.9CRSep 12, 2019

Debreach: Mitigating Compression Side Channels via Static Analysis and Transformation

Brandon Paulsen, Chungha Sung, Peter A. H. Peterson et al.

Compression is an emerging source of exploitable side-channel leakage that threatens data security, particularly in web applications where compression is indispensable for performance reasons. Current approaches to mitigating compression side channels have drawbacks in that they either degrade compression ratio drastically or require too much effort from developers to be widely adopted. To bridge the gap, we develop Debreach, a static analysis and program transformation based approach to mitigating compression side channels. Debreach consists of two steps. First, it uses taint analysis to soundly identify flows of sensitive data in the program and uses code instrumentation to annotate data before feeding them to the compressor. Second, it enhances the compressor to exploit the freedom to not compress of standard compression protocols, thus removing the dependency between sensitive data and the size of the compressor's output. Since Debreach automatically instruments applications and does not change the compression protocols, it has the advantage of being non-disruptive and compatible with existing systems. We have evaluated Debreach on a set of web server applications written in PHP. Our experiments show that, while ensuring leakage-freedom, Debreach can achieve significantly higher compression performance than state-of-the-art approaches.

12.9SEJul 9, 2018Code

CANAL: A Cache Timing Analysis Framework via LLVM Transformation

Chungha Sung, Brandon Paulsen, Chao Wang

A unified modeling framework for non-functional properties of a program is essential for research in software analysis and verification, since it reduces burdens on individual researchers to implement new approaches and compare existing approaches. We present CANAL, a framework that models the cache behaviors of a program by transforming its intermediate representation in the LLVM compiler. CANAL inserts auxiliary variables and instructions over these variables, to allow standard verification tools to handle a new class of cache related properties, e.g., for computing the worst-case execution time and detecting side-channel leaks. We demonstrate the effectiveness of CANAL using three verification tools: KLEE, SMACK and Crab-llvm. We confirm the accuracy of our cache model by comparing with CPU cycle-accurate simulation results of GEM5. CANAL is available on GitHub and YouTube.