Duncan Eddy

SY
h-index23
11papers
40citations
Novelty52%
AI Score52

11 Papers

CLJul 12, 2024
ASTPrompter: Preference-Aligned Automated Language Model Red-Teaming to Generate Low-Perplexity Unsafe Prompts

Amelia F. Hardy, Houjun Liu, Allie Griffith et al.

Existing LLM red-teaming approaches prioritize high attack success rate, often resulting in high-perplexity prompts. This focus overlooks low-perplexity attacks that are more difficult to filter, more likely to arise during benign usage, and more impactful as negative downstream training examples. In response, we introduce ASTPrompter, a single-step optimization method that uses contrastive preference learning to train an attacker to maintain low perplexity while achieving a high attack success rate (ASR). ASTPrompter achieves an attack success rate 5.1 times higher on Llama-8.1B while using inputs that are 2.1 times more likely to occur according to the frozen LLM. Furthermore, our attack transfers to Mistral-7B, Qwen-7B, and TinyLlama in both black- and white-box settings. Lastly, by tuning a single hyperparameter in our method, we discover successful attack prefixes along an efficient frontier between ASR and perplexity, highlighting perplexity as a previously under-considered factor in red-teaming.

83.4SYMar 16
Optimizing Task Completion Time Updates Using POMDPs

Duncan Eddy, Esen Yel, Emma Passmore et al.

Managing announced task completion times is a fundamental control problem in project management. While extensive research exists on estimating task durations and task scheduling, the problem of when and how to update completion times communicated to stakeholders remains understudied. Organizations must balance announcement accuracy against the costs of frequent timeline updates, which can erode stakeholder trust and trigger costly replanning. Despite the prevalence of this problem, current approaches rely on static predictions or ad-hoc policies that fail to account for the sequential nature of announcement management. In this paper, we formulate the task announcement problem as a Partially Observable Markov Decision Process (POMDP) where the control policy must decide when to update announced completion times based on noisy observations of true task completion. Since most state variables (current time and previous announcements) are fully observable, we leverage the Mixed Observability MDP (MOMDP) framework to enable more efficient policy optimization. Our reward structure captures the dual costs of announcement errors and update frequency, enabling synthesis of optimal announcement control policies. Using off-the-shelf solvers, we generate policies that act as feedback controllers, adaptively managing announcements based on belief state evolution. Simulation results demonstrate significant improvements in both accuracy and announcement stability compared to baseline strategies, achieving up to 75\% reduction in unnecessary updates while maintaining or improving prediction accuracy.

AIJan 26
Expert Evaluation and the Limits of Human Feedback in Mental Health AI Safety Testing

Kiana Jafari, Paul Ulrich Nikolaus Rust, Duncan Eddy et al.

Learning from human feedback~(LHF) assumes that expert judgments, appropriately aggregated, yield valid ground truth for training and evaluating AI systems. We tested this assumption in mental health, where high safety stakes make expert consensus essential. Three certified psychiatrists independently evaluated LLM-generated responses using a calibrated rubric. Despite similar training and shared instructions, inter-rater reliability was consistently poor ($ICC$ $0.087$--$0.295$), falling below thresholds considered acceptable for consequential assessment. Disagreement was highest on the most safety-critical items. Suicide and self-harm responses produced greater divergence than any other category, and was systematic rather than random. One factor yielded negative reliability (Krippendorff's $α= -0.203$), indicating structured disagreement worse than chance. Qualitative interviews revealed that disagreement reflects coherent but incompatible individual clinical frameworks, safety-first, engagement-centered, and culturally-informed orientations, rather than measurement error. By demonstrating that experts rely on holistic risk heuristics rather than granular factor discrimination, these findings suggest that aggregated labels function as arithmetic compromises that effectively erase grounded professional philosophies. Our results characterize expert disagreement in safety-critical AI as a sociotechnical phenomenon where professional experience introduces sophisticated layers of principled divergence. We discuss implications for reward modeling, safety classification, and evaluation benchmarks, recommending that practitioners shift from consensus-based aggregation to alignment methods that preserve and learn from expert disagreement.

84.6SYMay 9
Fault-Aware MPC for Robotic Fleet Communications Scheduling

Carlo Schreiber, Duncan Eddy, Mykel J. Kochenderfer

Operating a fleet of remote robotic systems with intermittent communications requires scheduling limited contact opportunities to maintain fleet health awareness, complete mission objectives, and intervene on faulted assets before their permanent loss. This scheduling problem is complicated by observational ambiguity: when an asset fails to check in, the operator cannot distinguish between a lethal hardware fault and a benign communications failure. If the system's failure modes are structured through a fault model, a scheduler can exploit mode-specific lethality, timing, and recoverability properties to prioritize correctly - but only if it can distinguish between modes that produce identical observations under standard actions. We present Interacting Multiple Model Model Predictive Control (IMM-MPC), a receding-horizon framework that maintains a probabilistic belief over discrete fault modes with time-inhomogeneous dynamics and optimizes a two-term objective coupling acquisition value with information gain. We characterize when observationally aliased fault modes can be disambiguated through scheduled actions and when aliasing is permanently unresolvable. Applied to satellite launch and early orbit communications scheduling, IMM-MPC recovers 59.8% of spacecraft experiencing lethal-faults versus 9.0% for binary-MPC and 2.0% for a bipartite graph-based formulation solved through matching. These results hold across 200 randomized trials, while maintaining identical acquisition of healthy satellites and near-identical solve times.

92.4CRMay 8
SecureForge: Finding and Preventing Vulnerabilities in LLM-Generated Code via Prompt Optimization

Houjun Liu, Lisa Einstein, John Yang et al.

LLM coding agents now generate code at an unprecedented scale, yet LLM-generated code introduces cybersecurity vulnerabilities into codebases without human involvement. Even when frontier models are explicitly asked to write secure production code with relevant weaknesses to avoid in context, we find that they still produce verifiable vulnerabilities on average 23% of the time across a corpus of 250 benign coding prompts. We introduce SecureForge, an automated pipeline that both audits security risks of frontier models and produces auditing-informed secure system prompts that reduce output security vulnerabilities while maintaining unit test performance. SecureForge first identifies benign prompts that produce statically detectable vulnerabilities, and then amplifies them into a large synthetic prompt corpus of diverse scenarios using a Markovian sampling technique to jointly maintain error rates and prompt diversity. This corpus is then used to iteratively optimize the system prompts to reduce output security vulnerabilities. On frontier models, SecureForge yields a statistically significant Pareto improvement in both unit test success and output security, with output vulnerabilities reduced by up to 48%. The resulting system prompts transfer zero-shot to in-the-wild coding agent prompts, without any exposure to real user prompt distributions during optimization.

LGDec 24, 2024
Beyond Gradient Averaging in Parallel Optimization: Improved Robustness through Gradient Agreement Filtering

Francois Chaubard, Duncan Eddy, Mykel J. Kochenderfer

We introduce Gradient Agreement Filtering (GAF) to improve on gradient averaging in distributed deep learning optimization. Traditional distributed data-parallel stochastic gradient descent involves averaging gradients of microbatches to calculate a macrobatch gradient that is then used to update model parameters. We find that gradients across microbatches are often orthogonal or negatively correlated, especially in late stages of training, which leads to memorization of the training set, reducing generalization. In this paper, we introduce a simple, computationally effective way to reduce gradient variance by computing the cosine distance between micro-gradients during training and filtering out conflicting updates prior to averaging. We improve validation accuracy with significantly smaller microbatch sizes. We also show this reduces memorizing noisy labels. We demonstrate the effectiveness of this technique on standard image classification benchmarks including CIFAR-100 and CIFAR-100N-Fine. We show this technique consistently outperforms validation accuracy, in some cases by up to 18.2\% compared to traditional training approaches while reducing the computation required nearly an order of magnitude because we can now rely on smaller microbatch sizes without destabilizing training.

ROMar 7
Foundational World Models Accurately Detect Bimanual Manipulator Failures

Isaac R. Ward, Michelle Ho, Houjun Liu et al.

Deploying visuomotor robots at scale is challenging due to the potential for anomalous failures to degrade performance, cause damage, or endanger human life. Bimanual manipulators are no exception; these robots have vast state spaces comprised of high-dimensional images and proprioceptive signals. Explicitly defining failure modes within such state spaces is infeasible. In this work, we overcome these challenges by training a probabilistic, history informed, world model within the compressed latent space of a pretrained vision foundation model (NVIDIA's Cosmos Tokenizer). The model outputs uncertainty estimates alongside its predictions that serve as non-conformity scores within a conformal prediction framework. We use these scores to develop a runtime monitor, correlating periods of high uncertainty with anomalous failures. To test these methods, we use the simulated Push-T environment and the Bimanual Cable Manipulation dataset, the latter of which we introduce in this work. This new dataset features trajectories with multiple synchronized camera views, proprioceptive signals, and annotated failures from a challenging data center maintenance task. We benchmark our methods against baselines from the anomaly detection and out-of-distribution detection literature, and show that our approach considerably outperforms statistical techniques. Furthermore, we show that our approach requires approximately one twentieth of the trainable parameters as the next-best learning-based approach, yet outperforms it by 3.8% in terms of failure detection rate, paving the way toward safely deploying manipulator robots in real-world environments where reliability is non-negotiable.

ROOct 9, 2025
Adaptive Science Operations in Deep Space Missions Using Offline Belief State Planning

Grace Ra Kim, Hailey Warner, Duncan Eddy et al.

Deep space missions face extreme communication delays and environmental uncertainty that prevent real-time ground operations. To support autonomous science operations in communication-constrained environments, we present a partially observable Markov decision process (POMDP) framework that adaptively sequences spacecraft science instruments. We integrate a Bayesian network into the POMDP observation space to manage the high-dimensional and uncertain measurements typical of astrobiology missions. This network compactly encodes dependencies among measurements and improves the interpretability and computational tractability of science data. Instrument operation policies are computed offline, allowing resource-aware plans to be generated and thoroughly validated prior to launch. We use the Enceladus Orbilander's proposed Life Detection Suite (LDS) as a case study, demonstrating how Bayesian network structure and reward shaping influence system performance. We compare our method against the mission's baseline Concept of Operations (ConOps), evaluating both misclassification rates and performance in off-nominal sample accumulation scenarios. Our approach reduces sample identification errors by nearly 40%

NIOct 3, 2025
Scalable Ground Station Selection for Large LEO Constellations

Grace Ra Kim, Duncan Eddy, Vedant Srinivas et al.

Effective ground station selection is critical for low Earth orbiting (LEO) satellite constellations to minimize operational costs, maximize data downlink volume, and reduce communication gaps between access windows. Traditional ground station selection typically begins by choosing from a fixed set of locations offered by Ground Station-as-a-Service (GSaaS) providers, which helps reduce the problem scope to optimizing locations over existing infrastructure. However, finding a globally optimal solution for stations using existing mixed-integer programming methods quickly becomes intractable at scale, especially when considering multiple providers and large satellite constellations. To address this issue, we introduce a scalable, hierarchical framework that decomposes the global selection problem into single-satellite, short time-window subproblems. Optimal station choices from each subproblem are clustered to identify consistently high-value locations across all decomposed cases. Cluster-level sets are then matched back to the closest GSaaS candidate sites to produce a globally feasible solution. This approach enables scalable coordination while maintaining near-optimal performance. We evaluate our method's performance on synthetic Walker-Star test cases (1-10 satellites, 1-10 stations), achieving solutions within 95% of the global IP optimum for all test cases. Real-world evaluations on Capella Space (5 satellites), ICEYE (40), and Planet's Flock (96) show that while exact IP solutions fail to scale, our framework continues to deliver high-quality site selections.

AIAug 15, 2020
A Maximum Independent Set Method for Scheduling Earth Observing Satellite Constellations

Duncan Eddy, Mykel J. Kochenderfer

Operating Earth observing satellites requires efficient planning methods that coordinate activities of multiple spacecraft. The satellite task planning problem entails selecting actions that best satisfy mission objectives for autonomous execution. Task scheduling is often performed by human operators assisted by heuristic or rule-based planning tools. This approach does not efficiently scale to multiple assets as heuristics frequently fail to properly coordinate actions of multiple vehicles over long horizons. Additionally, the problem becomes more difficult to solve for large constellations as the complexity of the problem scales exponentially in the number of requested observations and linearly in the number of spacecraft. It is expected that new commercial optical and radar imaging constellations will require automated planning methods to meet stated responsiveness and throughput objectives. This paper introduces a new approach for solving the satellite scheduling problem by generating an infeasibility-based graph representation of the problem and finding a maximal independent set of vertices for the graph. The approach is tested on a scenarios of up to 10,000 requested imaging locations for the Skysat constellation of optical satellites as well as simulated constellations of up to 24 satellites. Performance is compared with contemporary graph-traversal and mixed-integer linear programming approaches. Empirical results demonstrate improvements in both the solution time along with the number of scheduled collections beyond baseline methods. For large problems, the maximum independent set approach is able find a feasible schedule with 8% more collections in 75% less time.

SYMay 4, 2019
Satellite Image Tasking Under Orbit Prediction Uncertainty

Duncan Eddy, Mykel Kochenderfer

Small satellites have proven to be viable Earth observation platforms. These satellites operate in regimes of increased trajectory uncertainty where traditional planning approaches can lead to sub-optimal task plans, limiting science return. Previous formulations of the space mission planning problem decouple trajectory prediction and planning, which leads to task plans that are less robust to uncertainty. We present a Markov decision process formulation of the problem that accounts for uncertainties by incorporating a distribution of possible collection windows characterized through Monte Carlo simulation. An approximate solution technique yields tasking schedules with rewards comparable to the conventional methods while simultaneously reducing the variations caused by uncertainties and improving runtime.