Saurabh Amin

SY
h-index67
27papers
39citations
Novelty49%
AI Score52

27 Papers

SYMar 22, 2016
Sensor placement for fault location identification in water networks: A minimum test cover approach

Lina Sela Perelman, Waseem Abbas, Xenofon Koutsoukos et al.

This paper focuses on the optimal sensor placement problem for the identification of pipe failure locations in large-scale urban water systems. The problem involves selecting the minimum number of sensors such that every pipe failure can be uniquely localized. This problem can be viewed as a minimum test cover (MTC) problem, which is NP-hard. We consider two approaches to obtain approximate solutions to this problem. In the first approach, we transform the MTC problem to a minimum set cover (MSC) problem and use the greedy algorithm that exploits the submodularity property of the MSC problem to compute the solution to the MTC problem. In the second approach, we develop a new \textit{augmented greedy} algorithm for solving the MTC problem. This approach does not require the transformation of the MTC to MSC. Our augmented greedy algorithm provides in a significant computational improvement while guaranteeing the same approximation ratio as the first approach. We propose several metrics to evaluate the performance of the sensor placement designs. Finally, we present detailed computational experiments for a number of real water distribution networks.

GTJul 11, 2020
Network Inspection for Detecting Strategic Attacks

Mathieu Dahan, Lina Sela, Saurabh Amin

This article studies a problem of strategic network inspection, in which a defender (agency) is tasked with detecting the presence of multiple attacks in the network. An inspection strategy entails monitoring the network components, possibly in a randomized manner, using a given number of detectors. We formulate the network inspection problem $(\mathcal{P})$ as a large-scale bilevel optimization problem, in which the defender seeks to determine an inspection strategy with minimum number of detectors that ensures a target expected detection rate under worst-case attacks. We show that optimal solutions of $(\mathcal{P})$ can be obtained from the equilibria of a large-scale zero-sum game. Our equilibrium analysis involves both game-theoretic and combinatorial arguments, and leads to a computationally tractable approach to solve $(\mathcal{P})$. Firstly, we construct an approximate solution by utilizing solutions of minimum set cover (MSC) and maximum set packing (MSP) problems, and evaluate its detection performance. In fact, this construction generalizes some of the known results in network security games. Secondly, we leverage properties of the optimal detection rate to iteratively refine our MSC/MSP-based solution through a column generation procedure. Computational results on benchmark water networks demonstrate the scalability, performance, and operational feasibility of our approach. The results indicate that utilities can achieve a high level of protection in large-scale networks by strategically positioning a small number of detectors.

LGMar 16, 2023
Learning Spatio-Temporal Aggregations for Large-Scale Capacity Expansion Problems

Aron Brenner, Rahman Khorramfar, Saurabh Amin

Effective investment planning decisions are crucial to ensure cyber-physical infrastructures satisfy performance requirements over an extended time horizon. Computing these decisions often requires solving Capacity Expansion Problems (CEPs). In the context of regional-scale energy systems, these problems are prohibitively expensive to solve due to large network sizes, heterogeneous node characteristics, and a large number of operational periods. To maintain tractability, traditional approaches aggregate network nodes and/or select a set of representative time periods. Often, these reductions do not capture supply-demand variations that crucially impact CEP costs and constraints, leading to suboptimal decisions. Here, we propose a novel graph convolutional autoencoder approach for spatio-temporal aggregation of a generic CEP with heterogeneous nodes (CEPHN). Our architecture leverages graph pooling to identify nodes with similar characteristics and minimizes a multi-objective loss function. This loss function is tailored to induce desirable spatial and temporal aggregations with regard to tractability and optimality. In particular, the output of the graph pooling provides a spatial aggregation while clustering the low-dimensional encoded representations yields a temporal aggregation. We apply our approach to generation expansion planning of a coupled 88-node power and natural gas system in New England. The resulting aggregation leads to a simpler CEPHN with 6 nodes and a small set of representative days selected from one year. We evaluate aggregation outcomes over a range of hyperparameters governing the loss function and compare resulting upper bounds on the original problem with those obtained using benchmark methods. We show that our approach provides upper bounds that are 33% (resp. 10%) lower those than obtained from benchmark spatial (resp. temporal) aggregation approaches.

GTJan 26, 2016
Network Flow Routing under Strategic Link Disruptions

Mathieu Dahan, Saurabh Amin

This paper considers a 2-player strategic game for network routing under link disruptions. Player 1 (defender) routes flow through a network to maximize her value of effective flow while facing transportation costs. Player 2 (attacker) simultaneously disrupts one or more links to maximize her value of lost flow but also faces cost of disrupting links. This game is strategically equivalent to a zero-sum game. Linear programming duality and the max-flow min-cut theorem are applied to obtain properties that are satisfied in any mixed Nash equilibrium. In any equilibrium, both players achieve identical payoffs. While the defender's expected transportation cost decreases in attacker's marginal value of lost flow, the attacker's expected cost of attack increases in defender's marginal value of effective flow. Interestingly, the expected amount of effective flow decreases in both these parameters. These results can be viewed as a generalization of the classical max-flow with minimum transportation cost problem to adversarial environments.

SYJul 2, 2019
Throughput-Improving Control of Highways Facing Stochastic Perturbations

Li Jin, Alexander A. Kurzhanskiy, Saurabh Amin

In this paper, we study the problem of traffic management in highways facing stochastic perturbations. To model the macroscopic traffic flow under perturbations, we use cell-transmission model with Markovian capacities. The decision variables are: (i) the admissible flow through each on-ramp, and (ii) whether individual on-ramps are metered to prioritize the mainline traffic or not. The objective is to maximize the throughput while ensuring that on-ramp queues remain bounded on average. We develop a computational approach to solving this stability-constrained, throughput-maximization problem. We establish a mixed-integer linear program for throughput maximization and construct an algorithm that gives the optimal solution in particular settings. We illustrate the performance benefits of the proposed approach through a computational study on a segment of Interstate 210 in California, USA.

GTJan 21, 2019
Security Games in Network Flow Problems

Mathieu Dahan, Saurabh Amin

This article considers a two-player strategic game for network routing under link disruptions. Player 1 (defender) routes flow through a network to maximize her value of effective flow while facing transportation costs. Player 2 (attacker) simultaneously disrupts one or more links to maximize her value of lost flow but also faces cost of disrupting links. Linear programming duality in zero-sum games and the Max-Flow Min-Cut Theorem are applied to obtain properties that are satisfied in any Nash equilibrium. A characterization of the support of the equilibrium strategies is provided using graph-theoretic arguments. Finally, conditions under which these results extend to budget-constrained environments are also studied. These results extend the classical minimum cost maximum flow problem and the minimum cut problem to a class of security games on flow networks.

LGSep 24, 2022
Graph Representation Learning for Energy Demand Data: Application to Joint Energy System Planning under Emissions Constraints

Aron Brenner, Rahman Khorramfar, Dharik Mallapragada et al.

A rapid transformation of current electric power and natural gas (NG) infrastructure is imperative to meet the mid-century goal of CO2 emissions reduction requires. This necessitates a long-term planning of the joint power-NG system under representative demand and supply patterns, operational constraints, and policy considerations. Our work is motivated by the computational and practical challenges associated with solving the generation and transmission expansion problem (GTEP) for joint planning of power-NG systems. Specifically, we focus on efficiently extracting a set of representative days from power and NG data in respective networks and using this set to reduce the computational burden required to solve the GTEP. We propose a Graph Autoencoder for Multiple time resolution Energy Systems (GAMES) to capture the spatio-temporal demand patterns in interdependent networks and account for differences in the temporal resolution of available data. The resulting embeddings are used in a clustering algorithm to select representative days. We evaluate the effectiveness of our approach in solving a GTEP formulation calibrated for the joint power-NG system in New England. This formulation accounts for the physical interdependencies between power and NG systems, including the joint emissions constraint. Our results show that the set of representative days obtained from GAMES not only allows us to tractably solve the GTEP formulation, but also achieves a lower cost of implementing the joint planning decisions.

OCFeb 16, 2018
Stability of Fluid Queuing Systems with Parallel Servers and Stochastic Capacities

Li Jin, Saurabh Amin

This note introduces a piecewise-deterministic queueing (PDQ) model to study the stability of traffic queues in parallel-link transportation systems facing stochastic capacity fluctuations. The saturation rate (capacity) of the PDQ model switches between a finite set of modes according to a Markov chain, and link inflows are controlled by a state-feedback policy. A PDQ system is stable only if a lower bound on the time-average link inflows does not exceed the corresponding time-average saturation rate. Furthermore, a PDQ system is stable if the following two conditions hold: the nominal mode's saturation rate is high enough that all queues vanish in this mode, and a bilinear matrix inequality (BMI) involving an underestimate of the discharge rates of the PDQ in individual modes is feasible. The stability conditions can be strengthened for two-mode PDQs. These results can be used for design of routing policies that guarantee stability of traffic queues under stochastic capacity fluctuations.

LGFeb 14, 2023
Effective Dimension in Bandit Problems under Censorship

Gauthier Guinet, Saurabh Amin, Patrick Jaillet

In this paper, we study both multi-armed and contextual bandit problems in censored environments. Our goal is to estimate the performance loss due to censorship in the context of classical algorithms designed for uncensored environments. Our main contributions include the introduction of a broad class of censorship models and their analysis in terms of the effective dimension of the problem -- a natural measure of its underlying statistical complexity and main driver of the regret bound. In particular, the effective dimension allows us to maintain the structure of the original problem at first order, while embedding it in a bigger space, and thus naturally leads to results analogous to uncensored settings. Our analysis involves a continuous generalization of the Elliptical Potential Inequality, which we believe is of independent interest. We also discover an interesting property of decision-making under censorship: a transient phase during which initial misspecification of censorship is self-corrected at an extra cost, followed by a stationary phase that reflects the inherent slowdown of learning governed by the effective dimension. Our results are useful for applications of sequential decision-making models where the feedback received depends on strategic uncertainty (e.g., agents' willingness to follow a recommendation) and/or random uncertainty (e.g., loss or delay in arrival of information).

LGMar 27, 2022
Interpretable Machine Learning Models for Modal Split Prediction in Transportation Systems

Aron Brenner, Manxi Wu, Saurabh Amin

Modal split prediction in transportation networks has the potential to support network operators in managing traffic congestion and improving transit service reliability. We focus on the problem of hourly prediction of the fraction of travelers choosing one mode of transportation over another using high-dimensional travel time data. We use logistic regression as base model and employ various regularization techniques for variable selection to prevent overfitting and resolve multicollinearity issues. Importantly, we interpret the prediction accuracy results with respect to the inherent variability of modal splits and travelers' aggregate responsiveness to changes in travel time. By visualizing model parameters, we conclude that the subset of segments found important for predictive accuracy changes from hour-to-hour and include segments that are topologically central and/or highly congested. We apply our approach to the San Francisco Bay Area freeway and rapid transit network and demonstrate superior prediction accuracy and interpretability of our method compared to pre-specified variable selection methods.

SYApr 13
Strategic Spatial Load Shifting and Market Efficiency

Aron Brenner, Deepjyoti Deka, Line Roald et al.

Large, spatially flexible electricity consumers such as data centers can reallocate demand across locations, influencing dispatch and prices in wholesale electricity markets. While flexible load is often assumed to improve system efficiency, this intuition typically relies on price-taking behavior. We study price-anticipatory spatial load shifting by modeling a large flexible consumer as a Stackelberg leader interacting with DC optimal power flow (DC-OPF) based market clearing. We show that decentralized, cost-minimizing load shifting need not align with system operating cost minimization, and that misalignment arises at boundaries between DC-OPF operating regimes, where small changes in load can induce discrete changes in marginal generators or congestion patterns. We evaluate strategic load shifting on the 73-bus RTS-GMLC test system, where findings indicate reductions in system operating cost in most hours, but misalignment in a subset of cases that are driven by redispatch at merit-order discontinuities. We find that these outcomes are primarily redistributive relative to a price-taking benchmark, reducing generator profits while lowering electricity procurement costs for both flexible and inflexible consumers, even in cases where total system operating costs increase.

OCJun 30, 2023
Uncertainty Informed Optimal Resource Allocation with Gaussian Process based Bayesian Inference

Samarth Gupta, Saurabh Amin

We focus on the problem of uncertainty informed allocation of medical resources (vaccines) to heterogeneous populations for managing epidemic spread. We tackle two related questions: (1) For a compartmental ordinary differential equation (ODE) model of epidemic spread, how can we estimate and integrate parameter uncertainty into resource allocation decisions? (2) How can we computationally handle both nonlinear ODE constraints and parameter uncertainties for a generic stochastic optimization problem for resource allocation? To the best of our knowledge current literature does not fully resolve these questions. Here, we develop a data-driven approach to represent parameter uncertainty accurately and tractably in a novel stochastic optimization problem formulation. We first generate a tractable scenario set by estimating the distribution on ODE model parameters using Bayesian inference with Gaussian processes. Next, we develop a parallelized solution algorithm that accounts for scenario-dependent nonlinear ODE constraints. Our scenario-set generation procedure and solution approach are flexible in that they can handle any compartmental epidemiological ODE model. Our computational experiments on two different non-linear ODE models (SEIR and SEPIHR) indicate that accounting for uncertainty in key epidemiological parameters can improve the efficacy of time-critical allocation decisions by 4-8%. This improvement can be attributed to data-driven and optimal (strategic) nature of vaccine allocations, especially in the early stages of the epidemic when the allocation strategy can crucially impact the long-term trajectory of the disease.

SYApr 12
Optimization Under Uncertainty for Energy Infrastructure Planning: A Synthesis of Methods, Tools, and Open Challenges

Rahman Khorramfar, Aron Brenner, Lara Booth et al.

Energy infrastructure planning under uncertainty has become increasingly complex as electrification, interdependence between energy carriers, decarbonization, and extreme weather events reshape long-term investment decisions. This paper surveys recent advances at the intersection of generation and transmission expansion, and optimization under uncertainty, with a focus on stochastic programming, robust optimization, and distributionally robust optimization. We then categorize modeling needs along the axes of modeling fidelity, uncertainty characterization, and solution methods to identify dominant modeling features and trace research gaps. We further examine emerging directions at the interface of optimization and machine learning, including surrogate modeling, learning uncertainty sets, probabilistic forecasting, and synthetic scenarios, and discuss how these tools can be embedded within infrastructure planning models.

SYApr 19
Structural Misalignment in Financial Transmission Rights

Erich Trieschman, Saurabh Amin

Financial Transmission Rights (FTRs) enable electricity market participants to hedge congestion risk in Day Ahead Market (DAM) operations, but for the market to be solvent, Independent System Operators (ISOs) must ensure that FTR payouts do not exceed the collected DAM merchandising surplus that funds them. We show that FTR underfunding (or conversely, hedging efficiency) can arise structurally from misalignment between the network models used in the FTR auction and the DAM, independent of bidding behavior. We develop a geometric framework in which both DAM merchandising surplus and the maximum supportable FTR payout are expressed as support functions of network-feasible injection polytopes. The resulting dual representation assigns nonnegative weights to transmission element-contingency constraints, enabling constraint-level attribution of model misalignment. Using this framework, we derive sharp implications for canonical FTR network modeling choices like uniform transmission element derates, and for structural sources of underfunding like unplanned DAM outages. We further show that multi-interval FTR products impose an intrinsic hedging inefficiency when DAM shadow prices vary over time, even under perfect model alignment. These results provide ISOs with rigorous tools to diagnose underfunding and quantify the efficiency cost of conservative FTR network modeling choices.

SYApr 3
How Sensor Attacks Transfer Across Lie Groups

Rijad Alisic, Saurabh Amin

Sensor spoofing analysis in cyber-physical systems is predominantly confined to linear state spaces, where attack transferability is trivial. On Lie groups, however, the noncommutativity of the dynamics can distort certain sensor attacks, exposing nominally stealthy attacks during complex maneuvers. We present a geometric framework characterizing when a sensor attack can transfer across operating conditions, preserving both its physical impact and stealthiness. We prove that successful transfer requires the attack to commute with the nominal dynamics (a Lie bracket condition), which isolates transferable attacks to an invariant subspace, while attacks outside this subspace identifiably alter residuals. For small deviations from ideal transferable attacks, our decomposition theorem reveals a fundamental asymmetry: the flow's Adjoint action amplifies the physical impact of the bracket-violating component. Furthermore, although the attack perturbs the innovation linearly, the accumulated error drift undergoes distortion via the Adjoint action. Finally, we demonstrate how turning maneuvers on a Dubins unicycle collapse the transferable subspace to a single direction, verifying that imperfect attacks remain within theoretical detection bounds.

AIMay 11
Budget-Efficient Automatic Algorithm Design via Code Graph

Maxime Bouscary, Manxi Wu, Saurabh Amin

Large language models (LLMs) have emerged as powerful tools for automatic algorithm design (AAD). However, existing pipelines remain inefficient. They operate at the granularity of full algorithms, redundantly rewriting recurring substructures and discarding low-fitness candidates that may contain valuable algorithmic features. We formalize budget-efficient automatic algorithm design, wherein the search policy maximizes realized fitness subject to limited computational cost. We propose a directed acyclic graph representation of algorithms and build a search framework that fully exploits the LLM's output. Instead of querying the LLM for full algorithms, we use it to obtain corrections: compact operators that add, replace, or remove code blocks. Each correction augments the graph, yielding new algorithms that compose with prior corrections. This graph structure decomposes algorithms into sets of corrections, enabling correction-level credit assignment that informs subsequent queries. We complement this framework with theoretical insights into the ideal balance between search depth and breadth at different budget levels. We validate our method empirically on three combinatorial optimization problems, demonstrating consistent superiority of our graph-based search over full-algorithm search at equal token budget. Finally, our experiments suggest that rich contexts help only when the LLM's prior knowledge is shallow, and can hinder performance otherwise.

SYSep 5, 2024
A Deep Generative Learning Approach for Two-stage Adaptive Robust Optimization

Aron Brenner, Rahman Khorramfar, Jennifer Sun et al.

Two-stage adaptive robust optimization (ARO) is a powerful approach for planning under uncertainty, balancing first-stage decisions with recourse decisions made after uncertainty is realized. To account for uncertainty, modelers typically define a simple uncertainty set over which potential outcomes are considered. However, classical methods for defining these sets unintentionally capture a wide range of unrealistic outcomes, resulting in overly-conservative and costly planning in anticipation of unlikely contingencies. In this work, we introduce AGRO, a solution algorithm that performs adversarial generation for two-stage adaptive robust optimization using a variational autoencoder. AGRO generates high-dimensional contingencies that are simultaneously adversarial and realistic, improving the robustness of first-stage decisions at a lower planning cost than standard methods. To ensure generated contingencies lie in high-density regions of the uncertainty distribution, AGRO defines a tight uncertainty set as the image of "latent" uncertainty sets under the VAE decoding transformation. Projected gradient ascent is then used to maximize recourse costs over the latent uncertainty sets by leveraging differentiable optimization methods. We demonstrate the cost-efficiency of AGRO by applying it to both a synthetic production-distribution problem and a real-world power system expansion setting. We show that AGRO outperforms the standard column-and-constraint algorithm by up to 1.8% in production-distribution planning and up to 11.6% in power system expansion.

OCMar 19
Learning Decision-Sufficient Representations for Linear Optimization

Yuhan Ye, Saurabh Amin, Asuman Ozdaglar

We study how to construct compressed datasets that suffice to recover optimal decisions in linear programs with an unknown cost vector $c$ lying in a prior set $\mathcal{C}$. Recent work by Bennouna et al. provides an exact geometric characterization of sufficient decision datasets (SDDs) via an intrinsic decision-relevant dimension $d^\star$. However, their algorithm for constructing minimum-size SDDs requires solving mixed-integer programs. In this paper, we establish hardness results showing that computing $d^\star$ is NP-hard and deciding whether a dataset is globally sufficient is coNP-hard, thereby resolving a recent open problem posed by Bennouna et al. To address this worst-case intractability, we introduce pointwise sufficiency, a relaxation that requires sufficiency for an individual cost vector. Under nondegeneracy, we provide a polynomial-time cutting-plane algorithm for constructing pointwise-sufficient decision datasets. In a data-driven regime with i.i.d.\ costs, we further propose a cumulative algorithm that aggregates decision-relevant directions across samples, yielding a stable compression scheme of size at most $d^\star$. This leads to a distribution-free PAC guarantee: with high probability over the training sample, the pointwise sufficiency failure probability on a fresh draw is at most $\tilde{O}(d^\star/n)$, and this rate is tight up to logarithmic factors. Finally, we apply decision-sufficient representations to contextual linear optimization, obtaining compressed predictors with generalization bounds scaling as $\tilde{O}(\sqrt{d^\star/n})$ rather than $\tilde{O}(\sqrt{d/n})$, where $d$ is the ambient cost dimension.

MLOct 6, 2025
Modular and Adaptive Conformal Prediction for Sequential Models via Residual Decomposition

William Zhang, Saurabh Amin, Georgia Perakis

Conformal prediction offers finite-sample coverage guarantees under minimal assumptions. However, existing methods treat the entire modeling process as a black box, overlooking opportunities to exploit modular structure. We introduce a conformal prediction framework for two-stage sequential models, where an upstream predictor generates intermediate representations for a downstream model. By decomposing the overall prediction residual into stage-specific components, our method enables practitioners to attribute uncertainty to specific pipeline stages. We develop a risk-controlled parameter selection procedure using family-wise error rate (FWER) control to calibrate stage-wise scaling parameters, and propose an adaptive extension for non-stationary settings that preserves long-run coverage guarantees. Experiments on synthetic distribution shifts, as well as real-world supply chain and stock market data, demonstrate that our approach maintains coverage under conditions that degrade standard conformal methods, while providing interpretable stage-wise uncertainty attribution. This framework offers diagnostic advantages and robust coverage that standard conformal methods lack.

AIAug 4, 2025
OptiHive: Ensemble Selection for LLM-Based Optimization via Statistical Modeling

Maxime Bouscary, Saurabh Amin

LLM-based solvers have emerged as a promising means of automating problem modeling and solving. However, they remain unreliable and often depend on iterative repair loops that result in significant latency. We introduce OptiHive, a framework that enhances any solver-generation pipeline to produce higher-quality solvers from natural-language descriptions of optimization problems. OptiHive uses a single batched generation to produce diverse components (solvers, problem instances, and validation tests) and filters out erroneous components to ensure fully interpretable outputs. Accounting for the imperfection of the generated components, we employ a statistical model to infer their true performance, enabling principled uncertainty quantification and solver selection. On tasks ranging from traditional optimization problems to challenging variants of the Multi-Depot Vehicle Routing Problem, OptiHive significantly outperforms baselines, increasing the optimality rate from 5% to 92% on the most complex problems.

OCMay 27, 2025
What Data Enables Optimal Decisions? An Exact Characterization for Linear Optimization

Omar Bennouna, Amine Bennouna, Saurabh Amin et al.

We study the fundamental question of how informative a dataset is for solving a given decision-making task. In our setting, the dataset provides partial information about unknown parameters that influence task outcomes. Focusing on linear programs, we characterize when a dataset is sufficient to recover an optimal decision, given an uncertainty set on the cost vector. Our main contribution is a sharp geometric characterization that identifies the directions of the cost vector that matter for optimality, relative to the task constraints and uncertainty set. We further develop a practical algorithm that, for a given task, constructs a minimal or least-costly sufficient dataset. Our results reveal that small, well-chosen datasets can often fully determine optimal decisions -- offering a principled foundation for task-aware data selection.

SYJan 19, 2024
Learning-assisted Stochastic Capacity Expansion Planning: A Bayesian Optimization Approach

Aron Brenner, Rahman Khorramfar, Dharik Mallapragada et al.

Solving large-scale capacity expansion problems (CEPs) is central to cost-effective decarbonization of regional-scale energy systems. To ensure the intended outcomes of CEPs, modeling uncertainty due to weather-dependent variable renewable energy (VRE) supply and energy demand becomes crucially important. However, the resulting stochastic optimization models are often less computationally tractable than their deterministic counterparts. Here, we propose a learning-assisted approximate solution method to tractably solve two-stage stochastic CEPs. Our method identifies low-cost planning decisions by constructing and solving a sequence of tractable temporally aggregated surrogate problems. We adopt a Bayesian optimization approach to searching the space of time series aggregation hyperparameters and compute approximate solutions that minimize costs on a validation set of supply-demand projections. Importantly, we evaluate solved planning outcomes on a held-out set of test projections. We apply our approach to generation and transmission expansion planning for a joint power-gas system spanning New England. We show that our approach yields an estimated cost savings of up to 3.8% in comparison to benchmark time series aggregation approaches.

IVNov 5, 2021
Damage Estimation and Localization from Sparse Aerial Imagery

Rene Garcia Franceschini, Jeffrey Liu, Saurabh Amin

Aerial images provide important situational awareness for responding to natural disasters such as hurricanes. They are well-suited for providing information for damage estimation and localization (DEL); i.e., characterizing the type and spatial extent of damage following a disaster. Despite recent advances in sensing and unmanned aerial systems technology, much of post-disaster aerial imagery is still taken by handheld DSLR cameras from small, manned, fixed-wing aircraft. However, these handheld cameras lack IMU information, and images are taken opportunistically post-event by operators. As such, DEL from such imagery is still a highly manual and time-consuming process. We propose an approach to both detect damage in aerial images and localize it in world coordinates, with specific focus on detecting and localizing flooding. The approach is based on using structure from motion to relate image coordinates to world coordinates via a projective transformation, using class activation mapping to detect the extent of damage in an image, and applying the projective transformation to localize damage in world coordinates. We evaluate the performance of our approach on post-event data from the 2016 Louisiana floods, and find that our approach achieves a precision of 88%. Given this high precision using limited data, we argue that this approach is currently viable for fast and effective DEL from handheld aerial imagery for disaster response.

LGOct 30, 2020
Integer Programming-based Error-Correcting Output Code Design for Robust Classification

Samarth Gupta, Saurabh Amin

Error-Correcting Output Codes (ECOCs) offer a principled approach for combining simple binary classifiers into multiclass classifiers. In this paper, we investigate the problem of designing optimal ECOCs to achieve both nominal and adversarial accuracy using Support Vector Machines (SVMs) and binary deep learning models. In contrast to previous literature, we present an Integer Programming (IP) formulation to design minimal codebooks with desirable error correcting properties. Our work leverages the advances in IP solvers to generate codebooks with optimality guarantees. To achieve tractability, we exploit the underlying graph-theoretic structure of the constraint set in our IP formulation. This enables us to use edge clique covers to substantially reduce the constraint set. Our codebooks achieve a high nominal accuracy relative to standard codebooks (e.g., one-vs-all, one-vs-one, and dense/sparse codes). We also estimate the adversarial accuracy of our ECOC-based classifiers in a white-box setting. Our IP-generated codebooks provide non-trivial robustness to adversarial perturbations even without any adversarial training.

SYOct 18, 2020
Multi-agent Bayesian Learning with Adaptive Strategies: Convergence and Stability

Manxi Wu, Saurabh Amin, Asuman Ozdaglar

We study learning dynamics induced by strategic agents who repeatedly play a game with an unknown payoff-relevant parameter. In each step, an information system estimates a belief distribution of the parameter based on the players' strategies and realized payoffs using Bayes' rule. Players adjust their strategies by accounting for an equilibrium strategy or a best response strategy based on the updated belief. We prove that beliefs and strategies converge to a fixed point with probability 1. We also provide conditions that guarantee local and global stability of fixed points. Any fixed point belief consistently estimates the payoff distribution given the fixed point strategy profile. However, convergence to a complete information Nash equilibrium is not always guaranteed. We provide a sufficient and necessary condition under which fixed point belief recovers the unknown parameter. We also provide a sufficient condition for convergence to complete information equilibrium even when parameter learning is incomplete.

CVMay 17, 2019
Semantic Analysis of Traffic Camera Data: Topic Signal Extraction and Anomalous Event Detection

Jeffrey Liu, Andrew Weinert, Saurabh Amin

Traffic Management Centers (TMCs) routinely use traffic cameras to provide situational awareness regarding traffic, road, and weather conditions. Camera footage is quite useful for a variety of diagnostic purposes; yet, most footage is kept for only a few days, if at all. This is largely due to the fact that currently, identification of notable footage is done via manual review by human operators---a laborious and inefficient process. In this article, we propose a semantics-oriented approach to analyzing sequential image data, and demonstrate its application for automatic detection of real-world, anomalous events in weather and traffic conditions. Our approach constructs semantic vector representations of image contents from textual labels which can be easily obtained from off-the-shelf, pretrained image labeling software. These semantic label vectors are used to construct semantic topic signals---time series representations of physical processes---using the Latent Dirichlet Allocation (LDA) topic model. By detecting anomalies in the topic signals, we identify notable footage corresponding to winter storms and anomalous traffic congestion. In validation against real-world events, anomaly detection using semantic topic signals significantly outperforms detection using any individual label signal.

CVSep 27, 2018
Semantic Topic Analysis of Traffic Camera Images

Jeffrey Liu, Andrew Weinert, Saurabh Amin

Traffic cameras are commonly deployed monitoring components in road infrastructure networks, providing operators visual information about conditions at critical points in the network. However, human observers are often limited in their ability to process simultaneous information sources. Recent advancements in computer vision, driven by deep learning methods, have enabled general object recognition, unlocking opportunities for camera-based sensing beyond the existing human observer paradigm. In this paper, we present a Natural Language Processing (NLP)-inspired approach, entitled Bag-of-Label-Words (BoLW), for analyzing image data sets using exclusively textual labels. The BoLW model represents the data in a conventional matrix form, enabling data compression and decomposition techniques, while preserving semantic interpretability. We apply the Latent Dirichlet Allocation (LDA) topic model to decompose the label data into a small number of semantic topics. To illustrate our approach, we use freeway camera images collected from the Boston area between December 2017-January 2018. We analyze the cameras' sensitivity to weather events; identify temporal traffic patterns; and analyze the impact of infrequent events, such as the winter holidays and the "bomb cyclone" winter storm. This study demonstrates the flexibility of our approach, which allows us to analyze weather events and freeway traffic using only traffic camera image labels.