MLOct 30, 2022
Distributionally Robust Domain AdaptationAkram S. Awad, George K. Atia
Domain Adaptation (DA) has recently received significant attention due to its potential to adapt a learning model across source and target domains with mismatched distributions. Since DA methods rely exclusively on the given source and target domain samples, they generally yield models that are vulnerable to noise and unable to adapt to unseen samples from the target domain, which calls for DA methods that guarantee the robustness and generalization of the learned models. In this paper, we propose DRDA, a distributionally robust domain adaptation method. DRDA leverages a distributionally robust optimization (DRO) framework to learn a robust decision function that minimizes the worst-case target domain risk and generalizes to any sample from the target domain by transferring knowledge from a given labeled source domain sample. We utilize the Maximum Mean Discrepancy (MMD) metric to construct an ambiguity set of distributions that provably contains the source and target domain distributions with high probability. Hence, the risk is shown to upper bound the out-of-sample target domain loss. Our experimental results demonstrate that our formulation outperforms existing robust learning approaches.
LGOct 29, 2023
Automaton Distillation: Neuro-Symbolic Transfer Learning for Deep Reinforcement LearningSuraj Singireddy, Precious Nwaorgu, Andre Beckus et al.
Reinforcement learning (RL) is a powerful tool for finding optimal policies in sequential decision processes. However, deep RL methods have two weaknesses: collecting the amount of agent experience required for practical RL problems is prohibitively expensive, and the learned policies exhibit poor generalization on tasks outside the training data distribution. To mitigate these issues, we introduce automaton distillation, a form of neuro-symbolic transfer learning in which Q-value estimates from a teacher are distilled into a low-dimensional representation in the form of an automaton. We then propose methods for generating Q-value estimates where symbolic information is extracted from a teacher's Deep Q-Network (DQN). The resulting Q-value estimates are used to bootstrap learning in the target discrete and continuous environment via a modified DQN and Twin-Delayed Deep Deterministic (TD3) loss function, respectively. We demonstrate that automaton distillation decreases the time required to find optimal policies for various decision tasks in new environments, even in a target environment different in structure from the source environment.
LGMar 15, 2022
A Differentiable Approach to Combinatorial Optimization using Dataless Neural NetworksIsmail R. Alkhouri, George K. Atia, Alvaro Velasquez
The success of machine learning solutions for reasoning about discrete structures has brought attention to its adoption within combinatorial optimization algorithms. Such approaches generally rely on supervised learning by leveraging datasets of the combinatorial structures of interest drawn from some distribution of problem instances. Reinforcement learning has also been employed to find such structures. In this paper, we propose a radically different approach in that no data is required for training the neural networks that produce the solution. In particular, we reduce the combinatorial optimization problem to a neural network and employ a dataless training scheme to refine the parameters of the network such that those parameters yield the structure of interest. We consider the combinatorial optimization problems of finding maximum independent sets and maximum cliques in a graph. In principle, since these problems belong to the NP-hard complexity class, our proposed approach can be used to solve any other NP-hard problem. Additionally, we propose a universal graph reduction procedure to handle large scale graphs. The reduction exploits community detection for graph partitioning and is applicable to any graph type and/or density. Experimental evaluation on both synthetic graphs and real-world benchmarks demonstrates that our method performs on par with or outperforms state-of-the-art heuristic, reinforcement learning, and machine learning based methods without requiring any data.
LGDec 1, 2025
Generative Modeling with Continuous Flows: Sample Complexity of Flow MatchingMudit Gaur, Prashant Trivedi, Shuchin Aeron et al.
Flow matching has recently emerged as a promising alternative to diffusion-based generative models, offering faster sampling and simpler training by learning continuous flows governed by ordinary differential equations. Despite growing empirical success, the theoretical understanding of flow matching remains limited, particularly in terms of sample complexity results. In this work, we provide the first analysis of the sample complexity for flow-matching based generative models without assuming access to the empirical risk minimizer (ERM) of the loss function for estimating the velocity field. Under standard assumptions on the loss function for velocity field estimation and boundedness of the data distribution, we show that a sufficiently expressive neural network can learn a velocity field such that with $\mathcal{O}(ε^{-4})$ samples, such that the Wasserstein-2 distance between the learned and the true distribution is less than $\mathcal{O}(ε)$. The key technical idea is to decompose the velocity field estimation error into neural-network approximation error, statistical error due to the finite sample size, and optimization error due to the finite number of optimization steps for estimating the velocity field. Each of these terms are then handled via techniques that may be of independent interest.
LGDec 22, 2025
Scaling Online Distributionally Robust Reinforcement Learning: Sample-Efficient Guarantees with General Function ApproximationDebamita Ghosh, George K. Atia, Yue Wang
The deployment of reinforcement learning (RL) agents in real-world applications is often hindered by performance degradation caused by mismatches between training and deployment environments. Distributionally robust RL (DR-RL) addresses this issue by optimizing worst-case performance over an uncertainty set of transition dynamics. However, existing work typically relies on substantial prior knowledge-such as access to a generative model or a large offline dataset-and largely focuses on tabular methods that do not scale to complex domains. We overcome these limitations by proposing an online DR-RL algorithm with general function approximation that learns an optimal robust policy purely through interaction with the environment, without requiring prior models or offline data, enabling deployment in high-dimensional tasks. We further provide a theoretical analysis establishing a near-optimal sublinear regret bound under a total variation uncertainty set, demonstrating the sample efficiency and effectiveness of our method.
LGJan 7, 2025
Align-Pro: A Principled Approach to Prompt Optimization for LLM AlignmentPrashant Trivedi, Souradip Chakraborty, Avinash Reddy et al.
The alignment of large language models (LLMs) with human values is critical as these models become increasingly integrated into various societal and decision-making processes. Traditional methods, such as reinforcement learning from human feedback (RLHF), achieve alignment by fine-tuning model parameters, but these approaches are often computationally expensive and impractical when models are frozen or inaccessible for parameter modification. In contrast, prompt optimization is a viable alternative to RLHF for LLM alignment. While the existing literature has shown empirical promise of prompt optimization, its theoretical underpinning remains under-explored. We address this gap by formulating prompt optimization as an optimization problem and try to provide theoretical insights into the optimality of such a framework. To analyze the performance of the prompt optimization, we study theoretical suboptimality bounds and provide insights in terms of how prompt optimization depends upon the given prompter and target model. We also provide empirical validation through experiments on various datasets, demonstrating that prompt optimization can effectively align LLMs, even when parameter fine-tuning is not feasible.
LGAug 5, 2025
ORVIT: Near-Optimal Online Distributionally Robust Reinforcement LearningDebamita Ghosh, George K. Atia, Yue Wang
We investigate reinforcement learning (RL) in the presence of distributional mismatch between training and deployment, where policies trained in simulators often underperform in practice due to mismatches between training and deployment conditions, and thereby reliable guarantees on real-world performance are essential. Distributionally robust RL addresses this issue by optimizing worst-case performance over an uncertainty set of environments and providing an optimized lower bound on deployment performance. However, existing studies typically assume access to either a generative model or offline datasets with broad coverage of the deployment environment-assumptions that limit their practicality in unknown environments without prior knowledge. In this work, we study a more practical and challenging setting: online distributionally robust RL, where the agent interacts only with a single unknown training environment while seeking policies that are robust with respect to an uncertainty set around this nominal model. We consider general $f$-divergence-based ambiguity sets, including $χ^2$ and KL divergence balls, and design a computationally efficient algorithm that achieves sublinear regret for the robust control objective under minimal assumptions, without requiring generative or offline data access. Moreover, we establish a corresponding minimax lower bound on the regret of any online algorithm, demonstrating the near-optimality of our method. Experiments across diverse environments with model misspecification show that our approach consistently improves worst-case performance and aligns with the theoretical guarantees.
LGMay 24, 2025
Pessimism Principle Can Be Effective: Towards a Framework for Zero-Shot Transfer Reinforcement LearningChi Zhang, Ziying Jia, George K. Atia et al.
Transfer reinforcement learning aims to derive a near-optimal policy for a target environment with limited data by leveraging abundant data from related source domains. However, it faces two key challenges: the lack of performance guarantees for the transferred policy, which can lead to undesired actions, and the risk of negative transfer when multiple source domains are involved. We propose a novel framework based on the pessimism principle, which constructs and optimizes a conservative estimation of the target domain's performance. Our framework effectively addresses the two challenges by providing an optimized lower bound on target performance, ensuring safe and reliable decisions, and by exhibiting monotonic improvement with respect to the quality of the source domains, thereby avoiding negative transfer. We construct two types of conservative estimations, rigorously characterize their effectiveness, and develop efficient distributed algorithms with convergence guarantees. Our framework provides a theoretically sound and practically robust solution for transfer learning in reinforcement learning.
MLMar 9
Robust Transfer Learning with Side InformationAkram S. Awad, Shihab Ahmed, Yue Wang et al.
Robust Markov Decision Processes (MDPs) address environmental shift through distributionally robust optimization (DRO) by finding an optimal worst-case policy within an uncertainty set of transition kernels. However, standard DRO approaches require enlarging the uncertainty set under large shifts, which leads to overly conservative and pessimistic policies. In this paper, we propose a framework for transfer under environment shift that derives a robust target-domain policy via estimate-centered uncertainty sets, constructed through constrained estimation that integrates limited target samples with side information about the source-target dynamics. The side information includes bounds on feature moments, distributional distances, and density ratios, yielding improved kernel estimates and tighter uncertainty sets. The side information includes bounds on feature moments, distributional distances, and density ratios, yielding improved kernel estimates and tighter uncertainty sets. Error bounds and convergence results are established for both robust and non-robust value functions. Moreover, we provide a finite-sample guarantee on the learned robust policy and analyze the robust sub-optimality gap. Under mild low-dimensional structure on the transition model, the side information reduces this gap and improves sample efficiency. We assess the performance of our approach across OpenAI Gym environments and classic control problems, consistently demonstrating superior target-domain performance over state-of-the-art robust and non-robust baselines.
LGAug 4, 2025
Online Robust Multi-Agent Reinforcement Learning under Model UncertaintiesZain Ulabedeen Farhat, Debamita Ghosh, George K. Atia et al.
Well-trained multi-agent systems can fail when deployed in real-world environments due to model mismatches between the training and deployment environments, caused by environment uncertainties including noise or adversarial attacks. Distributionally Robust Markov Games (DRMGs) enhance system resilience by optimizing for worst-case performance over a defined set of environmental uncertainties. However, current methods are limited by their dependence on simulators or large offline datasets, which are often unavailable. This paper pioneers the study of online learning in DRMGs, where agents learn directly from environmental interactions without prior data. We introduce the {\it Robust Optimistic Nash Value Iteration (RONAVI)} algorithm and provide the first provable guarantees for this setting. Our theoretical analysis demonstrates that the algorithm achieves low regret and efficiently finds the optimal robust policy for uncertainty sets measured by Total Variation divergence and Kullback-Leibler divergence. These results establish a new, practical path toward developing truly robust multi-agent systems.
LGMay 15, 2023
Scalable and Robust Tensor Ring Decomposition for Large-scale DataYicong He, George K. Atia
Tensor ring (TR) decomposition has recently received increased attention due to its superior expressive performance for high-order tensors. However, the applicability of traditional TR decomposition algorithms to real-world applications is hindered by prevalent large data sizes, missing entries, and corruption with outliers. In this work, we propose a scalable and robust TR decomposition algorithm capable of handling large-scale tensor data with missing entries and gross corruptions. We first develop a novel auto-weighted steepest descent method that can adaptively fill the missing entries and identify the outliers during the decomposition process. Further, taking advantage of the tensor ring model, we develop a novel fast Gram matrix computation (FGMC) approach and a randomized subtensor sketching (RStS) strategy which yield significant reduction in storage and computational complexity. Experimental results demonstrate that the proposed method outperforms existing TR decomposition methods in the presence of outliers, and runs significantly faster than existing robust tensor completion algorithms.
LGAug 5, 2021
BOSS: Bidirectional One-Shot Synthesis of Adversarial ExamplesIsmail R. Alkhouri, Alvaro Velasquez, George K. Atia
The design of additive imperceptible perturbations to the inputs of deep classifiers to maximize their misclassification rates is a central focus of adversarial machine learning. An alternative approach is to synthesize adversarial examples from scratch using GAN-like structures, albeit with the use of large amounts of training data. By contrast, this paper considers one-shot synthesis of adversarial examples; the inputs are synthesized from scratch to induce arbitrary soft predictions at the output of pre-trained models, while simultaneously maintaining high similarity to specified inputs. To this end, we present a problem that encodes objectives on the distance between the desired and output distributions of the trained model and the similarity between such inputs and the synthesized examples. We prove that the formulated problem is NP-complete. Then, we advance a generative approach to the solution in which the adversarial examples are obtained as the output of a generative network whose parameters are iteratively updated by optimizing surrogate loss functions for the dual-objective. We demonstrate the generality and versatility of the framework and approach proposed through applications to the design of targeted adversarial attacks, generation of decision boundary samples, and synthesis of low confidence classification inputs. The approach is further extended to an ensemble of models with different soft output specifications. The experimental results verify that the targeted and confidence reduction attack methods developed perform on par with state-of-the-art algorithms.
LGJun 19, 2021
Coarse to Fine Two-Stage Approach to Robust Tensor Completion of Visual DataYicong He, George K. Atia
Tensor completion is the problem of estimating the missing values of high-order data from partially observed entries. Data corruption due to prevailing outliers poses major challenges to traditional tensor completion algorithms, which catalyzed the development of robust algorithms that alleviate the effect of outliers. However, existing robust methods largely presume that the corruption is sparse, which may not hold in practice. In this paper, we develop a two-stage robust tensor completion approach to deal with tensor completion of visual data with a large amount of gross corruption. A novel coarse-to-fine framework is proposed which uses a global coarse completion result to guide a local patch refinement process. To efficiently mitigate the effect of a large number of outliers on tensor recovery, we develop a new M-estimator-based robust tensor ring recovery method which can adaptively identify the outliers and alleviate their negative effect in the optimization. The experimental results demonstrate the superior performance of the proposed approach over state-of-the-art robust algorithms for tensor completion.
CVMay 30, 2021
Patch Tracking-based Streaming Tensor Ring Completion for Visual Data RecoveryYicong He, George K. Atia
Tensor completion aims to recover the missing entries of a partially observed tensor by exploiting its low-rank structure, and has been applied to visual data recovery. In applications where the data arrives sequentially such as streaming video completion, the missing entries of the tensor need to be dynamically recovered in a streaming fashion. Traditional streaming tensor completion algorithms treat the entire visual data as a tensor, which may not work satisfactorily when there is a big change in the tensor subspace along the temporal dimension, such as due to strong motion across the video frames. In this paper, we develop a novel patch tracking-based streaming tensor ring completion framework for visual data recovery. Given a newly incoming frame, small patches are tracked from the previous frame. Meanwhile, for each tracked patch, a patch tensor is constructed by stacking similar patches from the new frame. Patch tensors are then completed using a streaming tensor ring completion algorithm, and the incoming frame is recovered using the completed patch tensors. We propose a new patch tracking strategy that can accurately and efficiently track the patches with missing data. Further, a new streaming tensor ring completion algorithm is proposed which can efficiently and accurately update the latent core tensors and complete the missing entries of the patch tensors. Extensive experimental results demonstrate the superior performance of the proposed algorithms compared with both batch and streaming state-of-the-art tensor completion methods.
AIDec 3, 2020
Steady-State Planning in Expected Reward Multichain MDPsGeorge K. Atia, Andre Beckus, Ismail Alkhouri et al.
The planning domain has experienced increased interest in the formal synthesis of decision-making policies. This formal synthesis typically entails finding a policy which satisfies formal specifications in the form of some well-defined logic. While many such logics have been proposed with varying degrees of expressiveness and complexity in their capacity to capture desirable agent behavior, their value is limited when deriving decision-making policies which satisfy certain types of asymptotic behavior in general system models. In particular, we are interested in specifying constraints on the steady-state behavior of an agent, which captures the proportion of time an agent spends in each state as it interacts for an indefinite period of time with its environment. This is sometimes called the average or expected behavior of the agent and the associated planning problem is faced with significant challenges unless strong restrictions are imposed on the underlying model in terms of the connectivity of its graph structure. In this paper, we explore this steady-state planning problem that consists of deriving a decision-making policy for an agent such that constraints on its steady-state behavior are satisfied. A linear programming solution for the general case of multichain Markov Decision Processes (MDPs) is proposed and we prove that optimal solutions to the proposed programs yield stationary policies with rigorous guarantees of behavior.
LGOct 22, 2020
Robust Low-tubal-rank Tensor Completion based on Tensor Factorization and Maximum Correntopy CriterionYicong He, George K. Atia
The goal of tensor completion is to recover a tensor from a subset of its entries, often by exploiting its low-rank property. Among several useful definitions of tensor rank, the low-tubal-rank was shown to give a valuable characterization of the inherent low-rank structure of a tensor. While some low-tubal-rank tensor completion algorithms with favorable performance have been recently proposed, these algorithms utilize second-order statistics to measure the error residual, which may not work well when the observed entries contain large outliers. In this paper, we propose a new objective function for low-tubal-rank tensor completion, which uses correntropy as the error measure to mitigate the effect of the outliers. To efficiently optimize the proposed objective, we leverage a half-quadratic minimization technique whereby the optimization is transformed to a weighted low-tubal-rank tensor factorization problem. Subsequently, we propose two simple and efficient algorithms to obtain the solution and provide their convergence and complexity analysis. Numerical results using both synthetic and real data demonstrate the robust and superior performance of the proposed algorithms.
SOC-PHSep 24, 2020
Sketch-based community detection in evolving networksAndre Beckus, George K. Atia
We consider an approach for community detection in time-varying networks. At its core, this approach maintains a small sketch graph to capture the essential community structure found in each snapshot of the full network. We demonstrate how the sketch can be used to explicitly identify six key community events which typically occur during network evolution: growth, shrinkage, merging, splitting, birth and death. Based on these detection techniques, we formulate a community detection algorithm which can process a network concurrently exhibiting all processes. One advantage afforded by the sketch-based algorithm is the efficient handling of large networks. Whereas detecting events in the full graph may be computationally expensive, the small size of the sketch allows changes to be quickly assessed. A second advantage occurs in networks containing clusters of disproportionate size. The sketch is constructed such that there is equal representation of each cluster, thus reducing the possibility that the small clusters are lost in the estimate. We present a new standardized benchmark based on the stochastic block model which models the addition and deletion of nodes, as well as the birth and death of communities. When coupled with existing benchmarks, this new benchmark provides a comprehensive suite of tests encompassing all six community events. We provide analysis and a set of numerical results demonstrating the advantages of our approach both in run time and in the handling of small clusters.
CVJul 6, 2018
Multi-modal Non-line-of-sight Passive ImagingAndre Beckus, Alexandru Tamasan, George K. Atia
We consider the non-line-of-sight (NLOS) imaging of an object using the light reflected off a diffusive wall. The wall scatters incident light such that a lens is no longer useful to form an image. Instead, we exploit the 4D spatial coherence function to reconstruct a 2D projection of the obscured object. The approach is completely passive in the sense that no control over the light illuminating the object is assumed and is compatible with the partially coherent fields ubiquitous in both the indoor and outdoor environments. We formulate a multi-criteria convex optimization problem for reconstruction, which fuses the reflected field's intensity and spatial coherence information at different scales. Our formulation leverages established optics models of light propagation and scattering and exploits the sparsity common to many images in different bases. We also develop an algorithm based on the alternating direction method of multipliers to efficiently solve the convex program proposed. A means for analyzing the null space of the measurement matrices is provided as well as a means for weighting the contribution of individual measurements to the reconstruction. This paper holds promise to advance passive imaging in the challenging NLOS regimes in which the intensity does not necessarily retain distinguishable features and provides a framework for multi-modal information fusion for efficient scene reconstruction.