Ahmed Attia

OC
h-index12
12papers
39citations
Novelty47%
AI Score46

12 Papers

86.5CLMar 17Code
Improving Low-Resource Machine Translation via Round-Trip Reinforcement Learning

Ahmed Attia, Alham Fikri Aji

Low-resource machine translation (MT) has gained increasing attention as parallel data from low-resource language communities is collected, but many approaches for improving low-resource MT remain underexplored. We investigate a self-supervised reinforcement learning fine-tuning for translation in low-resource settings using round-trip bootstrapping with the No Language Left Behind (NLLB) family of models. Our approach translates English into a target low-resource language and then back into English, using a combination of chrF++ and BLEU as the reward function on the reconstructed English sentences. Using the NLLB-MD dataset, we evaluate both the 600M and 1.3B parameter NLLB models and observe consistent improvements for the following languages: Central Aymara, Friulian, Wolof, Dyula, Bhojpuri and Russian. Qualitative inspection of translation outputs indicates increased fluency and semantic fidelity. We argue that our method can further benefit from scale, enabling models to increasingly leverage their pretrained knowledge and continue self-improving. Code available at: https://github.com/Copticoder/MT-via-Round-Trip-RL

NAJan 2, 2016
The Reduced-Order Hybrid Monte Carlo Sampling Smoother

Ahmed Attia, Razvan Stefanescu, Adrian Sandu

Hybrid Monte-Carlo (HMC) sampling smoother is a fully non-Gaussian four-dimensional data assimilation algorithm that works by directly sampling the posterior distribution formulated in the Bayesian framework. The smoother in its original formulation is computationally expensive due to the intrinsic requirement of running the forward and adjoint models repeatedly. Here we present computationally efficient versions of the HMC sampling smoother based on reduced-order approximations of the underlying model dynamics. The schemes developed herein are tested numerically using the shallow-water equations model on Cartesian coordinates. The results reveal that the reduced-order versions of the smoother are capable of accurately capturing the posterior probability density, while being significantly faster than the original full order formulation.

CEJun 11, 2018
Goal-Oriented Optimal Design of Experiments for Large-Scale Bayesian Linear Inverse Problems

Ahmed Attia, Alen Alexanderian, Arvind K. Saibaba

We develop a framework for goal-oriented optimal design of experiments (GOODE) for large-scale Bayesian linear inverse problems governed by PDEs. This framework differs from classical Bayesian optimal design of experiments (ODE) in the following sense: we seek experimental designs that minimize the posterior uncertainty in the experiment end-goal, e.g., a quantity of interest (QoI), rather than the estimated parameter itself. This is suitable for scenarios in which the solution of an inverse problem is an intermediate step and the estimated parameter is then used to compute a QoI. In such problems, a GOODE approach has two benefits: the designs can avoid wastage of experimental resources by a targeted collection of data, and the resulting design criteria are computationally easier to evaluate due to the often low-dimensionality of the QoIs. We present two modified design criteria, A-GOODE and D-GOODE, which are natural analogues of classical Bayesian A- and D-optimal criteria. We analyze the connections to other ODE criteria, and provide interpretations for the GOODE criteria by using tools from information theory. Then, we develop an efficient gradient-based optimization framework for solving the GOODE optimization problems. Additionally, we present comprehensive numerical experiments testing the various aspects of the presented approach. The driving application is the optimal placement of sensors to identify the source of contaminants in a diffusion and transport problem. We enforce sparsity of the sensor placements using an $\ell_1$-norm penalty approach, and propose a practical strategy for specifying the associated penalty parameter.

NAAug 12, 2025
Robust optimal design of large-scale Bayesian nonlinear inverse problems

Abhijit Chowdhary, Ahmed Attia, Alen Alexanderian

We consider robust optimal experimental design (ROED) for nonlinear Bayesian inverse problems governed by partial differential equations (PDEs). An optimal design is one that maximizes some utility quantifying the quality of the solution of an inverse problem. However, the optimal design is dependent on elements of the inverse problem such as the simulation model, the prior, or the measurement error model. ROED aims to produce an optimal design that is aware of the additional uncertainties encoded in the inverse problem and remains optimal even after variations in them. We follow a worst-case scenario approach to develop a new framework for robust optimal design of nonlinear Bayesian inverse problems. The proposed framework a) is scalable and designed for infinite-dimensional Bayesian nonlinear inverse problems constrained by PDEs; b) develops efficient approximations of the utility, namely, the expected information gain; c) employs eigenvalue sensitivity techniques to develop analytical forms and efficient evaluation methods of the gradient of the utility with respect to the uncertainties we wish to be robust against; and d) employs a probabilistic optimization paradigm that properly defines and efficiently solves the resulting combinatorial max-min optimization problem. The effectiveness of the proposed approach is illustrated for optimal sensor placement problem in an inverse problem governed by an elliptic PDE.

LGApr 28, 2025Code
Accurate and Diverse LLM Mathematical Reasoning via Automated PRM-Guided GFlowNets

Adam Younsi, Ahmed Attia, Abdalgader Abubaker et al.

Achieving both accuracy and diverse reasoning remains challenging for Large Language Models (LLMs) in complex domains like mathematics. A key bottleneck is evaluating intermediate reasoning steps to guide generation without costly human annotations. To address this, we first introduce a novel Process Reward Model (PRM) trained automatically using Monte Carlo Tree Search coupled with a similarity-based data augmentation technique, effectively capturing step-level reasoning quality. Leveraging this PRM, we then adapt Generative Flow Networks (GFlowNets) to operate at the reasoning step level. Unlike traditional reinforcement learning focused on maximizing a single reward, GFlowNets naturally sample diverse, high-quality solutions proportional to their rewards, as measured by our PRM. Empirical evaluation shows strong improvements in both accuracy and solution diversity on challenging mathematical benchmarks (e.g., +2.59% absolute accuracy on MATH Level 5 for Llama3.2-3B), with effective generalization to unseen datasets (+9.4\% absolute on SAT MATH). Furthermore, we benchmark our PRM against existing open-source reward models, demonstrating superior alignment with reasoning quality and more consistent guidance for downstream generation. Our work demonstrates the potential of PRM-guided, step-level GFlowNets for developing more robust and versatile mathematical reasoning in LLMs.

OCJan 16
A Probabilistic Approach to Trajectory-Based Optimal Experimental Design

Ahmed Attia

We present a novel probabilistic approach for optimal path experimental design. In this approach a discrete path optimization problem is defined on a static navigation mesh, and trajectories are modeled as random variables governed by a parametric Markov policy. The discrete path optimization problem is then replaced with an equivalent stochastic optimization problem over the policy parameters, resulting in an optimal probability model that samples estimates of the optimal discrete path. This approach enables exploration of the utility function's distribution tail and treats the utility function of the design as a black box, making it applicable to linear and nonlinear inverse problems and beyond experimental design. Numerical verification and analysis are carried out by using a parameter identification problem widely used in model-based optimal experimental design.

OCJun 9, 2024
Probabilistic Approach to Black-Box Binary Optimization with Budget Constraints: Application to Sensor Placement

Ahmed Attia

We present a fully probabilistic approach for solving binary optimization problems with black-box objective functions and with budget constraints. In the probabilistic approach, the optimization variable is viewed as a random variable and is associated with a parametric probability distribution. The original optimization problem is replaced with an optimization over the expected value of the original objective, which is then optimized over the probability distribution parameters. The resulting optimal parameter (optimal policy) is used to sample the binary space to produce estimates of the optimal solution(s) of the original binary optimization problem. The probability distribution is chosen from the family of Bernoulli models because the optimization variable is binary. The optimization constraints generally restrict the feasibility region. This can be achieved by modeling the random variable with a conditional distribution given satisfiability of the constraints. Thus, in this work we develop conditional Bernoulli distributions to model the random variable conditioned by the total number of nonzero entries, that is, the budget constraint. This approach (a) is generally applicable to binary optimization problems with nonstochastic black-box objective functions and budget constraints; (b) accounts for budget constraints by employing conditional probabilities that sample only the feasible region and thus considerably reduces the computational cost compared with employing soft constraints; and (c) does not employ soft constraints and thus does not require tuning of a regularization parameter, for example to promote sparsity, which is challenging in sensor placement optimization problems. The proposed approach is verified numerically by using an idealized bilinear binary optimization problem and is validated by using a sensor placement experiment in a parameter identification setup.

OCMay 5, 2023
Robust A-Optimal Experimental Design for Bayesian Inverse Problems

Ahmed Attia, Sven Leyffer, Todd Munson

Optimal design of experiments for Bayesian inverse problems has recently gained wide popularity and attracted much attention, especially in the computational science and Bayesian inversion communities. An optimal design maximizes a predefined utility function that is formulated in terms of the elements of an inverse problem, an example being optimal sensor placement for parameter identification. The state-of-the-art algorithmic approaches following this simple formulation generally overlook misspecification of the elements of the inverse problem, such as the prior or the measurement uncertainties. This work presents an efficient algorithmic approach for designing optimal experimental design schemes for Bayesian inverse problems such that the optimal design is robust to misspecification of elements of the inverse problem. Specifically, we consider a worst-case scenario approach for the uncertain or misspecified parameters, formulate robust objectives, and propose an algorithmic approach for optimizing such objectives. Both relaxation and stochastic solution approaches are discussed with detailed analysis and insight into the interpretation of the problem and the proposed algorithmic approach. Extensive numerical experiments to validate and analyze the proposed approach are carried out for sensor placement in a parameter identification problem.

OCJan 15, 2021
Stochastic Learning Approach to Binary Optimization for Optimal Design of Experiments

Ahmed Attia, Sven Leyffer, Todd Munson

We present a novel stochastic approach to binary optimization for optimal experimental design (OED) for Bayesian inverse problems governed by mathematical models such as partial differential equations. The OED utility function, namely, the regularized optimality criterion, is cast into a stochastic objective function in the form of an expectation over a multivariate Bernoulli distribution. The probabilistic objective is then solved by using a stochastic optimization routine to find an optimal observational policy. The proposed approach is analyzed from an optimization perspective and also from a machine learning perspective with correspondence to policy gradient reinforcement learning. The approach is demonstrated numerically by using an idealized two-dimensional Bayesian linear inverse problem, and validated by extensive numerical experiments carried out for sensor placement in a parameter identification setup.

MEJan 2, 2018
A Machine Learning Approach to Adaptive Covariance Localization

Azam Moosavi, Ahmed Attia, Adrian Sandu

Data assimilation plays a key role in large-scale atmospheric weather forecasting, where the state of the physical system is estimated from model outputs and observations, and is then used as initial condition to produce accurate future forecasts. The Ensemble Kalman Filter (EnKF) provides a practical implementation of the statistical solution of the data assimilation problem and has gained wide popularity as. This success can be attributed to its simple formulation and ease of implementation. EnKF is a Monte-Carlo algorithm that solves the data assimilation problem by sampling the probability distributions involved in Bayes theorem. Because of this, all flavors of EnKF are fundamentally prone to sampling errors when the ensemble size is small. In typical weather forecasting applications, the model state space has dimension $10^{9}-10^{12}$, while the ensemble size typically ranges between $30-100$ members. Sampling errors manifest themselves as long-range spurious correlations and have been shown to cause filter divergence. To alleviate this effect covariance localization dampens spurious correlations between state variables located at a large distance in the physical space, via an empirical distance-dependent function. The quality of the resulting analysis and forecast is greatly influenced by the choice of the localization function parameters, e.g., the radius of influence. The localization radius is generally tuned empirically to yield desirable results.This work, proposes two adaptive algorithms for covariance localization in the EnKF framework, both based on a machine learning approach. The first algorithm adapts the localization radius in time, while the second algorithm tunes the localization radius in both time and space. Numerical experiments carried out with the Lorenz-96 model, and a quasi-geostrophic model, reveal the potential of the proposed machine learning approaches.

COAug 18, 2016
Cluster Sampling Filters for Non-Gaussian Data Assimilation

Ahmed Attia, Azam Moosavi, Adrian Sandu

This paper presents a fully non-Gaussian version of the Hamiltonian Monte Carlo (HMC) sampling filter. The Gaussian prior assumption in the original HMC filter is relaxed. Specifically, a clustering step is introduced after the forecast phase of the filter, and the prior density function is estimated by fitting a Gaussian Mixture Model (GMM) to the prior ensemble. Using the data likelihood function, the posterior density is then formulated as a mixture density, and is sampled using a HMC approach (or any other scheme capable of sampling multimodal densities in high-dimensional subspaces). The main filter developed herein is named "cluster HMC sampling filter" (ClHMC). A multi-chain version of the ClHMC filter, namely MC-ClHMC is also proposed to guarantee that samples are taken from the vicinities of all probability modes of the formulated posterior. The new methodologies are tested using a quasi-geostrophic (QG) model with double-gyre wind forcing and bi-harmonic friction. Numerical results demonstrate the usefulness of using GMMs to relax the Gaussian prior assumption in the HMC filtering paradigm.

NAMay 18, 2015
A Hybrid Monte-Carlo Sampling Smoother for Four Dimensional Data Assimilation

Ahmed Attia, Vishwas Rao, Adrian Sandu

This paper constructs an ensemble-based sampling smoother for four-dimensional data assimilation using a Hybrid/Hamiltonian Monte-Carlo approach. The smoother samples efficiently from the posterior probability density of the solution at the initial time. Unlike the well-known ensemble Kalman smoother, which is optimal only in the linear Gaussian case, the proposed methodology naturally accommodates non-Gaussian errors and non-linear model dynamics and observation operators. Unlike the four-dimensional variational met\-hod, which only finds a mode of the posterior distribution, the smoother provides an estimate of the posterior uncertainty. One can use the ensemble mean as the minimum variance estimate of the state, or can use the ensemble in conjunction with the variational approach to estimate the background errors for subsequent assimilation windows. Numerical results demonstrate the advantages of the proposed method compared to the traditional variational and ensemble-based smoothing methods.