Mazen Ali

FA
h-index: 24
Papers: 8
Citations: 50
Novelty: 41%
AI Score: 40

8 Papers

NA | May 30, 2018
HT-AWGM: A Hierarchical Tucker-Adaptive Wavelet Galerkin Method for High Dimensional Elliptic Problems

Mazen Ali, Karsten Urban

This paper is concerned with the construction, analysis and realization of a numerical method to approximate the solution of high-dimensional elliptic partial differential equations. We propose a new combination of an Adaptive Wavelet Galerkin Method (AWGM) and the well-known Hierarchical Tucker (HT) tensor format. The resulting HT-AWGM is adaptive both in the wavelet representation of the low-dimensional factors and in the tensor rank of the HT representation. The point of departure is an adaptive wavelet method for the HT format using approximate Richardson iterations from [1] and an AWGM as described in [13]. HT-AWGM performs a sequence of Galerkin solves based upon a truncated preconditioned conjugate gradient (PCG) algorithm from [33] in combination with a tensor-based preconditioner from [3]. Our analysis starts by showing convergence of the truncated conjugate gradient method. The next step is to add routines realizing the adaptive refinement. The resulting HT-AWGM is analyzed with respect to convergence and complexity. We show that the performance of the scheme asymptotically depends only on the desired tolerance, with convergence rates depending on the Besov regularity of low-dimensional quantities and the low-rank tensor structure of the solution. The complexity in the ranks is algebraic with powers of four stemming from the complexity of the tensor truncation. Numerical experiments show the quantitative performance.

43.6 | PR | Mar 27
STN-GPR: A Singularity Tensor Network Framework for Efficient Option Pricing

Dominic Gribben, Carolina Allende, Alba Villarino et al.

We develop a tensor-network surrogate for option pricing, targeting large-scale portfolio revaluation problems arising in market risk management (e.g., VaR and Expected Shortfall computations). The method represents high-dimensional price surfaces in tensor-train (TT) form using TT-cross approximation, constructing the surrogate directly from black-box price evaluations without materializing the full training tensor. For inference, we use a Laplacian kernel and derive TT representations of the kernel matrix and its closed-form inverse in the noise-free setting, enabling TT-based Gaussian process regression (GPR) without dense matrix factorization or iterative linear solves. We find that hyperparameter optimization consistently favors a large kernel length-scale and show that in this regime the GPR predictor reduces to multilinear interpolation for off-grid inputs; we also derive a low-rank TT representation for this limit. We evaluate the approach on five-asset basket options over an eight-dimensional parameter space (asset spot levels, strike, interest rate, and time to maturity). For European geometric basket puts, the tensor surrogate achieves lower test error at shorter training times than standard GPR by scaling to substantially larger effective training sets. For American arithmetic basket puts trained on LSMC data, the surrogate exhibits more favorable scaling with training-set size while providing millisecond-level evaluation per query, with overall runtime dominated by data generation.
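The large length-scale limit mentioned in this abstract can be checked numerically in one dimension. The sketch below is not the paper's TT-based implementation: it uses dense NumPy linear algebra, and it assumes the Laplacian kernel has the form exp(-|x - y| / length_scale); the grid, the test function and all function names are illustrative choices. It shows two things: the inverse of the kernel matrix on a grid is tridiagonal, and the noise-free GPR mean approaches linear interpolation as the length-scale grows.

```python
import numpy as np

def laplacian_kernel(a, b, length_scale):
    # Assumed kernel form: k(x, y) = exp(-|x - y| / length_scale).
    return np.exp(-np.abs(a[:, None] - b[None, :]) / length_scale)

def gpr_mean(x_train, y_train, x_query, length_scale):
    # Noise-free GPR posterior mean: k_*^T K^{-1} y (dense solve, 1D only).
    K = laplacian_kernel(x_train, x_train, length_scale)
    k_star = laplacian_kernel(x_query, x_train, length_scale)
    return k_star @ np.linalg.solve(K, y_train)

# Illustrative 1D grid and data (not taken from the paper).
x_train = np.linspace(0.0, 1.0, 11)
y_train = np.sin(2 * np.pi * x_train)
x_query = np.array([0.05, 0.33, 0.72])

# The inverse kernel matrix on a sorted grid is tridiagonal:
# everything outside the three central diagonals vanishes up to round-off.
K_inv = np.linalg.inv(laplacian_kernel(x_train, x_train, 1.0))
off_band = K_inv - np.triu(np.tril(K_inv, 1), -1)
max_off_band = np.abs(off_band).max()

# With a large length-scale the predictor is close to linear interpolation.
gpr_large = gpr_mean(x_train, y_train, x_query, length_scale=100.0)
linear = np.interp(x_query, x_train, y_train)
max_gap = np.abs(gpr_large - linear).max()
```

In several dimensions, a product of such one-dimensional kernels would be the natural analogue of this check, but whether that matches the paper's exact kernel is an assumption.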

CV | Jul 10, 2025
Lightweight Cloud Masking Models for On-Board Inference in Hyperspectral Imaging

Mazen Ali, António Pereira, Fabio Gentile et al.

Cloud and cloud shadow masking is a crucial preprocessing step in hyperspectral satellite imaging, enabling the extraction of high-quality, analysis-ready data. This study evaluates various machine learning approaches, including gradient boosting methods such as XGBoost and LightGBM as well as convolutional neural networks (CNNs). All boosting and CNN models achieved accuracies exceeding 93%. Among the investigated models, the CNN with feature reduction emerged as the most efficient, offering a balance of high accuracy, low storage requirements, and rapid inference times on both CPUs and GPUs. Variants of this model, with at most 597 trainable parameters, offered the best trade-off in terms of deployment feasibility, accuracy, and computational efficiency. These results demonstrate the potential of lightweight artificial intelligence (AI) models for real-time hyperspectral image processing, supporting the development of on-board satellite AI systems for space-based applications.

FA | Jan 28, 2021
Approximation Theory of Tree Tensor Networks: Tensorized Multivariate Functions

Mazen Ali, Anthony Nouy

We study the approximation of multivariate functions with tensor networks (TNs), providing some answers to the following two questions: "what are the approximation capabilities of TNs for functions from classical smoothness classes?" and "what are the properties of the class of functions that can be approximated with TNs with a certain performance?" As a partial answer to the former, we show that TNs can (near to) optimally replicate $h$-uniform and $h$-adaptive spline approximation, for any smoothness order of the target function. Tensor networks thus exhibit universal expressivity w.r.t. isotropic, anisotropic and mixed smoothness spaces that is comparable with more general neural network families such as deep rectified linear unit (ReLU) networks. Put differently, TNs have the capacity to (near to) optimally approximate many function classes -- without being adapted to the particular class in question. As a partial answer to the latter, we consider approximation classes of TNs as a candidate model class and show that these are (quasi-)Banach spaces, that many types of classical smoothness spaces are continuously embedded into said approximation classes, and that TN approximation classes are themselves not embedded in any classical smoothness space. In other words, TNs can efficiently approximate functions that lie beyond classical smoothness spaces.

FA | Jul 30, 2020
Approximation of Smoothness Classes by Deep Rectifier Networks

Mazen Ali, Anthony Nouy

We consider approximation rates of sparsely connected deep rectified linear unit (ReLU) and rectified power unit (RePU) neural networks for functions in Besov spaces $B^α_{q}(L^p)$ in arbitrary dimension $d$, on general domains. We show that deep rectifier networks with a fixed activation function attain optimal or near to optimal approximation rates for functions in the Besov space $B^α_τ(L^τ)$ on the critical embedding line $1/τ=α/d+1/p$ for arbitrary smoothness order $α>0$. Using interpolation theory, this implies that the entire range of smoothness classes at or above the critical line is (near to) optimally approximated by deep ReLU/RePU networks.
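As background for the two activations named in this abstract, the following sketch shows two standard identities: a piecewise-linear hat function written with three ReLU units, and x^2 written exactly with two RePU units of power 2. This is a generic illustration, not the construction used in the paper; the function names and the choice of hat function on [0, 1] are assumptions made here.

```python
import numpy as np

def relu(x):
    # Rectified linear unit: max(0, x).
    return np.maximum(0.0, x)

def repu(x, power=2):
    # Rectified power unit: max(0, x) ** power.
    return np.maximum(0.0, x) ** power

def hat(x):
    # Hat function supported on [0, 1] with peak 0.5 at x = 0.5,
    # built from three ReLU units in one hidden layer.
    return relu(x) - 2.0 * relu(x - 0.5) + relu(x - 1.0)

def square(x):
    # x**2 reproduced exactly by two RePU units of power 2.
    return repu(x) + repu(-x)

x = np.linspace(-2.0, 2.0, 9)
hat_values = hat(x)
square_values = square(x)
```

Hat functions are the basic building block of piecewise-linear splines, and exact monomials are the building block of polynomials, which is one common intuition for why rectifier networks can match classical approximation tools.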

FA | Jun 30, 2020
Approximation Theory of Tree Tensor Networks: Tensorized Univariate Functions -- Part II

Mazen Ali, Anthony Nouy

We study the approximation by tensor networks (TNs) of functions from classical smoothness classes. The considered approximation tool combines a tensorization of functions in $L^p([0,1))$, which allows one to identify a univariate function with a multivariate function (or tensor), and the use of tree tensor networks (the tensor train format) for exploiting low-rank structures of multivariate functions. The resulting tool can be interpreted as a feed-forward neural network, with first layers implementing the tensorization, interpreted as a particular featuring step, followed by a sum-product network with sparse architecture. In part I of this work, we presented several approximation classes associated with different measures of complexity of tensor networks and studied their properties. In this work (part II), we show how classical approximation tools, such as polynomials or splines (with fixed or free knots), can be encoded as a tensor network with controlled complexity. We use this to derive direct (Jackson) inequalities for the approximation spaces of tensor networks. This is then utilized to show that Besov spaces are continuously embedded into these approximation spaces. In other words, we show that arbitrary Besov functions can be approximated with optimal or near to optimal rate. We also show that an arbitrary function in the approximation class possesses no Besov smoothness, unless one limits the depth of the tensor network.
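A discrete analogue of the tensorization step can be sketched in a few lines. The paper works with functions in $L^p([0,1))$; the sketch below instead samples a function on a uniform grid of b**d points, which is a simplification assumed here for illustration. Reshaping the sample vector into an order-d tensor corresponds to indexing each grid point by its base-b digits, and the ranks of the sequential unfoldings are the tensor train ranks. All function names are hypothetical.

```python
import numpy as np

def tensorize(f, d, b=2):
    # Sample f at x = i / b**d for i = 0, ..., b**d - 1 and reshape so that
    # the k-th tensor index is the k-th base-b digit of x (most significant first).
    x = np.arange(b ** d) / b ** d
    return f(x).reshape((b,) * d)

def tt_ranks(tensor, tol=1e-10):
    # Numerical ranks of the unfoldings that separate the first k indices
    # from the remaining ones; these are the tensor train ranks.
    ranks = []
    for k in range(1, tensor.ndim):
        rows = int(np.prod(tensor.shape[:k]))
        singular_values = np.linalg.svd(tensor.reshape(rows, -1), compute_uv=False)
        ranks.append(int(np.sum(singular_values > tol * singular_values[0])))
    return ranks

# exp(x) factorizes over the digits of x, so every unfolding has rank one.
ranks_exp = tt_ranks(tensorize(np.exp, d=8))

# A linear function is a sum of two separable terms, so every rank is two.
ranks_linear = tt_ranks(tensorize(lambda x: x, d=8))
```

The linear case is a small instance of the statement that polynomials can be encoded with controlled complexity: the ranks stay bounded independently of the number of levels d.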

FA | Jun 30, 2020
Approximation Theory of Tree Tensor Networks: Tensorized Univariate Functions -- Part I

Mazen Ali, Anthony Nouy

We study the approximation of functions by tensor networks (TNs). We show that Lebesgue $L^p$-spaces in one dimension can be identified with tensor product spaces of arbitrary order through tensorization. We use this tensor product structure to define subsets of $L^p$ of rank-structured functions of finite representation complexity. These subsets are then used to define different approximation classes of tensor networks, associated with different measures of complexity. These approximation classes are shown to be quasi-normed linear spaces. We study some elementary properties and relationships of said spaces. In part II of this work, we will show that classical smoothness (Besov) spaces are continuously embedded into these approximation classes. We will also show that functions in these approximation classes do not possess any Besov smoothness, unless one restricts the depth of the tensor networks. The results of this work are both an analysis of the approximation spaces of TNs and a study of the expressivity of a particular type of neural network (NN) -- namely feed-forward sum-product networks with sparse architecture. The input variables of this network result from the tensorization step, interpreted as a particular featuring step which can also be implemented with a neural network with a specific architecture. We point out interesting parallels to recent results on the expressivity of rectified linear unit (ReLU) networks -- currently one of the most popular types of NNs.

NA | Sep 23, 2015
Reduced Basis Methods Based Upon Adaptive Snapshot Computations

Mazen Ali, Kristina Steih, Karsten Urban

We use asymptotically optimal adaptive numerical methods (here specifically a wavelet scheme) for snapshot computations within the offline phase of the Reduced Basis Method (RBM). The resulting discretizations differ from snapshot to snapshot (i.e., they are parameter-dependent) and therefore do not permit the standard RB "truth space", but they allow for error estimation of the RB approximation with respect to the exact solution of the considered parameterized partial differential equation. The residual-based a posteriori error estimators are computed by an adaptive dual wavelet expansion, which allows us to compute a surrogate of the dual norm of the residual. The resulting adaptive RBM is analyzed. We show the convergence of the resulting adaptive Greedy method. Numerical experiments for stationary and instationary problems underline the potential of this approach.