NA Nov 4, 2015
Low-rank methods for high-dimensional approximation and model order reduction
Anthony Nouy
Tensor methods are among the most prominent tools for the numerical solution of high-dimensional problems where functions of multiple variables have to be approximated. These methods exploit the tensor structure of function spaces and apply to many problems in computational science which are formulated in tensor spaces, such as problems arising in stochastic calculus, uncertainty quantification or parametric analyses. Here, we present complexity reduction methods based on low-rank approximation methods. We analyze the problem of best approximation in subsets of low-rank tensors and discuss its connection with the problem of optimal model reduction in low-dimensional reduced spaces. We present different algorithms for computing approximations of a function in low-rank formats. In particular, we present constructive algorithms which are based either on a greedy construction of an approximation (with successive corrections in subsets of low-rank tensors) or on the greedy construction of tensor subspaces (for subspace-based low-rank formats). These algorithms can be applied for tensor compression, tensor completion or for the numerical solution of equations in low-rank tensor formats. A special emphasis is given to the solution of stochastic or parameter-dependent models. Different approaches are presented for the approximation of vector-valued or multivariate functions (identified with tensors), based on samples of the functions (black-box approaches) or on the model equations which are satisfied by the functions.
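As a rough illustration of the greedy construction with successive corrections mentioned in the abstract above, the sketch below treats the simplest case of an order-two tensor (a matrix) and adds one rank-one correction at a time. The function names (`best_rank_one`, `greedy_low_rank`, `frobenius`) and the alternating power iteration used for each correction are assumptions made for this example, not the author's algorithm.

```python
import math

def frobenius(R):
    """Frobenius norm of a matrix stored as a list of rows."""
    return math.sqrt(sum(x * x for row in R for x in row))

def best_rank_one(R, iters=100):
    """Rank-one term u v^T approximating R, computed by alternating updates.

    u is kept at unit norm and v carries the scale (hypothetical helper).
    """
    m, n = len(R), len(R[0])
    v = [1.0] * n  # arbitrary deterministic starting vector (assumption)
    u = [0.0] * m
    for _ in range(iters):
        u = [sum(R[i][j] * v[j] for j in range(n)) for i in range(m)]
        norm_u = math.sqrt(sum(x * x for x in u))
        if norm_u == 0.0:
            break  # nothing left to correct in this direction
        u = [x / norm_u for x in u]
        v = [sum(R[i][j] * u[i] for i in range(m)) for j in range(n)]
    return u, v

def greedy_low_rank(A, rank):
    """Greedy approximation: successive rank-one corrections of the residual."""
    R = [row[:] for row in A]
    terms, errors = [], []
    for _ in range(rank):
        u, v = best_rank_one(R)
        terms.append((u, v))
        R = [[R[i][j] - u[i] * v[j] for j in range(len(v))]
             for i in range(len(u))]
        errors.append(frobenius(R))  # residual norm after this correction
    return terms, errors

# A 3 x 4 matrix of rank two, built as a sum of two outer products.
a, b = [1.0, 2.0, 3.0], [1.0, 0.0, 1.0, 0.0]
c, d = [1.0, -1.0, 0.0], [0.0, 1.0, 0.0, 2.0]
A = [[a[i] * b[j] + c[i] * d[j] for j in range(4)] for i in range(3)]
terms, errors = greedy_low_rank(A, 2)
print(errors)
```

For matrices, this greedy procedure recovers the dominant singular pairs one by one, so the residual of a rank-two matrix is negligible after two corrections. For higher-order tensors the same loop applies with rank-one tensors, but the result is in general no longer a best low-rank approximation.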
NA Feb 23, 2019
Tree-based tensor formats
Antonio Falco, Wolfgang Hackbusch, Anthony Nouy
The main goal of this paper is to study the topological properties of tensors in tree-based Tucker format. These formats include the Tucker format and the Hierarchical Tucker format. A property of the so-called minimal subspaces is used for obtaining a representation of tensors with either bounded or fixed tree-based rank in the underlying algebraic tensor space. We provide a new characterisation of minimal subspaces which extends the existing characterisations. We also introduce a definition of topological tensor spaces in tree-based format, with the introduction of a norm at each vertex of the tree, and prove the existence of best approximations from sets of tensors with bounded tree-based rank, under some assumptions on the norms weaker than in the existing results.
NA Dec 1, 2011
Proper Generalized Decomposition for Nonlinear Convex Problems in Tensor Banach Spaces
Antonio Falco, Anthony Nouy
Tensor-based methods are receiving growing interest in scientific computing for the numerical solution of problems defined in high-dimensional tensor product spaces. A family of methods called Proper Generalized Decomposition methods has recently been introduced for the a priori construction of tensor approximations of the solution of such problems. In this paper, we give a mathematical analysis of a family of progressive and updated Proper Generalized Decompositions for a particular class of problems associated with the minimization of a convex functional over a reflexive tensor Banach space.
NA Nov 4, 2015
Low-rank tensor methods for model order reduction
Anthony Nouy
Parameter-dependent models arise in many contexts such as uncertainty quantification, sensitivity analysis, inverse problems or optimization. Parametric or uncertainty analyses usually require the evaluation of an output of a model for many instances of the input parameters, which may be intractable for complex numerical models. A possible remedy consists in replacing the model by an approximate model with reduced complexity (a so called reduced order model) allowing a fast evaluation of output variables of interest. This chapter provides an overview of low-rank methods for the approximation of functions that are identified either with order-two tensors (for vector-valued functions) or higher-order tensors (for multivariate functions). Different approaches are presented for the computation of low-rank approximations, either based on samples of the function or on the equations that are satisfied by the function, the latter approaches including projection-based model order reduction methods. For multivariate functions, different notions of ranks and the corresponding low-rank approximation formats are introduced.
NA Jan 27, 2016
Interpolation of inverse operators for preconditioning parameter-dependent equations
Olivier Zahm, Anthony Nouy
We propose a method for the construction of preconditioners of parameter-dependent matrices for the solution of large systems of parameter-dependent equations. The proposed method is an interpolation of the matrix inverse based on a projection of the identity matrix with respect to the Frobenius norm. Approximations of the Frobenius norm using random matrices are introduced in order to handle large matrices. The resulting statistical estimators of the Frobenius norm yield quasi-optimal projections that are controlled with high probability. Strategies for the adaptive selection of interpolation points are then proposed for different objectives in the context of projection-based model order reduction methods: the improvement of residual-based error estimators, the improvement of the projection on a given reduced approximation space, or the recycling of computations for sampling based model order reduction methods.
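The randomized estimation of the Frobenius norm mentioned in the abstract above can be sketched as follows. This minimal example assumes random vectors with independent Rademacher (plus or minus one) entries, for which the expectation of $\|A\omega\|^2$ equals $\|A\|_F^2$. The function name `sketched_frobenius_sq` and the choice of Rademacher entries are assumptions for illustration; the paper's estimators may use other random matrices.

```python
import random

def sketched_frobenius_sq(A, num_cols, rng):
    """Estimate ||A||_F^2 by averaging ||A w||^2 over random sign vectors w.

    Only products of A with vectors are needed, which is the point when A is
    large or only available implicitly (hypothetical helper).
    """
    n = len(A[0])
    total = 0.0
    for _ in range(num_cols):
        w = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        y = [sum(a * wj for a, wj in zip(row, w)) for row in A]
        total += sum(v * v for v in y)
    return total / num_cols

A = [[2.0, 0.0, 1.0],
     [0.0, 3.0, 0.0],
     [1.0, 0.0, 1.0]]
exact = sum(x * x for row in A for x in row)  # squared Frobenius norm
estimate = sketched_frobenius_sq(A, 4000, random.Random(0))
print(exact, estimate)
```

The estimate concentrates around the exact value as the number of random vectors grows, which is the sense in which such estimators can be controlled with high probability.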
NA Dec 1, 2016
Dynamical model reduction method for solving parameter-dependent dynamical systems
Marie Billaud-Friess, Anthony Nouy
We propose a projection-based model order reduction method for the solution of parameter-dependent dynamical systems. The proposed method relies on the construction of time-dependent reduced spaces generated from evaluations of the solution of the full-order model at some selected parameter values. The approximation obtained by Galerkin projection is the solution of a reduced dynamical system with a modified flux which takes into account the time dependency of the reduced spaces. An a posteriori error estimate is derived and a greedy algorithm using this error estimate is proposed for the adaptive selection of parameter values. The resulting method can be interpreted as a dynamical low-rank approximation method with a subspace point of view and a uniform control of the error over the parameter set.
NA Apr 19, 2018
Higher-order principal component analysis for the approximation of tensors in tree-based low-rank formats
Anthony Nouy
This paper is concerned with the approximation of tensors using tree-based tensor formats, which are tensor networks whose graphs are dimension partition trees. We consider Hilbert tensor spaces of multivariate functions defined on a product set equipped with a probability measure. This includes the case of multidimensional arrays corresponding to finite product sets. We propose and analyse an algorithm for the construction of an approximation using only point evaluations of a multivariate function, or evaluations of some entries of a multidimensional array. The algorithm is a variant of higher-order singular value decomposition which constructs a hierarchy of subspaces associated with the different nodes of the tree and a corresponding hierarchy of interpolation operators. Optimal subspaces are estimated using empirical principal component analysis of interpolations of partial random evaluations of the function. The algorithm is able to provide an approximation in any tree-based format with either a prescribed rank or a prescribed relative error, with a number of evaluations of the order of the storage complexity of the approximation format. Under some assumptions on the estimation of principal components, we prove that the algorithm provides either a quasi-optimal approximation with a given rank, or an approximation satisfying the prescribed relative error, up to constants depending on the tree and the properties of interpolation operators. The analysis takes into account the discretization errors for the approximation of infinite-dimensional tensors. Several numerical examples illustrate the main results and the behavior of the algorithm for the approximation of high-dimensional functions using hierarchical Tucker or tensor train tensor formats, and the approximation of univariate functions using tensorization.
NA Jan 20, 2019
A multiscale method for semi-linear elliptic equations with localized uncertainties and non-linearities
Anthony Nouy, Florent Pled
A multiscale numerical method is proposed for the solution of semi-linear elliptic stochastic partial differential equations with localized uncertainties and non-linearities, the uncertainties being modeled by a set of random parameters. It relies on a domain decomposition method which introduces several subdomains of interest (called patches) containing the different sources of uncertainties and non-linearities. An iterative algorithm is then introduced, which requires the solution of a sequence of linear global problems (with deterministic operators and uncertain right-hand sides), and non-linear local problems (with uncertain operators and/or right-hand sides) over the patches. Non-linear local problems are solved using an adaptive sampling-based least-squares method for the construction of sparse polynomial approximations of local solutions as functions of the random parameters. Consistency, convergence and robustness of the algorithm are proved under general assumptions on the semi-linear elliptic operator. A convergence acceleration technique (Aitken's dynamic relaxation) is also introduced to speed up the convergence of the algorithm. The performances of the proposed method are illustrated through numerical experiments carried out on a stationary non-linear diffusion-reaction problem.
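The convergence acceleration mentioned at the end of the abstract above can be illustrated on a generic fixed-point problem $x = G(x)$. The sketch below uses the commonly cited Aitken dynamic relaxation update $\omega_{k+1} = -\omega_k \, \langle r_k, r_{k+1} - r_k \rangle / \|r_{k+1} - r_k\|^2$ with residual $r_k = G(x_k) - x_k$. The abstract does not state which variant the authors use, so this formula, the function name `aitken_fixed_point` and the toy map `G` are assumptions; in the paper's setting `G` would stand for one sweep of the global-local iteration.

```python
def aitken_fixed_point(G, x0, omega0=0.5, tol=1e-12, max_iter=100):
    """Solve x = G(x) with relaxed updates x <- x + omega * (G(x) - x).

    The relaxation parameter omega is adapted at each step from the two most
    recent residuals (Aitken dynamic relaxation, as assumed above).
    Returns the final iterate and the number of updates performed.
    """
    x = list(x0)
    omega = omega0
    r = [g - xi for g, xi in zip(G(x), x)]
    for k in range(max_iter):
        if max(abs(ri) for ri in r) < tol:
            return x, k
        x = [xi + omega * ri for xi, ri in zip(x, r)]
        r_new = [g - xi for g, xi in zip(G(x), x)]
        dr = [a - b for a, b in zip(r_new, r)]
        denom = sum(d * d for d in dr)
        if denom > 0.0:
            omega = -omega * sum(ri * di for ri, di in zip(r, dr)) / denom
        r = r_new
    return x, max_iter

# Toy contraction with fixed point x = 10 (hypothetical example).
G = lambda x: [0.9 * x[0] + 1.0]
solution, num_updates = aitken_fixed_point(G, [0.0])
print(solution, num_updates)
```

For this scalar affine map the adapted parameter equals $1/(1 - 0.9)$ after the first update, so the iteration reaches the fixed point in a couple of steps, whereas the unrelaxed iteration contracts the error by only a factor 0.9 per step.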
DG May 11, 2017
Principal bundle structure of matrix manifolds
Marie Billaud-Friess, Antonio Falco, Anthony Nouy
In this paper, we introduce a new geometric description of the manifolds of matrices of fixed rank. The starting point is a geometric description of the Grassmann manifold $\mathbb{G}_r(\mathbb{R}^k)$ of linear subspaces of dimension $r<k$ in $\mathbb{R}^k$ which avoids the use of equivalence classes. The set $\mathbb{G}_r(\mathbb{R}^k)$ is equipped with an atlas which provides it with the structure of an analytic manifold modelled on $\mathbb{R}^{(k-r)\times r}$. Then we define an atlas for the set $\mathcal{M}_r(\mathbb{R}^{k \times r})$ of full rank matrices and prove that the resulting manifold is an analytic principal bundle with base $\mathbb{G}_r(\mathbb{R}^k)$ and typical fibre $\mathrm{GL}_r$, the general linear group of invertible matrices in $\mathbb{R}^{r\times r}$. Finally, we define an atlas for the set $\mathcal{M}_r(\mathbb{R}^{n \times m})$ of non-full rank matrices and prove that the resulting manifold is an analytic principal bundle with base $\mathbb{G}_r(\mathbb{R}^n) \times \mathbb{G}_r(\mathbb{R}^m)$ and typical fibre $\mathrm{GL}_r$. The atlas of $\mathcal{M}_r(\mathbb{R}^{n \times m})$ is indexed on the manifold itself, which allows a natural definition of a neighbourhood for a given matrix, this neighbourhood being proved to possess the structure of a Lie group. Moreover, the set $\mathcal{M}_r(\mathbb{R}^{n \times m})$ equipped with the topology induced by the atlas is proven to be an embedded submanifold of the matrix space $\mathbb{R}^{n \times m}$ equipped with the subspace topology. The proposed geometric description then results in a description of the matrix space $\mathbb{R}^{n \times m}$, seen as the union of manifolds $\mathcal{M}_r(\mathbb{R}^{n \times m})$, as an analytic manifold equipped with a topology for which the matrix rank is a continuous map.
NA Oct 30, 2016
Projection based model order reduction methods for the estimation of vector-valued variables of interest
Olivier Zahm, Marie Billaud-Friess, Anthony Nouy
We propose and compare goal-oriented projection based model order reduction methods for the estimation of vector-valued functionals of the solution of parameter-dependent equations. The first projection method is a generalization of the classical primal-dual method to the case of vector-valued variables of interest. We highlight the role played by three reduced spaces: the approximation space and the test space associated to the primal variable, and the approximation space associated to the dual variable. Then we propose a Petrov-Galerkin projection method based on a saddle point problem involving an approximation space for the primal variable and an approximation space for an auxiliary variable. A goal-oriented choice of the latter space, defined as the sum of two spaces, allows us to improve the approximation of the variable of interest compared to a primal-dual method using the same reduced spaces. Then, for both approaches, we derive computable error estimates for the approximations of the variable of interest and we propose greedy algorithms for the goal-oriented construction of reduced spaces. The performance of the algorithms is illustrated on numerical examples and compared to standard (non goal-oriented) algorithms.
NA Mar 5, 2019
Weakly intrusive low-rank approximation method for nonlinear parameter-dependent equations
Loic Giraldi, Anthony Nouy
This paper presents a weakly intrusive strategy for computing a low-rank approximation of the solution of a system of nonlinear parameter-dependent equations. The proposed strategy relies on a Newton-like iterative solver which only requires evaluations of the residual of the parameter-dependent equation and of a preconditioner (such as the differential of the residual) for instances of the parameters independently. The algorithm provides an approximation of the set of solutions associated with a possibly large number of instances of the parameters, with a computational complexity which can be orders of magnitude lower than when using the same Newton-like solver for all instances of the parameters. The reduction of complexity requires efficient strategies for obtaining low-rank approximations of the residual, of the preconditioner, and of the increment at each iteration of the algorithm. For the approximation of the residual and the preconditioner, weakly intrusive variants of the empirical interpolation method are introduced, which require evaluations of entries of the residual and the preconditioner. Then, an approximation of the increment is obtained by using a greedy algorithm for low-rank approximation, and a low-rank approximation of the iterate is finally obtained by using a truncated singular value decomposition. When the preconditioner is the differential of the residual, the proposed algorithm is interpreted as an inexact Newton solver for which a detailed convergence analysis is provided. Numerical examples illustrate the efficiency of the method.
NA May 15, 2018
Tensor-based multiscale method for diffusion problems in quasi-periodic heterogeneous media
Quentin Ayoul-Guilmard, Anthony Nouy, Christophe Binetruy
This paper proposes to address the issue of complexity reduction for the numerical simulation of multiscale media in a quasi-periodic setting. We consider a stationary elliptic diffusion equation defined on a domain $D$ such that $\overline{D}$ is the union of cells $\{\overline{D_i}\}_{i\in I}$ and we introduce a two-scale representation by identifying any function $v(x)$ defined on $D$ with a bi-variate function $v(i,y)$, where $i \in I$ relates to the index of the cell containing the point $x$ and $y \in Y$ relates to a local coordinate in a reference cell $Y$. We introduce a weak formulation of the problem in a broken Sobolev space $V(D)$ using a discontinuous Galerkin framework. The problem is then interpreted as a tensor-structured equation by identifying $V(D)$ with a tensor product space $\mathbb{R}^I \otimes V(Y)$ of functions defined over the product set $I\times Y$. Tensor numerical methods are then used in order to exploit approximability properties of quasi-periodic solutions by low-rank tensors.
LG Apr 16
Natural gradient descent with momentum
Anthony Nouy, Agustín Somacal
We consider the problem of approximating a function by an element of a nonlinear manifold which admits a differentiable parametrization, typical examples being neural networks with differentiable activation functions or tensor networks. Natural gradient descent (NGD) for the optimization of a loss function can be seen as a preconditioned gradient descent where updates in the parameter space are driven by a functional perspective. In a spirit similar to Newton's method, an NGD step uses, instead of the Hessian, the Gram matrix of the generating system of the tangent space to the approximation manifold at the current iterate, with respect to a suitable metric. This corresponds to a locally optimal update in function space, following a projected gradient onto the tangent space to the manifold. Still, both gradient and natural gradient descent methods get stuck in local minima. Furthermore, when the model class is a nonlinear manifold or the loss function is not ideally conditioned (e.g., the KL-divergence for density estimation, or a norm of the residual of a partial differential equation in physics informed learning), even the natural gradient might yield non-optimal directions at each step. This work introduces a natural version of classical inertial dynamic methods like Heavy-Ball or Nesterov and shows how it can improve the learning process when working with nonlinear model classes.
NA Apr 14
Random sketching of operators with application to learning preconditioners
Oleg Balabanov, Anthony Nouy, Alexandre Pasco
We propose a new random sketching approach for embedding high-dimensional Hilbert-Schmidt operators, using random input-output pairs. Such an operator can then be approximated in a low-dimensional subspace of operators by solving a small least-squares problem. To achieve computational efficiency, we introduce a structured random map, composed of three random matrices. We provide rigorous conditions under which subspaces of operators are accurately embedded with high probability. The framework is flexible, as the random matrices may be adapted to the operator structure and the computational environment. As an application, we consider the construction of preconditioners for high-dimensional linear equations. We derive a rigorous characterization of preconditioner quality through the discrepancy between the preconditioned operator and an optimal baseline, which can be tailored to a linear approximation space for the solution. We show that this quantity can be efficiently minimized within the proposed framework, especially for parameter separable linear equations. We then establish rigorous high-probability bounds on the quasi-optimality error of the preconditioned Galerkin projection and on the accuracy of a preconditioned residual-based error estimator when the sketch dimensions are sufficiently large. Numerical experiments on an acoustic wave scattering benchmark demonstrate the effectiveness of the method.
NA Dec 19, 2025
Approximation and learning with compositional tensor trains
Martin Eigel, Charles Miranda, Anthony Nouy et al.
We introduce compositional tensor trains (CTTs) for the approximation of multivariate functions, a class of models obtained by composing low-rank functions in the tensor-train format. This format can encode standard approximation tools, such as (sparse) polynomials, deep neural networks (DNNs) with fixed width, or tensor networks with arbitrary permutation of the inputs, or more general affine coordinate transformations, with similar complexities. This format can be viewed as a DNN with width exponential in the input dimension and structured weight matrices. Compared to DNNs, this format enables controlled compression at the layer level using efficient tensor algebra. On the optimization side, we derive a layerwise algorithm inspired by natural gradient descent, which makes it possible to exploit efficient low-rank tensor algebra. This relies on low-rank estimations of Gram matrices, and tensor structured random sketching. Viewing the format as a discrete dynamical system, we also derive an optimization algorithm inspired by numerical methods in optimal control. Numerical experiments on regression tasks demonstrate the expressivity of the new format and the relevance of the proposed optimization algorithms. Overall, CTTs combine the expressivity of compositional models with the algorithmic efficiency of tensor algebra, offering a scalable alternative to standard deep neural networks.
NA May 11, 2018
Tensor-based numerical method for stochastic homogenisation
Quentin Ayoul-Guilmard, Anthony Nouy, Christophe Binetruy
This paper addresses the complexity reduction of stochastic homogenisation of a class of random materials for a stationary diffusion equation. A cost-efficient approximation of the correctors is built using a method designed to exploit quasi-periodicity. Accuracy and cost reduction are investigated for local perturbations or small transformations of periodic materials as well as for materials with no periodicity but a mesoscopic structure, for which the limitations of the method are shown. Finally, for materials outside the scope of this method, we propose to use the approximation of homogenised quantities as control variates for variance reduction of a more accurate and costly Monte Carlo estimator (using a multi-fidelity Monte Carlo method). The resulting cost reduction is illustrated in a numerical experiment with a control variate from weakly stochastic homogenisation for comparison, and the limits of this variance reduction technique are tested on materials without periodicity or mesoscopic structure.
NA Dec 21, 2023
Weighted least-squares approximation with determinantal point processes and generalized volume sampling
Anthony Nouy, Bertrand Michel
We consider the problem of approximating a function from $L^2$ by an element of a given $m$-dimensional space $V_m$, associated with some feature map $\boldsymbol{\varphi}$, using evaluations of the function at random points $x_1, \dots,x_n$. After recalling some results on optimal weighted least-squares using independent and identically distributed points, we consider weighted least-squares using projection determinantal point processes (DPP) or volume sampling. These distributions introduce dependence between the points that promotes diversity in the selected features $\boldsymbol{\varphi}(x_i)$. We first provide a generalized version of volume-rescaled sampling yielding quasi-optimality results in expectation with a number of samples $n = O(m\log(m))$, meaning that the expected $L^2$ error is bounded by a constant times the best approximation error in $L^2$. Further assuming that the function is in some normed vector space $H$ continuously embedded in $L^2$, we prove that the approximation error in $L^2$ is almost surely bounded by the best approximation error measured in the $H$-norm. This includes the cases of functions from $L^\infty$ or reproducing kernel Hilbert spaces. Finally, we present an alternative strategy consisting in using independent repetitions of projection DPP (or volume sampling), yielding similar error bounds as with i.i.d. or volume sampling, but in practice with a much lower number of samples. Numerical experiments illustrate the performance of the different strategies.
NA May 3, 2025
Surrogate to Poincaré inequalities on manifolds for dimension reduction in nonlinear feature spaces
Anthony Nouy, Alexandre Pasco
We aim to approximate a continuously differentiable function $u:\mathbb{R}^d \rightarrow \mathbb{R}$ by a composition of functions $f\circ g$ where $g:\mathbb{R}^d \rightarrow \mathbb{R}^m$, $m\leq d$, and $f : \mathbb{R}^m \rightarrow \mathbb{R}$ are built in a two stage procedure. For a fixed $g$, we build $f$ using classical regression methods, involving evaluations of $u$. Recent works proposed to build a nonlinear $g$ by minimizing a loss function $\mathcal{J}(g)$ derived from Poincaré inequalities on manifolds, involving evaluations of the gradient of $u$. A problem is that minimizing $\mathcal{J}$ may be a challenging task. Hence in this work, we introduce new convex surrogates to $\mathcal{J}$. Leveraging concentration inequalities, we provide sub-optimality results for a class of functions $g$, including polynomials, and a wide class of input probability measures. We investigate performances on different benchmarks for various training sample sizes. We show that our approach outperforms standard iterative methods for minimizing the training Poincaré inequality based loss, often resulting in better approximation errors, especially for rather small training sets and $m=1$.
FA Jan 28, 2021
Approximation Theory of Tree Tensor Networks: Tensorized Multivariate Functions
Mazen Ali, Anthony Nouy
We study the approximation of multivariate functions with tensor networks (TNs), providing some answers to the following two questions: "what are the approximation capabilities of TNs for functions from classical smoothness classes?" and "what are the properties of the class of functions that can be approximated with TNs with a certain performance?" As a partial answer to the former, we show that TNs can (near to) optimally replicate $h$-uniform and $h$-adaptive spline approximation, for any smoothness order of the target function. Tensor networks thus exhibit universal expressivity w.r.t. isotropic, anisotropic and mixed smoothness spaces that is comparable with more general neural network families such as deep rectified linear unit (ReLU) networks. Put differently, TNs have the capacity to (near to) optimally approximate many function classes without being adapted to the particular class in question. As a partial answer to the latter, as a candidate model class we consider approximation classes of TNs and show that these are (quasi-)Banach spaces, that many types of classical smoothness spaces are continuously embedded into said approximation classes and that TN approximation classes are themselves not embedded in any classical smoothness space. In other words, TNs can efficiently approximate functions that lie beyond classical smoothness spaces.
FA Jul 30, 2020
Approximation of Smoothness Classes by Deep Rectifier Networks
Mazen Ali, Anthony Nouy
We consider approximation rates of sparsely connected deep rectified linear unit (ReLU) and rectified power unit (RePU) neural networks for functions in Besov spaces $B^\alpha_{q}(L^p)$ in arbitrary dimension $d$, on general domains. We show that deep rectifier networks with a fixed activation function attain optimal or near to optimal approximation rates for functions in the Besov space $B^\alpha_\tau(L^\tau)$ on the critical embedding line $1/\tau=\alpha/d+1/p$ for arbitrary smoothness order $\alpha>0$. Using interpolation theory, this implies that the entire range of smoothness classes at or above the critical line is (near to) optimally approximated by deep ReLU/RePU networks.
OC Jul 30, 2020
A PAC algorithm in relative precision for bandit problem with costly sampling
Marie Billaud-Friess, Arthur Macherey, Anthony Nouy et al.
This paper considers the problem of maximizing an expectation function over a finite set, or finite-arm bandit problem. We first propose a naive stochastic bandit algorithm for obtaining a probably approximately correct (PAC) solution to this discrete optimization problem in relative precision, that is, a solution which solves the optimization problem up to a relative error smaller than a prescribed tolerance, with high probability. We also propose an adaptive stochastic bandit algorithm which provides a PAC-solution with the same guarantees. The adaptive algorithm outperforms the mean complexity of the naive algorithm in terms of number of generated samples and is particularly well suited for applications with high sampling cost.
ST Jul 2, 2020
Learning with tree tensor networks: complexity estimates and model selection
Bertrand Michel, Anthony Nouy
Tree tensor networks, or tree-based tensor formats, are prominent model classes for the approximation of high-dimensional functions in computational and data science. They correspond to sum-product neural networks with a sparse connectivity associated with a dimension tree and widths given by a tuple of tensor ranks. The approximation power of these models has been proved to be (near to) optimal for classical smoothness classes. However, in an empirical risk minimization framework with a limited number of observations, the dimension tree and ranks should be selected carefully to balance estimation and approximation errors. We propose and analyze a complexity-based model selection method for tree tensor networks in an empirical risk minimization framework and we analyze its performance over a wide range of smoothness classes. Given a family of model classes associated with different trees, ranks, tensor product feature spaces and sparsity patterns for sparse tensor networks, a model is selected (à la Barron, Birgé, Massart) by minimizing a penalized empirical risk, with a penalty depending on the complexity of the model class and derived from estimates of the metric entropy of tree tensor networks. This choice of penalty yields a risk bound for the selected predictor. In a least-squares setting, after deriving fast rates of convergence of the risk, we show that our strategy is (near to) minimax adaptive to a wide range of smoothness classes including Sobolev or Besov spaces (with isotropic, anisotropic or mixed dominating smoothness) and analytic functions. We discuss the role of sparsity of the tensor network for obtaining optimal performance in several regimes. In practice, the amplitude of the penalty is calibrated with a slope heuristics method. Numerical experiments in a least-squares regression setting illustrate the performance of the strategy.
FA Jun 30, 2020
Approximation Theory of Tree Tensor Networks: Tensorized Univariate Functions -- Part II
Mazen Ali, Anthony Nouy
We study the approximation by tensor networks (TNs) of functions from classical smoothness classes. The considered approximation tool combines a tensorization of functions in $L^p([0,1))$, which allows one to identify a univariate function with a multivariate function (or tensor), and the use of tree tensor networks (the tensor train format) for exploiting low-rank structures of multivariate functions. The resulting tool can be interpreted as a feed-forward neural network, with first layers implementing the tensorization, interpreted as a particular featuring step, followed by a sum-product network with sparse architecture. In part I of this work, we presented several approximation classes associated with different measures of complexity of tensor networks and studied their properties. In this work (part II), we show how classical approximation tools, such as polynomials or splines (with fixed or free knots), can be encoded as a tensor network with controlled complexity. We use this to derive direct (Jackson) inequalities for the approximation spaces of tensor networks. This is then utilized to show that Besov spaces are continuously embedded into these approximation spaces. In other words, we show that arbitrary Besov functions can be approximated with optimal or near to optimal rate. We also show that an arbitrary function in the approximation class possesses no Besov smoothness, unless one limits the depth of the tensor network.
FA Jun 30, 2020
Approximation Theory of Tree Tensor Networks: Tensorized Univariate Functions -- Part I
Mazen Ali, Anthony Nouy
We study the approximation of functions by tensor networks (TNs). We show that Lebesgue $L^p$-spaces in one dimension can be identified with tensor product spaces of arbitrary order through tensorization. We use this tensor product structure to define subsets of $L^p$ of rank-structured functions of finite representation complexity. These subsets are then used to define different approximation classes of tensor networks, associated with different measures of complexity. These approximation classes are shown to be quasi-normed linear spaces. We study some elementary properties and relationships of said spaces. In part II of this work, we will show that classical smoothness (Besov) spaces are continuously embedded into these approximation classes. We will also show that functions in these approximation classes do not possess any Besov smoothness, unless one restricts the depth of the tensor networks. The results of this work are both an analysis of the approximation spaces of TNs and a study of the expressivity of a particular type of neural networks (NN) -- namely feed-forward sum-product networks with sparse architecture. The input variables of this network result from the tensorization step, interpreted as a particular featuring step which can also be implemented with a neural network with a specific architecture. We point out interesting parallels to recent results on the expressivity of rectified linear unit (ReLU) networks -- currently one of the most popular types of NNs.
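The tensorization step referred to in the abstract above can be made concrete with a small sketch. It assumes the usual base-$b$ encoding $x = \sum_{k=1}^{d} i_k b^{-k} + b^{-d} y$ of a point $x \in [0,1)$ by $d$ digits $i_k \in \{0,\dots,b-1\}$ and a local variable $y \in [0,1)$. The function names below are hypothetical and chosen for this example only.

```python
def tensorize_point(x, d, b=2):
    """Split x in [0, 1) into d base-b digits and a remainder y in [0, 1)."""
    digits = []
    y = x
    for _ in range(d):
        y *= b
        i = int(y)      # next digit of the base-b expansion
        digits.append(i)
        y -= i
    return digits, y

def untensorize_point(digits, y, b=2):
    """Inverse map: rebuild x from its digits and the local variable y."""
    d = len(digits)
    return sum(i * b ** -(k + 1) for k, i in enumerate(digits)) + b ** -d * y

def tensorize_function(f, d, b=2):
    """Identify a univariate f with a function of (i_1, ..., i_d, y)."""
    return lambda *args: f(untensorize_point(list(args[:-1]), args[-1], b))

digits, y = tensorize_point(0.8125, 4)
print(digits, y)

# The tensorized function takes d digits followed by the local variable y.
square = tensorize_function(lambda x: x * x, 3)
print(square(1, 0, 1, 0.5))
```

The tensorized function is an order-$(d+1)$ tensor in the variables $(i_1,\dots,i_d,y)$, which is the object that low-rank formats such as the tensor train format are then applied to.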
ML Dec 17, 2019
Learning high-dimensional probability distributions using tree tensor networks
Erwan Grelier, Anthony Nouy, Régis Lebrun
We consider the problem of estimating a high-dimensional probability distribution from i.i.d. samples of the distribution using model classes of functions in tree-based tensor formats, a particular case of tensor networks associated with a dimension partition tree. The distribution is assumed to admit a density with respect to a product measure, possibly discrete for handling the case of discrete random variables. After discussing the representation of classical model classes in tree-based tensor formats, we present learning algorithms based on empirical risk minimization using an $L^2$ contrast. These algorithms exploit the multilinear parametrization of the formats to recast the nonlinear minimization problem into a sequence of empirical risk minimization problems with linear models. A suitable parametrization of the tensor in tree-based tensor format allows one to obtain a linear model with orthogonal bases, so that each problem admits an explicit expression of the solution and cross-validation risk estimates. These estimates of the risk enable model selection, for instance when exploiting sparsity in the coefficients of the representation. A strategy for the adaptation of the tensor format (dimension tree and tree-based ranks) is provided, which allows one to discover and exploit some specific structures of high-dimensional probability distributions such as independence or conditional independence. We illustrate the performance of the proposed algorithms for the approximation of classical probabilistic models (such as Gaussian distributions, graphical models, Markov chains).
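As a worked instance of why orthogonal bases give an explicit solution, consider the simplified case of a single linear model $g = \sum_{j=1}^m a_j \varphi_j$, where the $\varphi_j$ are assumed orthonormal in $L^2_\mu$ and $\mu$ is the reference product measure. The notation ($a_j$, $\varphi_j$, $m$, $\widehat{R}_n$) is introduced here for illustration, and the contrast is assumed to be the standard one, $\gamma(g,x) = \|g\|_{L^2_\mu}^2 - 2g(x)$. Given samples $x_1, \dots, x_n$, the empirical risk and its minimizer then read

$$
\widehat{R}_n(g) = \|g\|_{L^2_\mu}^2 - \frac{2}{n}\sum_{i=1}^n g(x_i)
= \sum_{j=1}^m a_j^2 - \frac{2}{n}\sum_{i=1}^n \sum_{j=1}^m a_j \varphi_j(x_i),
\qquad
\hat a_j = \frac{1}{n}\sum_{i=1}^n \varphi_j(x_i).
$$

Each coefficient is an empirical mean of a basis function, so no linear system has to be solved.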
NA May 14, 2019
Stochastic methods for solving high-dimensional partial differential equations
Marie Billaud-Friess, Arthur Macherey, Anthony Nouy et al.
We propose algorithms for solving high-dimensional Partial Differential Equations (PDEs) that combine a probabilistic interpretation of PDEs, through Feynman-Kac representation, with sparse interpolation. Monte-Carlo methods and time-integration schemes are used to estimate pointwise evaluations of the solution of a PDE. We use a sequential control variates algorithm, where control variates are constructed based on successive approximations of the solution of the PDE. Two different algorithms are proposed, combining in different ways the sequential control variates algorithm and adaptive sparse interpolation. Numerical examples will illustrate the behavior of these algorithms.
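The pointwise Monte-Carlo evaluation can be sketched in the simplest setting, a deliberately reduced case that is not the paper's algorithm: the heat equation $\partial_t u = \tfrac12 \Delta u$ with initial condition $u(0,\cdot) = g$ is assumed, for which the Feynman-Kac representation is $u(t,x) = \mathbb{E}[g(x + W_t)]$. Brownian motion is sampled exactly here, so no time-integration scheme, control variate or sparse interpolation is involved. The function names and the test case $g(x) = |x|^2$ are illustrative choices.

```python
import numpy as np


def feynman_kac_heat(g, x, t, n_samples, rng):
    """Monte Carlo estimate of u(t, x) = E[g(x + W_t)], the solution of
    du/dt = 0.5 * Laplacian(u) with u(0, .) = g, where W is a standard
    Brownian motion in dimension len(x)."""
    x = np.asarray(x, dtype=float)
    # W_t is Gaussian with covariance t * I, so it can be sampled directly.
    increments = np.sqrt(t) * rng.standard_normal((n_samples, x.size))
    return float(np.mean(g(x + increments)))


def squared_norm(points):
    """g(x) = |x|^2, evaluated row-wise on an array of points."""
    return np.sum(points**2, axis=-1)


rng = np.random.default_rng(0)
dim, t = 10, 0.5
x0 = np.ones(dim)
estimate = feynman_kac_heat(squared_norm, x0, t, n_samples=200_000, rng=rng)
exact = float(np.sum(x0**2) + dim * t)  # closed form for g(x) = |x|^2
```

The cost of one evaluation grows linearly with the dimension, which is what makes such estimators attractive as building blocks in high dimension; their statistical error is what variance reduction techniques such as control variates aim to lower.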
ML Nov 11, 2018
Learning with tree-based tensor formats
Erwan Grelier, Anthony Nouy, Mathilde Chevreuil
This paper is concerned with the approximation of high-dimensional functions in a statistical learning setting, by empirical risk minimization over model classes of functions in tree-based tensor format. These are particular classes of rank-structured functions that can be seen as deep neural networks with a sparse architecture related to the tree and multilinear activation functions. For learning in a given model class, we exploit the fact that tree-based tensor formats are multilinear models and recast the problem of risk minimization over a nonlinear set into a succession of learning problems with linear models. Suitable changes of representation yield numerically stable learning problems and allow sparsity to be exploited. For high-dimensional problems or when only a small data set is available, the selection of a good model class is a critical issue. For a given tree, the selection of the tuple of tree-based ranks that minimizes the risk is a combinatorial problem. Here, we propose a rank adaptation strategy which provides in practice a good convergence of the risk as a function of the model class complexity. Finding a good tree is also a combinatorial problem, which can be related to the choice of a particular sparse architecture for deep neural networks. Here, we propose a stochastic algorithm for minimizing the complexity of the representation of a given function over a class of trees with a given arity, allowing changes in the topology of the tree. This tree optimization algorithm is then included in a learning scheme that successively adapts the tree and the corresponding tree-based ranks. Contrary to classical learning algorithms for nonlinear model classes, the proposed algorithms are numerically stable, reliable, and require only a low level of expertise from the user.
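The idea of recasting risk minimization over a multilinear model into a succession of linear learning problems can be sketched on the smallest possible case, a rank-one model in two variables fitted by alternating least squares. This is a simplified illustration under stated assumptions, not the paper's algorithm: the monomial basis, the function names and the noise-free rank-one target are hypothetical choices, and no orthogonalization, sparsity, rank adaptation or tree adaptation is performed.

```python
import numpy as np


def features(x, degree=2):
    """Monomial features 1, x, ..., x**degree (a hypothetical basis choice)."""
    return np.vander(x, degree + 1, increasing=True)


def fit_rank_one(x1, x2, y, n_sweeps=30):
    """Fit y ~ (features(x1) @ a) * (features(x2) @ b) by alternating
    least squares: with one factor frozen the model is linear in the other."""
    phi1, phi2 = features(x1), features(x2)
    b = np.zeros(phi2.shape[1])
    b[0] = 1.0  # start from the constant function in x2
    for _ in range(n_sweeps):
        # Linear least-squares problem in a, with b fixed.
        design_a = phi1 * (phi2 @ b)[:, None]
        a = np.linalg.lstsq(design_a, y, rcond=None)[0]
        # Linear least-squares problem in b, with a fixed.
        design_b = phi2 * (phi1 @ a)[:, None]
        b = np.linalg.lstsq(design_b, y, rcond=None)[0]
    return a, b


def target(x1, x2):
    """Noise-free target that is exactly rank one in the chosen basis."""
    return (1.0 + x1) * (2.0 - x2**2)


rng = np.random.default_rng(0)
x1_train, x2_train = rng.uniform(-1, 1, 200), rng.uniform(-1, 1, 200)
a, b = fit_rank_one(x1_train, x2_train, target(x1_train, x2_train))

x1_test, x2_test = rng.uniform(-1, 1, 1000), rng.uniform(-1, 1, 1000)
prediction = (features(x1_test) @ a) * (features(x2_test) @ b)
max_error = float(np.max(np.abs(prediction - target(x1_test, x2_test))))
```

Each inner step is an ordinary linear regression, so standard tools for linear models (explicit solutions, cross-validation, sparse regularization) become available at every step even though the overall model class is nonlinear.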
NA Oct 10, 2018
Low-rank approximation of linear parabolic equations by space-time tensor Galerkin methods
Thomas Boiveau, Virginie Ehrlacher, Alexandre Ern et al.
We devise a space-time tensor method for the low-rank approximation of linear parabolic evolution equations. The proposed method is a stable Galerkin method, uniformly in the discretization parameters, based on a Minimal Residual formulation of the evolution problem in Hilbert--Bochner spaces. The discrete solution is sought in a trial space composed of tensors of discrete functions in space and in time and is characterized as the unique minimizer of a discrete functional where the dual norm of the residual is evaluated in a space semi-discrete test space. The resulting global space-time linear system is solved iteratively by a greedy algorithm. Numerical results are presented to illustrate the performance of the proposed method on test cases including non-selfadjoint and time-dependent differential operators in space. The results are also compared to those obtained using a fully discrete Petrov--Galerkin setting to evaluate the dual residual norm.
NA Jun 22, 2015
Geometric Structures in Tensor Representations (Final Release)
Antonio Falco, Wolfgang Hackbusch, Anthony Nouy
The main goal of this paper is to study the geometric structures associated with the representation of tensors in subspace based formats. To do this we use a property of the so-called minimal subspaces which allows us to describe the tensor representation by means of a rooted tree. By using the tree structure and the dimensions of the associated minimal subspaces, we introduce, in the underlying algebraic tensor space, the set of tensors in a tree-based format with either bounded or fixed tree-based rank. This class contains the Tucker format and the Hierarchical Tucker format (including the Tensor Train format). In particular, we show that the set of tensors in the tree-based format with bounded (respectively, fixed) tree-based rank of an algebraic tensor product of normed vector spaces is an analytic Banach manifold. Indeed, the manifold geometry for the set of tensors with fixed tree-based rank is induced by a fibre bundle structure and the manifold geometry for the set of tensors with bounded tree-based rank is given by a finite union of connected components. In order to describe the relationship between these manifolds and the natural ambient space, we introduce the definition of topological tensor spaces in the tree-based format. We prove under natural conditions that any tensor of the topological tensor space under consideration admits best approximations in the manifold of tensors in the tree-based format with bounded tree-based rank. In this framework, we also show that the tangent (Banach) space at a given tensor is a complemented subspace in the natural ambient tensor Banach space and hence the set of tensors in the tree-based format with bounded (respectively, fixed) tree-based rank is an immersed submanifold. This fact allows us to extend the Dirac-Frenkel variational principle in the framework of topological tensor spaces.
NA Apr 30, 2013
A least-squares method for sparse low rank approximation of multivariate functions
Mathilde Chevreuil, Régis Lebrun, Anthony Nouy et al.
In this paper, we propose a low-rank approximation method based on discrete least-squares for the approximation of a multivariate function from random, noise-free observations. Sparsity-inducing regularization techniques are used within classical algorithms for low-rank approximation in order to exploit the possible sparsity of low-rank approximations. Sparse low-rank approximations are constructed with a robust updated greedy algorithm which includes an optimal selection of regularization parameters and approximation ranks using cross-validation techniques. Numerical examples demonstrate the capability of approximating functions of many variables even when very few function evaluations are available, which shows the relevance of the proposed algorithm for the propagation of uncertainties through complex computational models.