Gerrit Welper

h-index 8

9 papers

49 citations

Novelty 48%

AI Score 26

Ranked #161,626 of 194,257 authors (top 83%); #1,016 in NA (top 42%)

9 Papers

1.2 | NA | Jan 2, 2016

Adaptive Anisotropic Petrov-Galerkin Methods for First Order Transport Equations

W. Dahmen, G. Kutyniok, W.-Q. Lim et al.

This paper builds on recent developments of adaptive methods for linear transport equations based on certain stable variational formulations of Petrov-Galerkin type. The variational formulations allow us to employ meshes with cells of arbitrary aspect ratios. We develop a refinement scheme, inspired by shearlet systems, that generates highly anisotropic partitions. We establish approximation rates for N-term approximations from corresponding piecewise polynomials for certain compact cartoon classes of functions. In contrast to earlier results in a curvelet or shearlet context, the cartoon classes are concisely defined through certain characteristic parameters, and the dependence of the approximation rates on these parameters is made explicit here. The approximation rate results then serve as a benchmark for subsequent applications to adaptive Galerkin solvers for transport equations. In numerical experiments, the new algorithms track C^2-curved shear layers and discontinuities stably and accurately, and realize essentially optimal rates. Finally, we treat parameter dependent transport problems, which arise in kinetic models as well as in radiative transfer. In heterogeneous media these problems feature propagation of singularities along curved characteristics, precluding, in particular, fast marching methods based on ray-tracing. Since the solutions are now functions of spatial variables and parameters, one has to address the curse of dimensionality. We show computationally, for a model parametric transport problem in heterogeneous media in 2 + 1 dimensions, that sparse tensorization of the proposed directionally adaptive spatial scheme with hierarchic collocation in ordinate space, based on a stable variational formulation in high-dimensional phase space, removes the curse of dimensionality when approximating averaged bulk quantities.

2.3 | NA | Sep 28, 2014

Efficient Resolution of Anisotropic Structures

Wolfgang Dahmen, Chunyan Huang, Gitta Kutyniok et al.

We highlight some recent developments concerning the sparse representation of possibly high-dimensional functions exhibiting strong anisotropic features and low regularity in isotropic Sobolev or Besov scales. Specifically, we focus on the solution of transport equations which exhibit propagation of singularities where, additionally, high-dimensionality enters when the convection field, and hence the solutions, depend on parameters varying over some compact set. Important constituents of our approach are directionally adaptive discretization concepts motivated by compactly supported shearlet systems, and well-conditioned stable variational formulations that support trial spaces with anisotropic refinements with arbitrary directionalities. We prove that they provide tight error-residual relations, which are used to construct rigorously founded adaptive refinement schemes that converge in $L_2$. Moreover, in the context of parameter dependent problems we discuss two approaches serving different purposes and working under different regularity assumptions. For frequent query problems, making essential use of the novel well-conditioned variational formulations, a new Reduced Basis Method is outlined which exhibits a certain rate-optimal performance for indefinite, unsymmetric or singularly perturbed problems. For the radiative transfer problem with scattering, a sparse tensor method is presented which mitigates or even overcomes the curse of dimensionality under suitable (so far still isotropic) regularity assumptions. Numerical examples for both methods illustrate the theoretical findings.

1.2 | NA | Oct 31, 2017

$h$ and $hp$-adaptive Interpolation by Transformed Snapshots for Parametric and Stochastic Hyperbolic PDEs

G. Welper

The numerical approximation of solutions of parametric or stochastic hyperbolic PDEs is still a serious challenge. Because of shock singularities, most methods from the elliptic and parabolic regime, such as reduced basis methods, POD or polynomial chaos expansions, perform poorly. Recently, Welper [Interpolation of functions with parameter dependent jumps by transformed snapshots. SIAM Journal on Scientific Computing, 39(4):A1225-A1250, 2017] introduced a new approximation method, based on the alignment of the jump sets of the snapshots. If the structure of the jump sets changes with the parameter, this assumption is too restrictive. However, these changes are typically local in parameter space, so in this paper we explore $h$ and $hp$-adaptive methods to resolve them. Since local refinements do not scale to high dimensions, we introduce an alternative "tensorized" adaptation method.

5.3 | LG | Sep 9, 2023

Approximation Results for Gradient Descent trained Neural Networks

G. Welper

The paper contains approximation guarantees for neural networks that are trained with gradient flow, with error measured in the continuous $L_2(\mathbb{S}^{d-1})$-norm on the $d$-dimensional unit sphere and targets that are Sobolev smooth. The networks are fully connected, of constant depth and increasing width. Although all layers are trained, the gradient flow convergence is based on a neural tangent kernel (NTK) argument for the non-convex second-to-last layer. Unlike standard NTK analysis, the continuous error norm implies an under-parametrized regime, made possible by the natural smoothness assumption required for approximation. The typical over-parametrization re-enters the results in the form of a loss in approximation rate relative to established approximation methods for Sobolev smooth functions.

2.6 | LG | May 19, 2024

Approximation and Gradient Descent Training with Neural Networks

G. Welper

It is well understood that neural networks with carefully hand-picked weights provide powerful function approximation and that they can be successfully trained in over-parametrized regimes. Since over-parametrization ensures zero training error, these two theories are not immediately compatible. Recent work uses the smoothness that is required for approximation results to extend a neural tangent kernel (NTK) optimization argument to an under-parametrized regime and show direct approximation bounds for networks trained by gradient flow. Since gradient flow is only an idealization of a practical method, this paper establishes analogous results for networks trained by gradient descent.
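The relation between gradient flow and gradient descent mentioned in this abstract can be illustrated on a toy problem. The sketch below is not taken from the paper: it assumes a hypothetical one-dimensional quadratic loss `0.5 * lam * w**2`, for which the gradient flow has the closed form `w0 * exp(-lam * t)`, and treats gradient descent as the explicit Euler discretization of that flow with step size `h`. All names (`lam`, `w0`, `gradient_descent`, `gradient_flow`) are illustrative assumptions.

```python
import math


def gradient_flow(w0, lam, t):
    """Exact gradient flow for the assumed toy loss 0.5 * lam * w**2."""
    return w0 * math.exp(-lam * t)


def gradient_descent(w0, lam, h, steps):
    """Gradient descent with step size h on the same toy loss.

    Each step is one explicit Euler step of the gradient flow ODE
    dw/dt = -lam * w.
    """
    w = w0
    for _ in range(steps):
        w = w - h * lam * w  # gradient of 0.5 * lam * w**2 is lam * w
    return w


def gap_to_flow(h, w0=1.0, lam=2.0, t_end=1.0):
    """Distance between gradient descent and gradient flow at time t_end."""
    steps = round(t_end / h)
    return abs(gradient_descent(w0, lam, h, steps) - gradient_flow(w0, lam, t_end))
```

In this toy setting, `gap_to_flow(0.001)` is smaller than `gap_to_flow(0.1)`, reflecting that the discrete iteration approaches the idealized flow as the step size shrinks. The paper's analysis concerns neural network training, which this example does not attempt to reproduce.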

2.3 | NA | Jun 9, 2024

A Low Rank Neural Representation of Entropy Solutions

Donsub Rim, Gerrit Welper

We construct a new representation of entropy solutions to nonlinear scalar conservation laws with a smooth convex flux function in a single spatial dimension. The representation is a generalization of the method of characteristics and possesses a compositional form. While it is a nonlinear representation, the embedded dynamics of the solution in the time variable is linear. This representation is then discretized as a manifold of implicit neural representations where the feedforward neural network architecture has a low rank structure. Finally, we show that the low rank neural representation with a fixed number of layers and a small number of coefficients can approximate any entropy solution regardless of the complexity of the shock topology, while retaining the linearity of the embedded dynamics.
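For orientation, the classical method of characteristics that this representation generalizes can be sketched as follows. This is only the textbook method, not the paper's construction: it assumes the Burgers flux `f(u) = u**2 / 2` (one example of a smooth convex flux) and a hypothetical nondecreasing initial datum `u0`, so that characteristics do not cross and no shocks form. The function names and the bisection bracket are illustrative assumptions.

```python
def u0(x):
    """Hypothetical nondecreasing initial datum (no shocks develop)."""
    return x


def solve_characteristic(x, t, lo=-10.0, hi=10.0, iters=100):
    """Find the foot x0 of the characteristic through (x, t).

    For Burgers' equation the characteristic from x0 is the straight line
    x = x0 + t * u0(x0). With u0 nondecreasing, the left-hand side is
    increasing in x0, so bisection on the assumed bracket [lo, hi] works.
    """
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid + t * u0(mid) < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def burgers_solution(x, t):
    """Classical solution u(x, t) = u0(x0), constant along characteristics."""
    return u0(solve_characteristic(x, t))
```

For `u0(x) = x` the exact solution is `x / (1 + t)`, which gives a simple check of the sketch. Once characteristics cross, this classical construction breaks down, which is the situation the entropy-solution representation in the abstract is designed to handle.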

3.1 | LG | Jan 20, 2021

Non-Convex Compressed Sensing with Training Data

G. Welper

Efficient algorithms for the sparse solution of under-determined linear systems $Ax = b$ are known for matrices $A$ satisfying suitable assumptions like the restricted isometry property (RIP). Without such assumptions little is known and without any assumptions on $A$ the problem is $NP$-hard. A common approach is to replace $\ell_1$ by $\ell_p$ minimization for $0 < p < 1$, which is no longer convex and typically requires some form of local initial values for provably convergent algorithms. In this paper, we consider an alternative, where instead of suitable initial values we are provided with extra training problems $Ax = B_l$, $l=1, \dots, p$ that are related to our compressed sensing problem. They allow us to find the solution of the original problem $Ax = b$ with high probability in the range of a one layer linear neural network with comparatively few assumptions on the matrix $A$.

1.2 | LG | Jul 27, 2020

Universality of Gradient Descent Neural Network Training

G. Welper

It has been observed that design choices of neural networks are often crucial for their successful optimization. In this article, we therefore discuss the question of whether it is always possible to redesign a neural network so that it trains well with gradient descent. This yields the following universality result: If, for a given network, there is any algorithm that can find good network weights for a classification task, then there exists an extension of this network that reproduces these weights and the corresponding forward output by mere gradient descent training. The construction is not intended for practical computations, but it provides some orientation on the possibilities of meta-learning and related approaches.

1.2 | NA | May 6, 2015

Transformed snapshot interpolation

G. Welper

Functions with jumps and kinks, typically arising from parameter dependent or stochastic hyperbolic PDEs, are notoriously difficult to approximate. If the jump location in physical space is parameter dependent or random, standard approximation techniques like reduced basis methods, PODs, polynomial chaos, etc. are known to yield poor convergence rates. In order to improve these rates, we propose a new approximation scheme. Like reduced basis methods, it relies on snapshots for the reconstruction of parameter dependent functions, so that it is efficiently applicable in a PDE context. However, we allow a transformation of the physical coordinates before a snapshot is used in the reconstruction, which realigns the moving discontinuities and yields high convergence rates. The transforms are automatically computed by minimizing a training error. To show the feasibility of this approach, we test it in 1d and 2d numerical experiments.
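The effect of realigning discontinuities before interpolating can be seen in a minimal sketch. The example assumes a hypothetical toy function, a unit step whose jump sits at `x = mu`, and two snapshots at parameters `mu0` and `mu1`. Unlike the method in the abstract, the coordinate transform here is not computed by minimizing a training error; it is simply assumed to be the known shift that moves each snapshot's jump to the target location. All function names are illustrative.

```python
def snapshot(x, mu):
    """Toy parametric function: a unit step with its jump at x = mu."""
    return 1.0 if x < mu else 0.0


def plain_interpolation(x, mu, mu0, mu1):
    """Linear interpolation in mu between two snapshots, no transform."""
    w1 = (mu - mu0) / (mu1 - mu0)
    w0 = 1.0 - w1
    return w0 * snapshot(x, mu0) + w1 * snapshot(x, mu1)


def transformed_interpolation(x, mu, mu0, mu1):
    """Same interpolation, but each snapshot is evaluated at shifted
    coordinates so that its jump lands at x = mu.

    Assumption: the shift x -> x + (mu_i - mu) is known in advance here,
    whereas the paper computes the transforms automatically.
    """
    w1 = (mu - mu0) / (mu1 - mu0)
    w0 = 1.0 - w1
    return (w0 * snapshot(x + (mu0 - mu), mu0)
            + w1 * snapshot(x + (mu1 - mu), mu1))


def max_error(approx, mu, mu0, mu1, n=1024):
    """Maximum error against the true function on a uniform grid of [0, 1]."""
    xs = [i / n for i in range(n + 1)]
    return max(abs(approx(x, mu, mu0, mu1) - snapshot(x, mu)) for x in xs)
```

With `mu0 = 0.25`, `mu1 = 0.75` and target `mu = 0.5`, plain interpolation produces a staircase with two half-height jumps and a maximum error of 0.5, while the transformed version reproduces the step exactly in this idealized setting.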