Gonçalo dos Reis

h-index: 18

4 papers

29 citations

Novelty: 45%

AI Score: 25

Ranked #164,072 of 194,257 authors (top 84%) · #27 in PR (top 57%)

4 Papers

1.2 · NA · May 10, 2017

Hybrid PDE solver for data-driven problems and modern branching

Francisco Bernal, Gonçalo dos Reis, Greig Smith

The numerical solution of large-scale PDEs, such as those occurring in data-driven applications, unavoidably requires powerful parallel computers and tailored parallel algorithms to make the best possible use of them. In fact, considerations about the parallelization and scalability of realistic problems are often critical enough to warrant acknowledgement in the modelling phase. The purpose of this paper is to spread awareness of the Probabilistic Domain Decomposition (PDD) method, a fresh approach to the parallelization of PDEs with excellent scalability properties. The idea exploits the stochastic representation of the PDE and its approximation via Monte Carlo in combination with deterministic high-performance PDE solvers. We describe the ingredients of PDD and its applicability in the scope of data science. In particular, we highlight recent advances in stochastic representations for nonlinear PDEs using branching diffusions, which have significantly broadened the scope of PDD. We envision this work as a dictionary giving large-scale PDE practitioners references on the very latest algorithms and techniques of a non-standard, yet highly parallelizable, methodology at the interface of deterministic and probabilistic numerical methods. We close this work with an invitation to the fully nonlinear case and open research questions.
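The combination described in this abstract (Monte Carlo values at subdomain interfaces, then independent deterministic solves) can be illustrated on a toy problem. The sketch below is not the authors' implementation: it assumes the 1D problem 0.5*u'' + f = 0 on (0, 1) with zero boundary values and f = 1, replaces the diffusion by a lattice random walk, and uses hypothetical helper names (`mc_node_value`, `solve_subdomain`).

```python
import numpy as np

def source(x):
    # Assumed source term f = 1, so the exact solution is u(x) = x * (1 - x).
    return np.ones_like(x)

def mc_node_value(start, n, f, g0, g1, n_paths, rng):
    """Monte Carlo estimate of u at grid node `start` for 0.5*u'' + f = 0.

    A symmetric lattice walk with step h stands in for Brownian motion; each
    step accumulates h*h*f(x), the discrete analogue of integrating f in time.
    """
    h = 1.0 / n
    pos = np.full(n_paths, start, dtype=np.int64)
    acc = np.zeros(n_paths)
    alive = np.flatnonzero((pos > 0) & (pos < n))
    while alive.size:
        acc[alive] += h * h * f(pos[alive] * h)
        pos[alive] += rng.choice(np.array([-1, 1]), size=alive.size)
        alive = alive[(pos[alive] > 0) & (pos[alive] < n)]
    # Add the boundary value at the exit point of each walk.
    return float(np.mean(acc + np.where(pos == 0, g0, g1)))

def solve_subdomain(a, b, n, f, ua, ub):
    """Finite-difference solve between grid nodes a and b with Dirichlet data."""
    h = 1.0 / n
    m = b - a - 1  # number of interior unknowns
    x = np.arange(a + 1, b) * h
    A = 2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)
    rhs = 2.0 * h * h * f(x)
    rhs[0] += ua
    rhs[-1] += ub
    return np.concatenate(([ua], np.linalg.solve(A, rhs), [ub]))

n, mid = 20, 10
rng = np.random.default_rng(0)
# Step 1: probabilistic estimate at the interface node only.
u_mid = mc_node_value(mid, n, source, 0.0, 0.0, 20_000, rng)
# Step 2: the two subdomains are now decoupled and could be solved in parallel.
left = solve_subdomain(0, mid, n, source, 0.0, u_mid)
right = solve_subdomain(mid, n, n, source, u_mid, 0.0)
u = np.concatenate((left, right[1:]))
x = np.linspace(0.0, 1.0, n + 1)
max_err = float(np.max(np.abs(u - x * (1.0 - x))))
```

In this toy setting the only error in the final solution is the statistical error of the interface estimate, which is why the Monte Carlo step controls the overall accuracy.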

2.6 · LG · Aug 16, 2024

A Mean Field Ansatz for Zero-Shot Weight Transfer

Xingyuan Chen, Wenwei Kuang, Lei Deng et al.

The pre-training cost of large language models (LLMs) is prohibitive. One cutting-edge approach to reduce the cost is zero-shot weight transfer, also known as model growth in some cases, which transfers the weights trained in a small model to a large model. However, there are still some theoretical mysteries behind weight transfer. In this paper, inspired by prior applications of mean field theory to neural network dynamics, we introduce a mean field ansatz to provide a theoretical explanation for weight transfer. Specifically, we propose the row-column (RC) ansatz under the mean field point of view, which describes the measure structure of the weights in the neural network (NN) and admits a closed measure dynamic. Thus, the weights of NNs of different sizes admit a common distribution under proper assumptions, and weight transfer methods can be viewed as sampling methods. We empirically validate the RC ansatz by exploring simple MLP examples and LLMs such as GPT-3 and Llama-3.1. We show that the mean-field point of view is adequate under suitable assumptions, which can provide theoretical support for zero-shot weight transfer.
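The statement that weight transfer can be viewed as sampling can be made concrete with a deliberately naive sketch. It is an assumption-laden illustration, not the paper's procedure: the helper `transfer_by_sampling` is hypothetical, rows and columns of the small weight matrix are resampled with replacement, and any width-dependent rescaling is omitted.

```python
import numpy as np

def transfer_by_sampling(w_small, n_out, n_in, rng):
    """Build an (n_out, n_in) matrix by resampling rows and columns of w_small.

    Every entry of the result is an entry of w_small, so the larger matrix is
    drawn from the empirical row-column structure of the smaller one.
    """
    rows = rng.integers(0, w_small.shape[0], size=n_out)
    cols = rng.integers(0, w_small.shape[1], size=n_in)
    return w_small[np.ix_(rows, cols)]

rng = np.random.default_rng(0)
w_small = rng.standard_normal((64, 64))  # stand-in for a trained small layer
w_large = transfer_by_sampling(w_small, 256, 256, rng)
```

Under this reading, the small and large layers share an entry distribution by construction; whether that distribution is the right one is exactly what an ansatz like the one in the abstract has to justify.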

1.2 · PR · Apr 29, 2019

An unbiased Ito type stochastic representation for transport PDEs: A Toy Example

Goncalo dos Reis, Greig Smith

We propose a stochastic representation for a simple class of transport PDEs based on Ito representations. We detail an algorithm using an estimator stemming from the representation that, unlike regularization by noise estimators, is unbiased. We rely on recent developments on branching diffusions, regime switching processes and their representations of PDEs. There is a loose relation between our technique and regularization by noise, but contrary to the latter, we add a perturbation and immediately apply its correction. The method is only possible through a judicious choice of the diffusion coefficient $σ$. A key feature is that our approach does not rely on the smallness of $σ$; in fact, our $σ$ is strictly bounded from below, which is in stark contrast with standard perturbation techniques. This is critical for extending this method to non-toy PDEs which have nonlinear terms in the first derivative where the usual perturbation technique breaks down. The examples presented show the algorithm outperforming alternative approaches. Moreover, the examples point toward a potential algorithm for the fully nonlinear case where the method of characteristics breaks down.
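The bias that the abstract attributes to regularization by noise can be seen in a small experiment. The sketch below does not implement the paper's unbiased estimator; it only shows the baseline being contrasted, under assumptions chosen for illustration: the linear transport PDE u_t + b u_x = 0 with terminal condition g(y) = y^2, constant b, and hypothetical function names.

```python
import numpy as np

def exact_transport(g, x, b, T):
    # Method of characteristics: u(0, x) = g(x + b*T) for u_t + b*u_x = 0.
    return g(x + b * T)

def noise_regularised_estimate(g, x, b, T, sigma, n_samples, rng):
    """Monte Carlo estimate for the perturbed PDE u_t + b*u_x + 0.5*sigma^2*u_xx = 0.

    By Feynman-Kac, its solution is E[g(x + b*T + sigma*W_T)], which differs
    from the transport solution unless sigma is sent to zero.
    """
    w_T = rng.standard_normal(n_samples) * np.sqrt(T)
    return float(np.mean(g(x + b * T + sigma * w_T)))

def g(y):
    return y ** 2

rng = np.random.default_rng(0)
x0, b, T, sigma = 1.0, 1.0, 1.0, 0.5
exact = exact_transport(g, x0, b, T)
estimate = noise_regularised_estimate(g, x0, b, T, sigma, 200_000, rng)
bias = estimate - exact  # theory for this g: sigma**2 * T
```

For this quadratic terminal condition the bias equals sigma^2 * T exactly, so it does not shrink with more samples; it only shrinks as sigma does, which is the dependence on small sigma that the abstract says the proposed method avoids.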

1.2 · PR · Jul 22, 2016

Convergence and qualitative properties of modified explicit schemes for BSDEs with polynomial growth

Arnaud Lionnet, Gonçalo dos Reis, Lukasz Szpruch

The theory of Forward-Backward Stochastic Differential Equations (FBSDEs) paves a way to probabilistic numerical methods for nonlinear parabolic PDEs. The majority of the results on numerical methods for FBSDEs rely on the global Lipschitz assumption, which is not satisfied for a number of important cases such as the Fisher-KPP or the FitzHugh-Nagumo equations. Furthermore, it has been shown in [LionnetReisSzpruch2015] that for BSDEs with monotone drivers having polynomial growth in the primary variable $y$, only the (sufficiently) implicit schemes converge. But these require an additional computational effort compared to explicit schemes. This article develops a general framework that allows the analysis, in a systematic fashion, of the integrability properties, convergence and qualitative properties (e.g. comparison theorem) for whole families of modified explicit schemes. The framework yields the convergence of some modified explicit schemes with the same rate as implicit schemes and with the computational cost of the standard explicit scheme. To illustrate our theory, we present several classes of easily implementable modified explicit schemes that can computationally outperform the implicit one and preserve the qualitative properties of the solution to the BSDE. These classes fit into our developed framework and are tested in computational experiments.
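Why a plain explicit scheme struggles with polynomial growth, and how a modification can help, can be sketched in the simplest deterministic special case. The assumptions here are ours, not the paper's: a deterministic terminal value (so no conditional expectations or Z component), the monotone driver f(y) = -y^3, and a "tamed" increment as one example of a modified explicit step, which need not coincide with the schemes analysed in the paper.

```python
import math

def driver(y):
    # Monotone driver with cubic growth.
    return -y * y * y

def explicit_step(y_next, h):
    # Standard explicit backward Euler step: Y_i = Y_{i+1} + h * f(Y_{i+1}).
    return y_next + h * driver(y_next)

def tamed_step(y_next, h):
    # Tamed increment: bounded by 1 in absolute value whatever the size of f.
    fy = driver(y_next)
    return y_next + h * fy / (1.0 + h * abs(fy))

def run_backward(step, xi, T, n_steps):
    """Iterate a one-step scheme from the terminal value xi back to time 0."""
    h = T / n_steps
    y = xi
    for _ in range(n_steps):
        y = step(y, h)
        if abs(y) > 1e12:  # stop once the iteration has clearly blown up
            break
    return y

xi, T = 10.0, 1.0
exact = xi / math.sqrt(1.0 + 2.0 * xi * xi * T)  # closed form for this driver
explicit_coarse = run_backward(explicit_step, xi, T, 10)
tamed_coarse = run_backward(tamed_step, xi, T, 10)
tamed_fine = run_backward(tamed_step, xi, T, 10_000)
```

On the coarse grid the explicit iteration overshoots and diverges within a few steps, while the tamed iteration stays between 0 and the terminal value at the same cost per step, and it approaches the closed-form solution as the grid is refined.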