AP | Mar 2, 2019
A blob method for diffusion
José Antonio Carrillo, Katy Craig, Francesco S. Patacchini
As a counterpoint to classical stochastic particle methods for diffusion, we develop a deterministic particle method for linear and nonlinear diffusion. At first glance, deterministic particle methods are incompatible with diffusive partial differential equations since initial data given by sums of Dirac masses would be smoothed instantaneously: particles do not remain particles. Inspired by classical vortex blob methods, we introduce a nonlocal regularization of our velocity field that ensures particles do remain particles, and we apply this to develop a numerical blob method for a range of diffusive partial differential equations of Wasserstein gradient flow type, including the heat equation, the porous medium equation, the Fokker-Planck equation, the Keller-Segel equation, and its variants. Our choice of regularization is guided by the Wasserstein gradient flow structure, and the corresponding energy has a novel form, combining aspects of the well-known interaction and potential energies. In the presence of a confining drift or interaction potential, we prove that minimizers of the regularized energy exist and, as the regularization is removed, converge to the minimizers of the unregularized energy. We then restrict our attention to nonlinear diffusion of porous medium type with at least quadratic exponent. Under sufficient regularity assumptions, we prove that gradient flows of the regularized energies converge to solutions of the porous medium equation. As a corollary, we obtain convergence of our numerical blob method, again under sufficient regularity assumptions. We conclude by considering a range of numerical examples to demonstrate our method's rate of convergence to exact solutions and to illustrate key qualitative properties preserved by the method, including asymptotic behavior of the Fokker-Planck equation and critical mass of the two-dimensional Keller-Segel equation.
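The particle dynamics described in this abstract can be illustrated with a minimal sketch, under assumptions not taken from the paper: one space dimension, the porous medium equation with quadratic exponent, a Gaussian mollifier of width `eps`, and forward Euler time stepping. With these choices the regularized energy reduces to a pairwise interaction energy, so each particle moves with velocity `-2 * sum_j m_j * phi_eps'(x_i - x_j)`. The function names and parameter values are hypothetical.

```python
import numpy as np

def blob_velocity(x, m, eps):
    """Regularized velocity dx_i/dt = -2 * sum_j m_j * phi_eps'(x_i - x_j) in 1D.

    Assumption: phi_eps is a Gaussian mollifier with standard deviation eps.
    """
    d = x[:, None] - x[None, :]                       # pairwise differences x_i - x_j
    phi = np.exp(-d**2 / (2 * eps**2)) / np.sqrt(2 * np.pi * eps**2)
    dphi = -d / eps**2 * phi                          # derivative of the Gaussian
    return -2.0 * dphi @ m

def blob_step(x, m, eps, dt):
    """One forward Euler step; particle masses m stay fixed."""
    return x + dt * blob_velocity(x, m, eps)

def simulate(n=50, eps=0.1, dt=1e-3, steps=200):
    """Evolve n equal-mass particles started on a uniform grid in [-0.5, 0.5]."""
    x = np.linspace(-0.5, 0.5, n)
    m = np.full(n, 1.0 / n)
    for _ in range(steps):
        x = blob_step(x, m, eps, dt)
    return x, m
```

Because the mollifier is even, the pairwise forces cancel in the mass-weighted sum, so the center of mass stays fixed while the particles spread. Particles remain particles throughout: only their positions change.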
NA | Sep 21, 2017
A hybrid mass transport finite element method for Keller-Segel type systems
José Antonio Carrillo, Niklas Kolbe, Mária Lukáčová-Medviďová
We propose a new splitting scheme for general reaction-taxis-diffusion systems in one spatial dimension, capable of handling simultaneous concentrated and diffusive regions as well as travelling waves and merging phenomena. The splitting scheme is based on a mass transport strategy for the cell density coupled with classical finite element approximations for the rest of the system. The built-in mass adaption of the scheme yields excellent performance, even compared with dedicated mesh-adapted AMR schemes in the original variables.
LG | Jan 30, 2025
A Unified Perspective on the Dynamics of Deep Transformers
Valérie Castin, Pierre Ablin, José Antonio Carrillo et al.
Transformers, which are state-of-the-art in most machine learning tasks, represent the data as sequences of vectors called tokens. This representation is then exploited by the attention function, which learns dependencies between tokens and is key to the success of Transformers. However, the iterative application of attention across layers induces complex dynamics that remain to be fully understood. To analyze these dynamics, we identify each input sequence with a probability measure and model its evolution as a Vlasov equation called Transformer PDE, whose velocity field is non-linear in the probability measure. Our first set of contributions focuses on compactly supported initial data. We show the Transformer PDE is well-posed and is the mean-field limit of an interacting particle system, thus generalizing and extending previous analysis to several variants of self-attention: multi-head attention, L2 attention, Sinkhorn attention, Sigmoid attention, and masked attention, by leveraging a conditional Wasserstein framework. In a second set of contributions, we are the first to study non-compactly supported initial conditions, by focusing on Gaussian initial data. Again for different types of attention, we show that the Transformer PDE preserves the space of Gaussian measures, which allows us to analyze the Gaussian case theoretically and numerically to identify typical behaviors. This Gaussian analysis captures the evolution of data anisotropy through a deep Transformer. In particular, we highlight a clustering phenomenon that parallels previous results in the non-normalized discrete case.
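The interacting particle system mentioned in this abstract can be sketched as follows, assuming plain single-head softmax self-attention, no normalization layer, and one forward Euler step per residual layer. The matrices `Q`, `K`, `V`, the step size `dt`, and the function names are hypothetical placeholders; the paper's exact Transformer PDE and its attention variants are not reproduced here.

```python
import numpy as np

def attention_velocity(X, Q, K, V):
    """Velocity of each token under softmax self-attention.

    X has shape (n, d), one token per row. Token i moves with
    sum_j softmax_j(<Q x_i, K x_j>) * V x_j.
    """
    scores = (X @ Q.T) @ (X @ K.T).T                  # scores[i, j] = <Q x_i, K x_j>
    scores = scores - scores.max(axis=1, keepdims=True)  # stabilize the exponentials
    W = np.exp(scores)
    W = W / W.sum(axis=1, keepdims=True)              # each row sums to 1
    return W @ (X @ V.T)

def evolve(X, Q, K, V, dt=0.1, steps=10):
    """Forward Euler integration; each step plays the role of one residual layer."""
    for _ in range(steps):
        X = X + dt * attention_velocity(X, Q, K, V)
    return X
```

The velocity of each token depends on the other tokens only through softmax-weighted averages, so the system is permutation equivariant. This is what makes it natural to describe the tokens by their empirical measure.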
NA | Oct 7, 2018
An entropy stable high-order discontinuous Galerkin method for cross-diffusion gradient flow systems
Zheng Sun, José Antonio Carrillo, Chi-Wang Shu
As an extension of our previous work in Sun et al. (2018) [41], we develop a discontinuous Galerkin method for solving cross-diffusion systems with a formal gradient flow structure. These systems are associated with non-increasing entropy functionals. For a class of problems, the positivity (non-negativity) of solutions is also expected, which is implied by the physical model and is crucial to the entropy structure. The semi-discrete numerical scheme we propose is entropy stable. Furthermore, the scheme is compatible with the positivity-preserving procedure in Zhang (2017) [42] in many scenarios, so the resulting fully discrete scheme is able to produce non-negative solutions. The method can be applied to both one-dimensional problems and two-dimensional problems on Cartesian meshes. Numerical examples are given to examine the performance of the method.