Luke N. Olson

h-index: 25

12 papers

197 citations

Novelty: 49%

AI Score: 33

Ranked #115,842 of 194,257 authors (top 60%); #25,464 in LG (top 63%)

12 Papers

1.2 · DC · Dec 15, 2015

Reducing Parallel Communication in Algebraic Multigrid through Sparsification

Amanda Bienz, Robert D. Falgout, William Gropp, Luke N. Olson et al.

Algebraic multigrid (AMG) is an $\mathcal{O}(n)$ solution process for many large sparse linear systems. A hierarchy of progressively coarser grids is constructed that utilizes complementary relaxation and interpolation operators. High-energy error is reduced by relaxation, while low-energy error is mapped to coarse grids and reduced there. However, large parallel communication costs often limit parallel scalability. As the multigrid hierarchy is formed, each coarse matrix is formed through a triple matrix product. The resulting coarse grids often have significantly more nonzeros per row than the original fine-grid operator, thereby generating high parallel communication costs on coarse levels. In this paper, we introduce a method that systematically removes entries in coarse-grid matrices after the hierarchy is formed, leading to reduced communication costs. We sparsify by removing weakly connected or unimportant entries in the matrix, leading to improved solve time. The main trade-off is that if the heuristic identifying unimportant entries is used too aggressively, then AMG convergence can suffer. To counteract this, the original hierarchy is retained, allowing entries to be reintroduced into the solver hierarchy if convergence is too slow. This enables a balance between communication cost and convergence, as necessary. In this paper we present new algorithms for reducing communication and present a number of computational experiments in support.
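
The idea of removing weak entries from a coarse-grid matrix can be illustrated with a minimal sketch. This is not the paper's algorithm: the drop rule (relative threshold `theta` against the largest off-diagonal in each row), the lumping of dropped values onto the diagonal, and the names `sparsify`, `nnz`, and `A_coarse` are all assumptions chosen for illustration.

```python
def sparsify(A, theta):
    """Return a copy of the dense matrix A (list of lists) with weak
    off-diagonal entries removed.

    Assumed rule: an off-diagonal entry is 'weak' if its magnitude is below
    theta times the largest off-diagonal magnitude in its row. Each dropped
    value is added to the diagonal so the row sum is unchanged.
    """
    S = [row[:] for row in A]
    for i, row in enumerate(S):
        off_diag = [abs(v) for j, v in enumerate(row) if j != i]
        cutoff = theta * max(off_diag, default=0.0)
        for j in range(len(row)):
            if j != i and row[j] != 0.0 and abs(row[j]) < cutoff:
                row[i] += row[j]   # lump the dropped value onto the diagonal
                row[j] = 0.0       # remove the weak connection
    return S


def nnz(A):
    """Count stored nonzeros, a rough proxy for communication volume."""
    return sum(1 for row in A for v in row if v != 0.0)


# Hypothetical coarse-grid matrix with a few weak long-range couplings.
A_coarse = [
    [4.0, -1.0, -0.05, 0.0],
    [-1.0, 4.0, -1.0, -0.05],
    [-0.05, -1.0, 4.0, -1.0],
    [0.0, -0.05, -1.0, 4.0],
]
A_sparse = sparsify(A_coarse, theta=0.1)
print(nnz(A_coarse), "->", nnz(A_sparse))
```

Because the original matrix is left untouched, dropped entries could be restored later if convergence degrades, which mirrors the retained-hierarchy idea in the abstract.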

13.6 · LG · May 19, 2022 · Code

Learning Interface Conditions in Domain Decomposition Solvers

Ali Taghibakhshi, Nicolas Nytko, Tareq Zaman et al.

Domain decomposition methods are widely used and effective in the approximation of solutions to partial differential equations. Yet the optimal construction of these methods requires tedious analysis and is often available only in simplified, structured-grid settings, limiting their use for more complex problems. In this work, we generalize optimized Schwarz domain decomposition methods to unstructured-grid problems, using Graph Convolutional Neural Networks (GCNNs) and unsupervised learning to learn optimal modifications at subdomain interfaces. A key ingredient in our approach is an improved loss function, enabling effective training on relatively small problems, but robust performance on arbitrarily large problems, with computational cost linear in problem size. The performance of the learned linear solvers is compared with both classical and optimized domain decomposition algorithms, for both structured- and unstructured-grid problems.

13.7 · LG · Jan 26, 2023 · Code

MG-GNN: Multigrid Graph Neural Networks for Learning Multilevel Domain Decomposition Methods

Ali Taghibakhshi, Nicolas Nytko, Tareq Uz Zaman et al.

Domain decomposition methods (DDMs) are popular solvers for discretized systems of partial differential equations (PDEs), with one-level and multilevel variants. These solvers rely on several algorithmic and mathematical parameters, prescribing overlap, subdomain boundary conditions, and other properties of the DDM. While some work has been done on optimizing these parameters, it has mostly focused on the one-level setting or special cases such as structured-grid discretizations with regular subdomain construction. In this paper, we propose multigrid graph neural networks (MG-GNN), a novel GNN architecture for learning optimized parameters in two-level DDMs. We train MG-GNN using a new unsupervised loss function, enabling effective training on small problems that yields robust performance on unstructured grids that are orders of magnitude larger than those in the training set. We show that MG-GNN outperforms popular hierarchical graph network architectures for this optimization and that our proposed loss function is critical to achieving this improved performance.

7.8 · LG · Dec 10, 2022 · Code

Optimized Sparse Matrix Operations for Reverse Mode Automatic Differentiation

Nicolas Nytko, Ali Taghibakhshi, Tareq Uz Zaman et al.

Sparse matrix representations are ubiquitous in computational science and machine learning, leading to significant reductions in compute time, in comparison to dense representation, for problems that have local connectivity. The adoption of sparse representation in leading ML frameworks such as PyTorch is incomplete, however, with support for both automatic differentiation and GPU acceleration missing. In this work, we present an implementation of a CSR-based sparse matrix wrapper for PyTorch with CUDA acceleration for basic matrix operations, as well as automatic differentiability. We also present several applications of the resulting sparse kernels to optimization problems, demonstrating ease of implementation and performance measurements versus their dense counterparts.
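
To make the combination of CSR storage and reverse-mode differentiation concrete, here is a small pure-Python sketch of a CSR matrix-vector product and its backward pass. It is not the PyTorch/CUDA wrapper described in the abstract; the function names `csr_matvec` and `csr_matvec_backward` are hypothetical. It relies only on the standard identities for y = A x: the gradient with respect to x is Aᵀ times the incoming gradient, and the gradient with respect to each stored value A[i, j] is grad_y[i] * x[j].

```python
def csr_matvec(data, indices, indptr, x):
    """Forward pass: y = A @ x with A stored in CSR form."""
    y = [0.0] * (len(indptr) - 1)
    for i in range(len(y)):
        for k in range(indptr[i], indptr[i + 1]):
            y[i] += data[k] * x[indices[k]]
    return y


def csr_matvec_backward(data, indices, indptr, x, grad_y):
    """Backward pass: given dL/dy, return (dL/dx, dL/d data).

    dL/d data has the same sparsity pattern as A, so no dense
    intermediate is ever formed.
    """
    grad_x = [0.0] * len(x)
    grad_data = [0.0] * len(data)
    for i in range(len(indptr) - 1):
        for k in range(indptr[i], indptr[i + 1]):
            j = indices[k]
            grad_x[j] += data[k] * grad_y[i]   # accumulates A^T @ grad_y
            grad_data[k] = grad_y[i] * x[j]    # gradient of stored entry (i, j)
    return grad_x, grad_data


# A = [[1, 0, 2],
#      [0, 3, 0]] in CSR form
data, indices, indptr = [1.0, 2.0, 3.0], [0, 2, 1], [0, 2, 3]
x = [1.0, 2.0, 3.0]

y = csr_matvec(data, indices, indptr, x)
# Loss L = sum(y), so dL/dy is a vector of ones.
grad_x, grad_data = csr_matvec_backward(data, indices, indptr, x, [1.0, 1.0])
print(y, grad_x, grad_data)
```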

1.2 · NA · Aug 5, 2022

Parallel Energy-Minimization Prolongation for Algebraic Multigrid

Carlo Janna, Andrea Franceschini, Jacob B. Schroder et al.

Algebraic multigrid (AMG) is one of the most widely used solution techniques for linear systems of equations arising from discretized partial differential equations. The popularity of AMG stems from its potential to solve linear systems in almost linear time, that is, with O(n) complexity, where n is the problem size. This capability is crucial at present, where the increasing availability of massive HPC platforms pushes for the solution of very large problems. The key for a rapidly converging AMG method is a good interplay between the smoother and the coarse-grid correction, which in turn requires the use of an effective prolongation. From a theoretical viewpoint, the prolongation must accurately represent near kernel components and, at the same time, be bounded in the energy norm. For challenging problems, however, ensuring both these requirements is not easy and is exactly the goal of this work. We propose a constrained minimization procedure aimed at reducing prolongation energy while preserving the near kernel components in the span of interpolation. The proposed algorithm is based on previous energy minimization approaches utilizing a preconditioned restricted conjugate gradients method, but has new features and a specific focus on parallel performance and implementation. It is shown that the resulting solver, when used for large real-world problems from various application fields, exhibits excellent convergence rates and scalability and outperforms at least some more traditional AMG approaches.

6.4 · LG · Aug 6, 2024 · Code

A TVD neural network closure and application to turbulent combustion

Seung Won Suh, Jonathan F MacArt, Luke N Olson et al.

Trained neural networks (NN) have attractive features for closing governing equations. There are many methods that are showing promise, but all can fail in cases when small errors consequentially violate physical reality, such as a solution boundedness condition. A NN formulation is introduced to preclude spurious oscillations that violate solution boundedness or positivity. It is embedded in the discretized equations as a machine learning closure and strictly constrained, inspired by total variation diminishing (TVD) methods for hyperbolic conservation laws. The constraint is exactly enforced during gradient-descent training by rescaling the NN parameters, which maps them onto an explicit feasible set. Demonstrations show that the constrained NN closure model usefully recovers linear and nonlinear hyperbolic phenomena and anti-diffusion while enforcing the non-oscillatory property. Finally, the model is applied to subgrid-scale (SGS) modeling of a turbulent reacting flow, for which it suppresses spurious oscillations in scalar fields that otherwise violate the solution boundedness. It outperforms a simple penalization of oscillations in the loss function.

1.2 · MS · Mar 6, 2018

Scaling Structured Multigrid to 500K+ Cores through Coarse-Grid Redistribution

Andrew Reisner, Luke N. Olson, J. David Moulton

The efficient solution of sparse, linear systems resulting from the discretization of partial differential equations is crucial to the performance of many physics-based simulations. The algorithmic optimality of multilevel approaches for common discretizations makes them a good candidate for an efficient parallel solver. Yet, modern architectures for high-performance computing systems continue to challenge the parallel scalability of multilevel solvers. While algebraic multigrid methods are robust for solving a variety of problems, the increasing importance of data locality and cost of data movement in modern architectures motivates the need to carefully exploit structure in the problem. Robust logically structured variational multigrid methods, such as Black Box Multigrid (BoxMG), maintain structure throughout the multigrid hierarchy. This avoids indirection and increased coarse-grid communication costs typical in parallel algebraic multigrid. Nevertheless, the parallel scalability of structured multigrid is challenged by coarse-grid problems where the overhead in communication dominates computation. In this paper, an algorithm is introduced for redistributing coarse-grid problems through incremental agglomeration. Guided by a predictive performance model, this algorithm provides robust redistribution decisions for structured multilevel solvers. A two-dimensional diffusion problem is used to demonstrate the significant gain in performance of this algorithm over the previous approach that used agglomeration to one processor. In addition, the parallel scalability of this approach is demonstrated on two large-scale computing systems, with solves on up to 500K+ cores.

7.7 · LG · May 27, 2023 · Code

Learning from Integral Losses in Physics Informed Neural Networks

Ehsan Saleh, Saba Ghaffari, Timothy Bretl et al.

This work proposes a solution for the problem of training physics-informed networks under partial integro-differential equations. These equations require an infinite or a large number of neural evaluations to construct a single residual for training. As a result, accurate evaluation may be impractical, and we show that naive approximations that replace these integrals with unbiased estimates lead to biased loss functions and solutions. To overcome this bias, we investigate three types of potential solutions: the deterministic sampling approaches, the double-sampling trick, and the delayed target method. We consider three classes of PDEs for benchmarking: one defining Poisson problems with singular charges and weak solutions of up to 10 dimensions, another involving weak solutions on electro-magnetic fields and a Maxwell equation, and a third one defining a Smoluchowski coagulation problem. Our numerical results confirm the existence of the aforementioned bias in practice and also show that our proposed delayed target approach can lead to accurate solutions with comparable quality to ones estimated with a large sample size integral. Our implementation is open-source and available at https://github.com/ehsansaleh/btspinn.
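
The bias mentioned above can be seen from a standard identity. In the sketch below the notation is assumed for illustration rather than taken from the paper: $u$ is a deterministic term of the residual, $I$ is the exact integral, and $\hat{I}$ is an unbiased sampled estimate of it.

```latex
\mathbb{E}\big[(u - \hat{I})^2\big]
  = (u - I)^2 + \operatorname{Var}(\hat{I}),
\qquad \text{where } \mathbb{E}[\hat{I}] = I .
```

The cross term vanishes because $\hat{I} - I$ has zero mean, but the variance term remains. Since the integrand typically depends on the network, minimizing the sampled squared residual also penalizes that variance, so its minimizer generally differs from that of the exact loss $(u - I)^2$.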

11.9 · LG · Jun 3, 2021 · Code

Optimization-Based Algebraic Multigrid Coarsening Using Reinforcement Learning

Ali Taghibakhshi, Scott MacLachlan, Luke Olson et al.

Large sparse linear systems of equations are ubiquitous in science and engineering, such as those arising from discretizations of partial differential equations. Algebraic multigrid (AMG) methods are one of the most common methods of solving such linear systems, with an extensive body of underlying mathematical theory. A system of linear equations defines a graph on the set of unknowns and each level of a multigrid solver requires the selection of an appropriate coarse graph along with restriction and interpolation operators that map to and from the coarse representation. The efficiency of the multigrid solver depends critically on this selection and many selection methods have been developed over the years. Recently, it has been demonstrated that it is possible to directly learn the AMG interpolation and restriction operators, given a coarse graph selection. In this paper, we consider the complementary problem of learning to coarsen graphs for a multigrid solver, a necessary step in developing fully learnable AMG methods. We propose a method using a reinforcement learning (RL) agent based on graph neural networks (GNNs), which can learn to perform graph coarsening on small planar training graphs and then be applied to unstructured large planar graphs, assuming bounded node degree. We demonstrate that this method can produce better coarse graphs than existing algorithms, even as the graph size increases and other properties of the graph are varied. We also propose an efficient inference procedure for performing graph coarsening that results in linear time complexity in graph size.

1.2 · DC · Apr 24, 2019

Reducing Communication in Algebraic Multigrid with Multi-step Node Aware Communication

Amanda Bienz, Luke Olson, William Gropp

Algebraic multigrid (AMG) is often viewed as a scalable $\mathcal{O}(n)$ solver for sparse linear systems. Yet, parallel AMG lacks scalability due to increasingly large costs associated with communication, both in the initial construction of a multigrid hierarchy as well as the iterative solve phase. This work introduces a parallel implementation of AMG to reduce the cost of communication, yielding an increase in scalability. Standard inter-process communication consists of sending data regardless of the send and receive process locations. Performance tests show notable differences in the cost of intra- and inter-node communication, motivating a restructuring of communication. In this case, the communication schedule takes advantage of the less costly intra-node communication, reducing both the number and size of inter-node messages. Node-centric communication extends to the range of components in both the setup and solve phase of AMG, yielding an increase in the weak and strong scalability of the entire method.
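
A rough way to see why restructuring communication around nodes helps is to count inter-node messages. The sketch below is only a counting model under stated assumptions, not the multi-step scheme from the paper: ranks are assumed to be placed on nodes in contiguous blocks of `ppn` processes, and the aggregated count assumes all data travelling between one pair of nodes can be combined into a single message. The names `node_of` and `count_inter_node` are hypothetical.

```python
def node_of(rank, ppn):
    """Node index of a rank, assuming contiguous placement of ppn ranks per node."""
    return rank // ppn


def count_inter_node(messages, ppn):
    """Return (direct, aggregated) inter-node message counts.

    direct:     every (src, dst) pair on different nodes is its own message.
    aggregated: one message per ordered pair of distinct nodes.
    """
    crossing = [(node_of(s, ppn), node_of(d, ppn))
                for s, d in messages
                if node_of(s, ppn) != node_of(d, ppn)]
    return len(crossing), len(set(crossing))


# Hypothetical communication pattern: (source rank, destination rank).
messages = [(0, 4), (0, 5), (1, 4), (1, 6), (2, 5), (3, 7), (0, 1), (4, 8)]
direct, aggregated = count_inter_node(messages, ppn=4)
print(direct, aggregated)
```

In this model the aggregated messages are larger, and redistributing their contents to the final destination ranks requires extra intra-node communication, which the abstract notes is the less costly kind.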

2.3 · PF · Oct 28, 2018

Learning with Analytical Models

Huda Ibeid, Siping Meng, Oliver Dobon et al.

To understand and predict the performance of scientific applications, several analytical and machine learning approaches have been proposed, each having its advantages and disadvantages. In this paper, we propose and validate a hybrid approach for performance modeling and prediction, which combines analytical and machine learning models. The proposed hybrid model aims to minimize prediction cost while providing reasonable prediction accuracy. Our validation results show that the hybrid model is able to learn and correct the analytical models to better match the actual performance. Furthermore, the proposed hybrid model improves the prediction accuracy in comparison to pure machine learning techniques while using small training datasets, thus making it suitable for hardware and workload changes.

1.2 · NA · Mar 30, 2015

A Finite Element Based P3M Method for N-body Problems

Natalie N. Beams, Luke N. Olson, Jonathan B. Freund

We introduce a fast mesh-based method for computing N-body interactions that is both scalable and accurate. The method is founded on a particle-particle–particle-mesh (P3M) approach, which decomposes a potential into rapidly decaying short-range interactions and smooth, mesh-resolvable long-range interactions. However, in contrast to the traditional approach of using Gaussian screen functions to accomplish this decomposition, our method employs specially designed polynomial bases to construct the screened potentials. Because of this form of the screen, the long-range component of the potential is then solved exactly with a finite element method, leading ultimately to a sparse matrix problem that is solved efficiently with standard multigrid methods. Moreover, since this system represents an exact discretization, the optimal resolution properties of the FFT are unnecessary, though the short-range calculation is now more involved than P3M/PME methods. We introduce the method, analyze its key properties, and demonstrate the accuracy of the algorithm.
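
For context, the traditional Gaussian-screened decomposition that the abstract contrasts against can be written in a few lines. This sketch shows only that classical splitting of a 1/r potential, not the polynomial screens proposed in the paper; the function name `split_coulomb` and the splitting parameter `alpha` are assumptions for illustration.

```python
import math


def split_coulomb(r, alpha):
    """Split 1/r into a short-range and a long-range part with a Gaussian screen.

    short = erfc(alpha * r) / r   decays rapidly, summed over nearby particles
    long  = erf(alpha * r) / r    smooth, suitable for a mesh-based solve
    Since erf + erfc = 1, the two parts sum to 1/r exactly.
    """
    short = math.erfc(alpha * r) / r
    long_range = math.erf(alpha * r) / r
    return short, long_range


for r in (0.5, 1.0, 2.0, 5.0):
    short, long_range = split_coulomb(r, alpha=1.0)
    print(f"r={r}: short={short:.3e}, long={long_range:.3e}")
```

The long-range part stays finite as r approaches zero (its limit is 2·alpha/√π), which is what makes it resolvable on a mesh.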