NANov 11, 2025
A Neural-Operator Preconditioned Newton Method for Accelerated Nonlinear SolversYoungkyu Lee, Shanqing Liu, Jerome Darbon et al.
We propose a novel neural preconditioned Newton (NP-Newton) method for solving parametric nonlinear systems of equations. To overcome the stagnation or instability of Newton iterations caused by unbalanced nonlinearities, we introduce a fixed-point neural operator (FPNO) that learns the direct mapping from the current iterate to the solution by emulating fixed-point iterations. Unlike traditional line-search or trust-region algorithms, the proposed FPNO adaptively employs negative step sizes to effectively mitigate the effects of unbalanced nonlinearities. Through numerical experiments we demonstrate the computational efficiency and robustness of the proposed NP-Newton method across multiple real-world applications, especially for very strong nonlinearities.
LGMay 20, 2024
Fast meta-solvers for 3D complex-shape scatterers using neural operators trained on a non-scattering problemYoungkyu Lee, Shanqing Liu, Zongren Zou et al.
Three-dimensional target identification using scattering techniques requires high accuracy solutions and very fast computations for real-time predictions in some critical applications. We first train a deep neural operator~(DeepONet) to solve wave propagation problems described by the Helmholtz equation in a domain \textit{without scatterers} but at different wavenumbers and with a complex absorbing boundary condition. We then design two classes of fast meta-solvers by combining DeepONet with either relaxation methods, such as Jacobi and Gauss-Seidel, or with Krylov methods, such as GMRES and BiCGStab, using the trunk basis of DeepONet as a coarse-scale preconditioner. We leverage the spectral bias of neural networks to account for the lower part of the spectrum in the error distribution while the upper part is handled inexpensively using relaxation methods or fine-scale preconditioners. The meta-solvers are then applied to solve scattering problems with different shape of scatterers, at no extra training cost. We first demonstrate that the resulting meta-solvers are shape-agnostic, fast, and robust, whereas the standard standalone solvers may even fail to converge without the DeepONet. We then apply both classes of meta-solvers to scattering from a submarine, a complex three-dimensional problem. We achieve very fast solutions, especially with the DeepONet-Krylov methods, which require orders of magnitude fewer iterations than any of the standalone solvers.
5.3OCApr 6
PINNs in PDE Constrained Optimal Control Problems: Direct vs Indirect MethodsZhen Zhang, Shanqing Liu, Alessandro Alla et al.
We study physics-informed neural networks (PINNs) as numerical tools for the optimal control of semilinear partial differential equations. We first recall the classical direct and indirect viewpoints for optimal control of PDEs, and then present two PINN formulations: a direct formulation based on minimizing the objective under the state constraint, and an indirect formulation based on the first-order optimality system. For a class of semilinear parabolic equations, we derive the state equation, the adjoint equation, and the stationarity condition in a form consistent with continuous-time Pontryagin-type optimality conditions. We then specialize the framework to an Allen-Cahn control problem and compare three numerical approaches: (i) a discretize-then-optimize adjoint method, (ii) a direct PINN, and (iii) an indirect PINN. Numerical results show that the PINN parameterization has an implicit regularizing effect, in the sense that it tends to produce smoother control profiles. They also indicate that the indirect PINN more faithfully preserves the PDE contraint and optimality structure and yields a more accurate neural approximation than the direct PINN.
LGJan 21, 2025
Automatic selection of the best neural architecture for time series forecasting via multi-objective optimization and Pareto optimality conditionsQianying Cao, Shanqing Liu, Alan John Varghese et al.
Time series forecasting plays a pivotal role in a wide range of applications, including weather prediction, healthcare, structural health monitoring, predictive maintenance, energy systems, and financial markets. While models such as LSTM, GRU, Transformers, and State-Space Models (SSMs) have become standard tools in this domain, selecting the optimal architecture remains a challenge. Performance comparisons often depend on evaluation metrics and the datasets under analysis, making the choice of a universally optimal model controversial. In this work, we introduce a flexible automated framework for time series forecasting that systematically designs and evaluates diverse network architectures by integrating LSTM, GRU, multi-head Attention, and SSM blocks. Using a multi-objective optimization approach, our framework determines the number, sequence, and combination of blocks to align with specific requirements and evaluation objectives. From the resulting Pareto-optimal architectures, the best model for a given context is selected via a user-defined preference function. We validate our framework across four distinct real-world applications. Results show that a single-layer GRU or LSTM is usually optimal when minimizing training time alone. However, when maximizing accuracy or balancing multiple objectives, the best architectures are often composite designs incorporating multiple block types in specific configurations. By employing a weighted preference function, users can resolve trade-offs between objectives, revealing novel, context-specific optimal architectures. Our findings underscore that no single neural architecture is universally optimal for time series forecasting. Instead, the best-performing model emerges as a data-driven composite architecture tailored to user-defined criteria and evaluation objectives.