Junxiong Jia

NA
3 papers
4 citations
Novelty 45%
AI Score 36

3 Papers

NA May 22, 2016
Variable Total Variation Regularization for Backward Time-Space Fractional Diffusion Problem

Junxiong Jia, Jigen Peng, Jinghuai Gao et al.

In this paper, we consider a backward problem for a time-space fractional diffusion process. For this problem, we propose to reconstruct the initial data by minimizing a functional that combines the data residual error in the Fourier domain with a variable total variation (TV) regularizing term, which preserves edges as the standard TV term does while reducing the staircasing effect. The well-posedness of this optimization problem is studied under a very general setting. Specifically, we write the time-space fractional diffusion equation as an abstract fractional differential equation and obtain our results by using fractional semigroup theory, so the results can be applied to other backward problems for more general fractional differential equations. A modified Bregman iterative algorithm is then proposed to approximate the minimizer. The new features of this algorithm are that the regularizing term changes at each step and that we need not solve the complicated Euler-Lagrange equation of the variable TV regularizing term; only a simpler Euler-Lagrange equation has to be solved. The convergence of the algorithm and a strategy for choosing the parameters are also obtained. Numerical experiments are given to support our analysis and to show the flexibility of our minimization model.
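As a rough illustration of the Bregman idea mentioned in this abstract, the sketch below runs the classical "add back the residual" Bregman iteration on a toy backward diffusion problem in Fourier space. It is not the paper's modified algorithm. It rests on several assumptions: the variable TV term is swapped for the quadratic regularizer J(u) = 0.5*||u||^2 so that each subproblem has a closed-form solution, the forward operator is taken to be the diagonal multiplier exp(-|k|^beta * T) (classical time derivative, fractional Laplacian), and the names `bregman_backward_diffusion`, `lam`, `beta`, `T` and all numeric values are hypothetical.

```python
import math

def bregman_backward_diffusion(data, multipliers, lam=1.0, steps=20):
    """Bregman iteration for a diagonal forward operator A = diag(multipliers).

    Assumption: J(u) = 0.5*||u||^2 replaces the paper's variable TV term,
    so every subproblem is solved in closed form, coefficient by coefficient.
    """
    g = [0.0] * len(data)   # Bregman-updated data f^k
    u = [0.0] * len(data)   # current estimate of the initial-data coefficients
    residual_norms = []
    for _ in range(steps):
        # Add the previous residual back onto the data: f^{k+1} = f + (f^k - A u^k).
        g = [f + gk - a * uk for f, gk, a, uk in zip(data, g, multipliers, u)]
        # Minimizer of 0.5*u^2 + (lam/2)*(a*u - g)^2 for each coefficient.
        u = [lam * a * gk / (1.0 + lam * a * a) for a, gk in zip(multipliers, g)]
        residual_norms.append(
            math.sqrt(sum((a * uk - f) ** 2 for a, uk, f in zip(multipliers, u, data)))
        )
    return u, residual_norms

# Hypothetical toy problem: six Fourier modes, noise-free final-time data.
beta, T = 1.5, 0.1
multipliers = [math.exp(-(k ** beta) * T) for k in range(6)]
true_coeffs = [1.0, 0.5, -0.3, 0.2, 0.1, -0.05]
data = [a * c for a, c in zip(multipliers, true_coeffs)]

estimate, residual_norms = bregman_backward_diffusion(data, multipliers, lam=5.0, steps=30)
print(estimate)
print(residual_norms[0], residual_norms[-1])
```

With noise-free data the residual shrinks at every step and the estimate approaches the true coefficients. With noisy data one would normally stop early, since the iteration count then acts as a regularization parameter.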

NA Feb 20, 2019
Recursive linearization method for inverse medium scattering problems with complex mixture Gaussian error learning

Junxiong Jia, Bangyu Wu, Jigen Peng et al.

This paper is concerned with the modeling errors that arise in numerical methods for inverse medium scattering problems (IMSP). Optimization-based iterative methods are widely employed to solve IMSP, but they are computationally intensive because a series of Helmholtz equations must be solved numerically. Rough approximations of the Helmholtz equations can therefore significantly speed up the iterative procedure; however, they also lead to instability and inaccurate estimates. Using Bayesian inverse methods, we incorporate the modeling errors introduced by the rough approximations. The modeling errors are assumed to be complex Gaussian mixture (CGM) random variables, and well-posedness of IMSP in the statistical sense is established by extending the general theory to cover CGM noise. We then generalize the real-valued expectation-maximization (EM) algorithm used in the machine learning community to the complex-valued case in order to learn the parameters of the CGM distribution. Based on these preparations, we generalize the recursive linearization method (RLM) to a new iterative method, the Gaussian mixture recursive linearization method (GMRLM), which takes modeling errors into account. Finally, we provide two numerical examples to illustrate the effectiveness of the proposed method.
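To make the complex-valued EM step more concrete, here is a minimal sketch of EM for a mixture of scalar complex Gaussians. It assumes every component is circularly symmetric, with density exp(-|z - mu|^2 / v) / (pi * v); the CGM model in the paper may be more general. The function names, initial guesses and synthetic data are hypothetical and serve only as illustration.

```python
import math
import random

def complex_gaussian_pdf(z, mean, var):
    """Density of a circularly symmetric complex Gaussian with variance var."""
    return math.exp(-abs(z - mean) ** 2 / var) / (math.pi * var)

def em_complex_mixture(samples, means, variances, weights, iterations=50):
    """EM updates for a mixture of scalar circular complex Gaussians."""
    n, k = len(samples), len(means)
    means, variances, weights = list(means), list(variances), list(weights)
    for _ in range(iterations):
        # E-step: responsibility of each component for each sample.
        resp = []
        for z in samples:
            p = [weights[j] * complex_gaussian_pdf(z, means[j], variances[j])
                 for j in range(k)]
            total = sum(p) or 1e-300  # guard against underflow to zero
            resp.append([pj / total for pj in p])
        # M-step: weighted mean, weighted variance and mixing weight.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            means[j] = sum(r[j] * z for r, z in zip(resp, samples)) / nj
            var_j = sum(r[j] * abs(z - means[j]) ** 2
                        for r, z in zip(resp, samples)) / nj
            variances[j] = max(var_j, 1e-12)  # keep the variance positive
            weights[j] = nj / n
    return means, variances, weights

def draw(mean, var, count, rng):
    """Sample a circular complex Gaussian: real and imaginary parts get var/2 each."""
    s = math.sqrt(var / 2.0)
    return [mean + complex(rng.gauss(0.0, s), rng.gauss(0.0, s)) for _ in range(count)]

# Hypothetical synthetic "modeling error" samples from two components.
rng = random.Random(0)
samples = draw(2 + 1j, 0.2, 400, rng) + draw(-1 - 2j, 0.5, 200, rng)

means, variances, weights = em_complex_mixture(
    samples, means=[1 + 0j, -1 + 0j], variances=[1.0, 1.0], weights=[0.5, 0.5]
)
print(means, variances, weights)
```

The only changes from the real-valued case are the use of the modulus |z - mu| in place of the absolute difference and the normalizing constant pi * v of the circular complex density.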

87.3 LG Apr 27
A Limit Theory of Foundation Models: A Mathematical Approach to Understanding Emergent Intelligence and Scaling Laws

Jun Shu, Junxiong Jia, Deyu Meng et al.

Emergent intelligence has played a major role in modern AI development. While existing studies primarily rely on empirical observations to characterize this phenomenon, a rigorous theoretical framework remains underexplored. This study attempts to develop a mathematical approach that formalizes emergent intelligence from the perspective of limit theory. Specifically, we introduce a performance function $\mathcal{E}(N,P,K)$, dependent on data size $N$, model size $P$ and training steps $K$, to quantify intelligent behavior. We posit that intelligence emerges as a transition from finite to effectively infinite knowledge, and thus recast emergent intelligence as the existence of the limit $\lim_{N,P,K \to \infty} \mathcal{E}(N,P,K)$, with emergent abilities corresponding to the limiting behavior. This limit theory helps reveal that emergent intelligence originates from the existence of a parameter-limit architecture (referred to as the limit architecture), and that emergent intelligence rationally corresponds to the learning behavior of this limit system. By introducing tools from nonlinear Lipschitz operator theory, we prove necessary and sufficient conditions for the existence of the limit architecture. Furthermore, we derive the scaling law of foundation models by leveraging tools from Lipschitz operator theory and covering numbers. Theoretical results show that: 1) emergent intelligence is governed by three key factors: training steps, data size and the model architecture, where the properties of basic blocks play a crucial role in constructing foundation models; 2) the critical condition $\mathrm{Lip}(T)=1$ for emergent intelligence provides theoretical support for existing findings; 3) emergent intelligence is determined by an infinite-dimensional system, yet can be effectively realized in practice through a finite-dimensional architecture. Our empirical results corroborate these theoretical findings.