Zuowei Shen

LG
h-index5
24papers
1,359citations
Novelty60%
AI Score55

24 Papers

LGMay 19, 2022
Neural Network Architecture Beyond Width and Depth

Zuowei Shen, Haizhao Yang, Shijun Zhang

This paper proposes a new neural network architecture by introducing an additional dimension called height beyond width and depth. Neural network architectures with height, width, and depth as hyper-parameters are called three-dimensional architectures. It is shown that neural networks with three-dimensional architectures are significantly more expressive than the ones with two-dimensional architectures (those with only width and depth as hyper-parameters), e.g., standard fully connected networks. The new network architecture is constructed recursively via a nested structure, and hence we call a network with the new architecture nested network (NestNet). A NestNet of height $s$ is built with each hidden neuron activated by a NestNet of height $\le s-1$. When $s=1$, a NestNet degenerates to a standard network with a two-dimensional architecture. It is proved by construction that height-$s$ ReLU NestNets with $\mathcal{O}(n)$ parameters can approximate $1$-Lipschitz continuous functions on $[0,1]^d$ with an error $\mathcal{O}(n^{-(s+1)/d})$, while the optimal approximation error of standard ReLU networks with $\mathcal{O}(n)$ parameters is $\mathcal{O}(n^{-2/d})$. Furthermore, such a result is extended to generic continuous functions on $[0,1]^d$ with the approximation error characterized by the modulus of continuity. Finally, we use numerical experimentation to show the advantages of the super-approximation power of ReLU NestNets.

LGMar 10, 2022
IAE-Net: Integral Autoencoders for Discretization-Invariant Learning

Yong Zheng Ong, Zuowei Shen, Haizhao Yang

Discretization invariant learning aims at learning in the infinite-dimensional function spaces with the capacity to process heterogeneous discrete representations of functions as inputs and/or outputs of a learning model. This paper proposes a novel deep learning framework based on integral autoencoders (IAE-Net) for discretization invariant learning. The basic building block of IAE-Net consists of an encoder and a decoder as integral transforms with data-driven kernels, and a fully connected neural network between the encoder and decoder. This basic building block is applied in parallel in a wide multi-channel structure, which are repeatedly composed to form a deep and densely connected neural network with skip connections as IAE-Net. IAE-Net is trained with randomized data augmentation that generates training data with heterogeneous structures to facilitate the performance of discretization invariant learning. The proposed IAE-Net is tested with various applications in predictive data science, solving forward and inverse problems in scientific computing, and signal/image processing. Compared with alternatives in the literature, IAE-Net achieves state-of-the-art performance in existing applications and creates a wide range of new applications.

LGSep 12, 2023
Interpolation, Approximation and Controllability of Deep Neural Networks

Jingpu Cheng, Qianxiao Li, Ting Lin et al.

We investigate the expressive power of deep residual neural networks idealized as continuous dynamical systems through control theory. Specifically, we consider two properties that arise from supervised learning, namely universal interpolation - the ability to match arbitrary input and target training samples - and the closely related notion of universal approximation - the ability to approximate input-target functional relationships via flow maps. Under the assumption of affine invariance of the control family, we give a characterisation of universal interpolation, showing that it holds for essentially any architecture with non-linearity. Furthermore, we elucidate the relationship between universal interpolation and universal approximation in the context of general control systems, showing that the two properties cannot be deduced from each other. At the same time, we identify conditions on the control family and the target function that ensures the equivalence of the two notions.

LGAug 18, 2022
Deep Neural Network Approximation of Invariant Functions through Dynamical Systems

Qianxiao Li, Ting Lin, Zuowei Shen

We study the approximation of functions which are invariant with respect to certain permutations of the input indices using flow maps of dynamical systems. Such invariant functions includes the much studied translation-invariant ones involving image tasks, but also encompasses many permutation-invariant functions that finds emerging applications in science and engineering. We prove sufficient conditions for universal approximation of these functions by a controlled equivariant dynamical system, which can be viewed as a general abstraction of deep residual networks with symmetry constraints. These results not only imply the universal approximation for a variety of commonly employed neural network architectures for symmetric function approximation, but also guide the design of architectures with approximation guarantees for applications involving new symmetry requirements.

LGNov 25, 2022
On the Universal Approximation Property of Deep Fully Convolutional Neural Networks

Ting Lin, Zuowei Shen, Qianxiao Li

We study the approximation of shift-invariant or equivariant functions by deep fully convolutional networks from the dynamical systems perspective. We prove that deep residual fully convolutional networks and their continuous-layer counterpart can achieve universal approximation of these symmetric functions at constant channel width. Moreover, we show that the same can be achieved by non-residual variants with at least 2 channels in each layer and convolutional kernel size of at least 2. In addition, we show that these requirements are necessary, in the sense that networks with fewer channels or smaller kernels fail to be universal approximators.

LGJan 23
Multigrade Neural Network Approximation

Shijun Zhang, Zuowei Shen, Yuesheng Xu

We study multigrade deep learning (MGDL) as a principled framework for structured error refinement in deep neural networks. While the approximation power of neural networks is now relatively well understood, training very deep architectures remains challenging due to highly non-convex and often ill-conditioned optimization landscapes. In contrast, for relatively shallow networks, most notably one-hidden-layer $\texttt{ReLU}$ models, training admits convex reformulations with global guarantees, motivating learning paradigms that improve stability while scaling to depth. MGDL builds upon this insight by training deep networks grade by grade: previously learned grades are frozen, and each new residual block is trained solely to reduce the remaining approximation error, yielding an interpretable and stable hierarchical refinement process. We develop an operator-theoretic foundation for MGDL and prove that, for any continuous target function, there exists a fixed-width multigrade $\texttt{ReLU}$ scheme whose residuals decrease strictly across grades and converge uniformly to zero. To the best of our knowledge, this work provides the first rigorous theoretical guarantee that grade-wise training yields provable vanishing approximation error in deep networks. Numerical experiments further illustrate the theoretical results.

52.7LGMar 16
Deep learning and the rate of approximation by flows

Jingpu Cheng, Qianxiao Li, Ting Lin et al.

We investigate the dependence of the approximation capacity of deep residual networks on its depth in a continuous dynamical systems setting. This can be formulated as the general problem of quantifying the minimal time-horizon required to approximate a diffeomorphism by flows driven by a given family $\mathcal F$ of vector fields. We show that this minimal time can be identified as a geodesic distance on a sub-Finsler manifold of diffeomorphisms, where the local geometry is characterised by a variational principle involving $\mathcal F$. This connects the learning efficiency of target relationships to their compatibility with the learning architectural choice. Further, the results suggest that the key approximation mechanism in deep learning, namely the approximation of functions by composition or dynamics, differs in a fundamental way from linear approximation theory, where linear spaces and norm-based rate estimates are replaced by manifolds and geodesic distances.

49.0LGApr 22
Geometric Layer-wise Approximation Rates for Deep Networks

Shijun Zhang, Zuowei Shen, Yuesheng Xu

Depth is widely viewed as a central contributor to the success of deep neural networks, whereas standard neural network approximation theory typically provides guarantees only for the final output and leaves the role of intermediate layers largely unclear. We address this gap by developing a quantitative framework in which depth admits a precise scale-dependent interpretation. Specifically, we design a single shared mixed-activation architecture of fixed width $2dN+d+2$ and any prescribed finite depth such that each intermediate readout $Φ_\ell$ is itself an approximant to the target function $f$. For $f\in L^p([0,1]^d)$ with $p\in [1,\infty)$, the approximation error of $Φ_\ell$ is controlled by $(2d+1)$ times the $L^p$ modulus of continuity at the geometric scale $N^{-\ell}$ for all $\ell$. The estimate reduces to the geometric rate $(2d+1)N^{-\ell}$ if $f$ is $1$-Lipschitz. Our network design is inspired by multigrade deep learning, where depth serves as a progressive refinement mechanism: each new correction targets residual information at a finer scale while the earlier correction terms remain part of the later readouts, yielding a nested architecture that supports adaptive refinement without redesigning the preceding network.

LGJun 30, 2025
A unified framework for establishing the universal approximation of transformer-type architectures

Jingpu Cheng, Ting Lin, Zuowei Shen et al.

We investigate the universal approximation property (UAP) of transformer-type architectures, providing a unified theoretical framework that extends prior results on residual networks to models incorporating attention mechanisms. Our work identifies token distinguishability as a fundamental requirement for UAP and introduces a general sufficient condition that applies to a broad class of architectures. Leveraging an analyticity assumption on the attention layer, we can significantly simplify the verification of this condition, providing a non-constructive approach in establishing UAP for such architectures. We demonstrate the applicability of our framework by proving UAP for transformers with various attention mechanisms, including kernel-based and sparse attention mechanisms. The corollaries of our results either generalize prior works or establish UAP for architectures not previously covered. Furthermore, our framework offers a principled foundation for designing novel transformer architectures with inherent UAP guarantees, including those with specific functional symmetries. We propose examples to illustrate these insights.

LGNov 15, 2021
Deep Network Approximation in Terms of Intrinsic Parameters

Zuowei Shen, Haizhao Yang, Shijun Zhang

One of the arguments to explain the success of deep learning is the powerful approximation capacity of deep neural networks. Such capacity is generally accompanied by the explosive growth of the number of parameters, which, in turn, leads to high computational costs. It is of great interest to ask whether we can achieve successful deep learning with a small number of learnable parameters adapting to the target function. From an approximation perspective, this paper shows that the number of parameters that need to be learned can be significantly smaller than people typically expect. First, we theoretically design ReLU networks with a few learnable parameters to achieve an attractive approximation. We prove by construction that, for any Lipschitz continuous function $f$ on $[0,1]^d$ with a Lipschitz constant $λ>0$, a ReLU network with $n+2$ intrinsic parameters (those depending on $f$) can approximate $f$ with an exponentially small error $5λ\sqrt{d}\,2^{-n}$. Such a result is generalized to generic continuous functions. Furthermore, we show that the idea of learning a small number of parameters to achieve a good approximation can be numerically observed. We conduct several experiments to verify that training a small part of parameters can also achieve good results for classification problems if other parameters are pre-specified or pre-trained from a related problem.

LGJul 6, 2021
Deep Network Approximation: Achieving Arbitrary Accuracy with Fixed Number of Neurons

Zuowei Shen, Haizhao Yang, Shijun Zhang

This paper develops simple feed-forward neural networks that achieve the universal approximation property for all continuous functions with a fixed finite number of neurons. These neural networks are simple because they are designed with a simple, computable, and continuous activation function $σ$ leveraging a triangular-wave function and the softsign function. We first prove that $σ$-activated networks with width $36d(2d+1)$ and depth $11$ can approximate any continuous function on a $d$-dimensional hypercube within an arbitrarily small error. Hence, for supervised learning and its related regression problems, the hypothesis space generated by these networks with a size not smaller than $36d(2d+1)\times 11$ is dense in the continuous function space $C([a,b]^d)$ and therefore dense in the Lebesgue spaces $L^p([a,b]^d)$ for $p\in [1,\infty)$. Furthermore, we show that classification functions arising from image and signal classification are in the hypothesis space generated by $σ$-activated networks with width $36d(2d+1)$ and depth $12$ when there exist pairwise disjoint bounded closed subsets of $\mathbb{R}^d$ such that the samples of the same class are located in the same subset. Finally, we use numerical experimentation to show that replacing the rectified linear unit (ReLU) activation function by ours would improve the experiment results.

LGFeb 28, 2021
Optimal Approximation Rate of ReLU Networks in terms of Width and Depth

Zuowei Shen, Haizhao Yang, Shijun Zhang

This paper concentrates on the approximation power of deep feed-forward neural networks in terms of width and depth. It is proved by construction that ReLU networks with width $\mathcal{O}\big(\max\{d\lfloor N^{1/d}\rfloor,\, N+2\}\big)$ and depth $\mathcal{O}(L)$ can approximate a Hölder continuous function on $[0,1]^d$ with an approximation rate $\mathcal{O}\big(λ\sqrt{d} (N^2L^2\ln N)^{-α/d}\big)$, where $α\in (0,1]$ and $λ>0$ are Hölder order and constant, respectively. Such a rate is optimal up to a constant in terms of width and depth separately, while existing results are only nearly optimal without the logarithmic factor in the approximation rate. More generally, for an arbitrary continuous function $f$ on $[0,1]^d$, the approximation rate becomes $\mathcal{O}\big(\,\sqrt{d}\,ω_f\big( (N^2L^2\ln N)^{-1/d}\big)\,\big)$, where $ω_f(\cdot)$ is the modulus of continuity. We also extend our analysis to any continuous function $f$ on a bounded set. Particularly, if ReLU networks with depth $31$ and width $\mathcal{O}(N)$ are used to approximate one-dimensional Lipschitz continuous functions on $[0,1]$ with a Lipschitz constant $λ>0$, the approximation rate in terms of the total number of parameters, $W=\mathcal{O}(N^2)$, becomes $\mathcal{O}(\tfracλ{W\ln W})$, which has not been discovered in the literature for fixed-depth ReLU networks.

LGOct 25, 2020
Neural Network Approximation: Three Hidden Layers Are Enough

Zuowei Shen, Haizhao Yang, Shijun Zhang

A three-hidden-layer neural network with super approximation power is introduced. This network is built with the floor function ($\lfloor x\rfloor$), the exponential function ($2^x$), the step function ($1_{x\geq 0}$), or their compositions as the activation function in each neuron and hence we call such networks as Floor-Exponential-Step (FLES) networks. For any width hyper-parameter $N\in\mathbb{N}^+$, it is shown that FLES networks with width $\max\{d,N\}$ and three hidden layers can uniformly approximate a Hölder continuous function $f$ on $[0,1]^d$ with an exponential approximation rate $3λ(2\sqrt{d})^α 2^{-αN}$, where $α\in(0,1]$ and $λ>0$ are the Hölder order and constant, respectively. More generally for an arbitrary continuous function $f$ on $[0,1]^d$ with a modulus of continuity $ω_f(\cdot)$, the constructive approximation rate is $2ω_f(2\sqrt{d}){2^{-N}}+ω_f(2\sqrt{d}\,2^{-N})$. Moreover, we extend such a result to general bounded continuous functions on a bounded set $E\subseteq\mathbb{R}^d$. As a consequence, this new class of networks overcomes the curse of dimensionality in approximation power when the variation of $ω_f(r)$ as $r\rightarrow 0$ is moderate (e.g., $ω_f(r)\lesssim r^α$ for Hölder continuous functions), since the major term to be concerned in our approximation rate is essentially $\sqrt{d}$ times a function of $N$ independent of $d$ within the modulus of continuity. Finally, we extend our analysis to derive similar approximation results in the $L^p$-norm for $p\in[1,\infty)$ via replacing Floor-Exponential-Step activation functions by continuous activation functions.

LGJun 22, 2020
Deep Network with Approximation Error Being Reciprocal of Width to Power of Square Root of Depth

Zuowei Shen, Haizhao Yang, Shijun Zhang

A new network with super approximation power is introduced. This network is built with Floor ($\lfloor x\rfloor$) or ReLU ($\max\{0,x\}$) activation function in each neuron and hence we call such networks Floor-ReLU networks. For any hyper-parameters $N\in\mathbb{N}^+$ and $L\in\mathbb{N}^+$, it is shown that Floor-ReLU networks with width $\max\{d,\, 5N+13\}$ and depth $64dL+3$ can uniformly approximate a Hölder function $f$ on $[0,1]^d$ with an approximation error $3λd^{α/2}N^{-α\sqrt{L}}$, where $α\in(0,1]$ and $λ$ are the Hölder order and constant, respectively. More generally for an arbitrary continuous function $f$ on $[0,1]^d$ with a modulus of continuity $ω_f(\cdot)$, the constructive approximation rate is $ω_f(\sqrt{d}\,N^{-\sqrt{L}})+2ω_f(\sqrt{d}){N^{-\sqrt{L}}}$. As a consequence, this new class of networks overcomes the curse of dimensionality in approximation power when the variation of $ω_f(r)$ as $r\to 0$ is moderate (e.g., $ω_f(r) \lesssim r^α$ for Hölder continuous functions), since the major term to be considered in our approximation rate is essentially $\sqrt{d}$ times a function of $N$ and $L$ independent of $d$ within the modulus of continuity.

LGApr 18, 2020
Optimization in Machine Learning: A Distribution Space Approach

Yongqiang Cai, Qianxiao Li, Zuowei Shen

We present the viewpoint that optimization problems encountered in machine learning can often be interpreted as minimizing a convex functional over a function space, but with a non-convex constraint set introduced by model parameterization. This observation allows us to repose such problems via a suitable relaxation as convex optimization problems in the space of distributions over the training parameters. We derive some simple relationships between the distribution-space problem and the original problem, e.g. a distribution-space solution is at least as good as a solution in the original space. Moreover, we develop a numerical algorithm based on mixture distributions to perform approximate optimization directly in distribution space. Consistency of this approximation is established and the numerical efficacy of the proposed algorithm is illustrated on simple examples. In both theory and practice, this formulation provides an alternative approach to large-scale optimization in machine learning.

LGJan 9, 2020
Deep Network Approximation for Smooth Functions

Jianfeng Lu, Zuowei Shen, Haizhao Yang et al.

This paper establishes the (nearly) optimal approximation error characterization of deep rectified linear unit (ReLU) networks for smooth functions in terms of both width and depth simultaneously. To that end, we first prove that multivariate polynomials can be approximated by deep ReLU networks of width $\mathcal{O}(N)$ and depth $\mathcal{O}(L)$ with an approximation error $\mathcal{O}(N^{-L})$. Through local Taylor expansions and their deep ReLU network approximations, we show that deep ReLU networks of width $\mathcal{O}(N\ln N)$ and depth $\mathcal{O}(L\ln L)$ can approximate $f\in C^s([0,1]^d)$ with a nearly optimal approximation error $\mathcal{O}(\|f\|_{C^s([0,1]^d)}N^{-2s/d}L^{-2s/d})$. Our estimate is non-asymptotic in the sense that it is valid for arbitrary width and depth specified by $N\in\mathbb{N}^+$ and $L\in\mathbb{N}^+$, respectively.

LGDec 22, 2019
Deep Learning via Dynamical Systems: An Approximation Perspective

Qianxiao Li, Ting Lin, Zuowei Shen

We build on the dynamical systems approach to deep learning, where deep residual networks are idealized as continuous-time dynamical systems, from the approximation perspective. In particular, we establish general sufficient conditions for universal approximation using continuous-time deep residual networks, which can also be understood as approximation theories in $L^p$ using flow maps of dynamical systems. In specific cases, rates of approximation in terms of the time horizon are also established. Overall, these results reveal that composition function approximation through flow maps present a new paradigm in approximation theory and contributes to building a useful mathematical framework to investigate deep learning.

NAJun 13, 2019
Deep Network Approximation Characterized by Number of Neurons

Zuowei Shen, Haizhao Yang, Shijun Zhang

This paper quantitatively characterizes the approximation power of deep feed-forward neural networks (FNNs) in terms of the number of neurons. It is shown by construction that ReLU FNNs with width $\mathcal{O}\big(\max\{d\lfloor N^{1/d}\rfloor,\, N+1\}\big)$ and depth $\mathcal{O}(L)$ can approximate an arbitrary Hölder continuous function of order $α\in (0,1]$ on $[0,1]^d$ with a nearly tight approximation rate $\mathcal{O}\big(\sqrt{d} N^{-2α/d}L^{-2α/d}\big)$ measured in $L^p$-norm for any $N,L\in \mathbb{N}^+$ and $p\in[1,\infty]$. More generally for an arbitrary continuous function $f$ on $[0,1]^d$ with a modulus of continuity $ω_f(\cdot)$, the constructive approximation rate is $\mathcal{O}\big(\sqrt{d}\,ω_f( N^{-2/d}L^{-2/d})\big)$. We also extend our analysis to $f$ on irregular domains or those localized in an $\varepsilon$-neighborhood of a $d_{\mathcal{M}}$-dimensional smooth manifold $\mathcal{M}\subseteq [0,1]^d$ with $d_{\mathcal{M}}\ll d$. Especially, in the case of an essentially low-dimensional domain, we show an approximation rate $\mathcal{O}\big(ω_f(\tfrac{\varepsilon}{1-δ}\sqrt{\tfrac{d}{d_δ}}+\varepsilon)+\sqrt{d}\,ω_f(\tfrac{\sqrt{d}}{(1-δ)\sqrt{d_δ}}N^{-2/d_δ}L^{-2/d_δ})\big)$ for ReLU FNNs to approximate $f$ in the $\varepsilon$-neighborhood, where $d_δ=\mathcal{O}\big(d_{\mathcal{M}}\tfrac{\ln (d/δ)}{δ^2}\big)$ for any $δ\in(0,1)$ as a relative error for a projection to approximate an isometry when projecting $\mathcal{M}$ to a $d_δ$-dimensional domain.

LGFeb 26, 2019
Nonlinear Approximation via Compositions

Zuowei Shen, Haizhao Yang, Shijun Zhang

Given a function dictionary $\cal D$ and an approximation budget $N\in\mathbb{N}^+$, nonlinear approximation seeks the linear combination of the best $N$ terms $\{T_n\}_{1\le n\le N}\subseteq{\cal D}$ to approximate a given function $f$ with the minimum approximation error\[\varepsilon_{L,f}:=\min_{\{g_n\}\subseteq{\mathbb{R}},\{T_n\}\subseteq{\cal D}}\|f(x)-\sum_{n=1}^N g_n T_n(x)\|.\]Motivated by recent success of deep learning, we propose dictionaries with functions in a form of compositions, i.e.,\[T(x)=T^{(L)}\circ T^{(L-1)}\circ\cdots\circ T^{(1)}(x)\]for all $T\in\cal D$, and implement $T$ using ReLU feed-forward neural networks (FNNs) with $L$ hidden layers. We further quantify the improvement of the best $N$-term approximation rate in terms of $N$ when $L$ is increased from $1$ to $2$ or $3$ to show the power of compositions. In the case when $L>3$, our analysis shows that increasing $L$ cannot improve the approximation rate in terms of $N$. In particular, for any function $f$ on $[0,1]$, regardless of its smoothness and even the continuity, if $f$ can be approximated using a dictionary when $L=1$ with the best $N$-term approximation rate $\varepsilon_{L,f}={\cal O}(N^{-η})$, we show that dictionaries with $L=2$ can improve the best $N$-term approximation rate to $\varepsilon_{L,f}={\cal O}(N^{-2η})$. We also show that for Hölder continuous functions of order $α$ on $[0,1]^d$, the application of a dictionary with $L=3$ in nonlinear approximation can achieve an essentially tight best $N$-term approximation rate $\varepsilon_{L,f}={\cal O}(N^{-2α/d})$. Finally, we show that dictionaries consisting of wide FNNs with a few hidden layers are more attractive in terms of computational efficiency than dictionaries with narrow and very deep FNNs for approximating Hölder continuous functions if the number of computer cores is larger than $N$ in parallel computing.

LGSep 29, 2018
A Quantitative Analysis of the Effect of Batch Normalization on Gradient Descent

Yongqiang Cai, Qianxiao Li, Zuowei Shen

Despite its empirical success and recent theoretical progress, there generally lacks a quantitative analysis of the effect of batch normalization (BN) on the convergence and stability of gradient descent. In this paper, we provide such an analysis on the simple problem of ordinary least squares (OLS). Since precise dynamical properties of gradient descent (GD) is completely known for the OLS problem, it allows us to isolate and compare the additional effects of BN. More precisely, we show that unlike GD, gradient descent with BN (BNGD) converges for arbitrary learning rates for the weights, and the convergence remains linear under mild conditions. Moreover, we quantify two different sources of acceleration of BNGD over GD -- one due to over-parameterization which improves the effective condition number and another due having a large range of learning rates giving rise to fast descent. These phenomena set BNGD apart from GD and could account for much of its robustness properties. These findings are confirmed quantitatively by numerical experiments, which further show that many of the uncovered properties of BNGD in OLS are also observed qualitatively in more complex supervised learning problems.

FAFeb 17, 2016
Image Restoration: A General Wavelet Frame Based Model and Its Asymptotic Analysis

Bin Dong, Zuowei Shen, Peichu Xie

Image restoration is one of the most important areas in imaging science. Mathematical tools have been widely used in image restoration, where wavelet frame based approach is one of the successful examples. In this paper, we introduce a generic wavelet frame based image restoration model, called the "general model", which includes most of the existing wavelet frame based models as special cases. Moreover, the general model also includes examples that are new to the literature. Motivated by our earlier studies [1-3], We provide an asymptotic analysis of the general model as image resolution goes to infinity, which establishes a connection between the general model in discrete setting and a new variatonal model in continuum setting. The variational model also includes some of the existing variational models as special cases, such as the total generalized variational model proposed by [4]. In the end, we introduce an algorithm solving the general model and present one numerical simulation as an example.

CVFeb 8, 2013
Surveillance Video Processing Using Compressive Sensing

Hong Jiang, Wei Deng, Zuowei Shen

A compressive sensing method combined with decomposition of a matrix formed with image frames of a surveillance video into low rank and sparse matrices is proposed to segment the background and extract moving objects in a surveillance video. The video is acquired by compressive measurements, and the measurements are used to reconstruct the video by a low rank and sparse decomposition of matrix. The low rank component represents the background, and the sparse component is used to identify moving objects in the surveillance video. The decomposition is performed by an augmented Lagrangian alternating direction method. Experiments are carried out to demonstrate that moving objects can be reliably extracted with a small amount of measurements.

ITFeb 6, 2013
Adaptive low rank and sparse decomposition of video using compressive sensing

Fei Yang, Hong Jiang, Zuowei Shen et al.

We address the problem of reconstructing and analyzing surveillance videos using compressive sensing. We develop a new method that performs video reconstruction by low rank and sparse decomposition adaptively. Background subtraction becomes part of the reconstruction. In our method, a background model is used in which the background is learned adaptively as the compressive measurements are processed. The adaptive method has low latency, and is more robust than previous methods. We will present experimental results to demonstrate the advantages of the proposed method.

NADec 22, 2009
A Unitary Extension Principle for Shearlet Systems

Bin Han, Gitta Kutyniok, Zuowei Shen

In this paper, we first introduce the concept of an adaptive MRA (AMRA) structure which is a variant of the classical MRA structure suited to the main goal of a fast flexible decomposition strategy adapted to the data at each decomposition level. We then study this novel methodology for the general case of affine-like systems, and derive a Unitary Extension Principle (UEP) for filter design. Finally, we apply our results to the directional representation system of shearlets. This leads to a comprehensive theory for fast decomposition algorithms associated with shearlet systems which encompasses tight shearlet frames with spatially compactly supported generators within such an AMRA structure. Also shearlet-like systems associated with parabolic scaling and unimodular matrices optimally close to rotation as well as 3D shearlet systems are studied within this framework.