77.3 NA Mar 14
Regge metrics with enhanced trace
Snorre H. Christiansen, Ting Lin
We introduce variants of Regge finite element metrics with enhanced properties of the trace. In particular, the trace operator is surjective onto a finite element space of continuous functions. Multiplying these scalar functions by the identity tensor brings one back to the finite element space of metrics. The metrics can be based on high order polynomials and be constructed on refinements, such as the Clough-Tocher or Worsey-Farin splits. Potential applications to general relativity, incompressible elasticity and conformal geometry are sketched.
LG Sep 12, 2023
Interpolation, Approximation and Controllability of Deep Neural Networks
Jingpu Cheng, Qianxiao Li, Ting Lin et al.
We investigate the expressive power of deep residual neural networks idealized as continuous dynamical systems through control theory. Specifically, we consider two properties that arise from supervised learning, namely universal interpolation - the ability to match arbitrary input and target training samples - and the closely related notion of universal approximation - the ability to approximate input-target functional relationships via flow maps. Under the assumption of affine invariance of the control family, we give a characterisation of universal interpolation, showing that it holds for essentially any architecture with non-linearity. Furthermore, we elucidate the relationship between universal interpolation and universal approximation in the context of general control systems, showing that the two properties cannot be deduced from each other. At the same time, we identify conditions on the control family and the target function that ensure the equivalence of the two notions.
LG Aug 18, 2022
Deep Neural Network Approximation of Invariant Functions through Dynamical Systems
Qianxiao Li, Ting Lin, Zuowei Shen
We study the approximation of functions which are invariant with respect to certain permutations of the input indices using flow maps of dynamical systems. Such invariant functions include the much studied translation-invariant ones involving image tasks, but also encompass many permutation-invariant functions that find emerging applications in science and engineering. We prove sufficient conditions for universal approximation of these functions by a controlled equivariant dynamical system, which can be viewed as a general abstraction of deep residual networks with symmetry constraints. These results not only imply the universal approximation for a variety of commonly employed neural network architectures for symmetric function approximation, but also guide the design of architectures with approximation guarantees for applications involving new symmetry requirements.
LG Nov 25, 2022
On the Universal Approximation Property of Deep Fully Convolutional Neural Networks
Ting Lin, Zuowei Shen, Qianxiao Li
We study the approximation of shift-invariant or equivariant functions by deep fully convolutional networks from the dynamical systems perspective. We prove that deep residual fully convolutional networks and their continuous-layer counterpart can achieve universal approximation of these symmetric functions at constant channel width. Moreover, we show that the same can be achieved by non-residual variants with at least 2 channels in each layer and convolutional kernel size of at least 2. In addition, we show that these requirements are necessary, in the sense that networks with fewer channels or smaller kernels fail to be universal approximators.
70.2 NA Apr 13
Finite elements for symmetric and traceless tensors in three dimensions
Kaibo Hu, Ting Lin, Bowen Shi
We construct a family of finite element sub-complexes of the conformal complex on tetrahedral meshes and show their exactness on contractible domains. This complex includes vector fields and symmetric and traceless tensor fields, connected through the conformal Killing operator, the linearized Cotton-York operator, and the divergence operator, respectively. This leads to discrete versions of transverse traceless (TT) tensors, i.e., symmetric, traceless and divergence-free matrix fields, in continuum mechanics and general relativity. We also show the inf-sup stability of the $H(\operatorname{div})$-conforming finite element symmetric and traceless tensors paired with discontinuous vectors.
52.9 LG Mar 16
Deep learning and the rate of approximation by flows
Jingpu Cheng, Qianxiao Li, Ting Lin et al.
We investigate the dependence of the approximation capacity of deep residual networks on their depth in a continuous dynamical systems setting. This can be formulated as the general problem of quantifying the minimal time-horizon required to approximate a diffeomorphism by flows driven by a given family $\mathcal F$ of vector fields. We show that this minimal time can be identified as a geodesic distance on a sub-Finsler manifold of diffeomorphisms, where the local geometry is characterised by a variational principle involving $\mathcal F$. This connects the learning efficiency of target relationships to their compatibility with the learning architectural choice. Further, the results suggest that the key approximation mechanism in deep learning, namely the approximation of functions by composition or dynamics, differs in a fundamental way from linear approximation theory, where linear spaces and norm-based rate estimates are replaced by manifolds and geodesic distances.
98.2 NA Apr 3
A Construction of $C^{r}$ Conforming Finite Elements on the Alfeld Split in Any Dimension
Ting Lin, Hendrik Speleers, Qingyu Wu
Constructing $C^r$ conforming finite element spaces in any dimension is a long-standing problem. For general triangulations, this problem was recently addressed by Hu-Lin-Wu (2024), under certain conditions on supersmoothness and polynomial degree. In this paper, a first unified construction on the Alfeld split in any dimension is given, where the supersmoothness conditions and the polynomial degree requirement are relaxed.
LG Jun 30, 2025
A unified framework for establishing the universal approximation of transformer-type architectures
Jingpu Cheng, Ting Lin, Zuowei Shen et al.
We investigate the universal approximation property (UAP) of transformer-type architectures, providing a unified theoretical framework that extends prior results on residual networks to models incorporating attention mechanisms. Our work identifies token distinguishability as a fundamental requirement for UAP and introduces a general sufficient condition that applies to a broad class of architectures. Leveraging an analyticity assumption on the attention layer, we can significantly simplify the verification of this condition, providing a non-constructive approach to establishing UAP for such architectures. We demonstrate the applicability of our framework by proving UAP for transformers with various attention mechanisms, including kernel-based and sparse attention mechanisms. The corollaries of our results either generalize prior works or establish UAP for architectures not previously covered. Furthermore, our framework offers a principled foundation for designing novel transformer architectures with inherent UAP guarantees, including those with specific functional symmetries. We propose examples to illustrate these insights.
CL May 8, 2024
Lightweight Spatial Modeling for Combinatorial Information Extraction From Documents
Yanfei Dong, Lambert Deng, Jiazheng Zhang et al.
Documents that consist of diverse templates and exhibit complex spatial structures pose a challenge for document entity classification. We propose KNN-former, which incorporates a new kind of spatial bias in attention calculation based on the K-nearest-neighbor (KNN) graph of document entities. We limit entities' attention only to their local radius defined by the KNN graph. We also use combinatorial matching to address the one-to-one mapping property that exists in many documents, where one field has only one corresponding entity. Moreover, our method is highly parameter-efficient compared to existing approaches in terms of the number of trainable parameters. Despite this, experiments across various datasets show our method outperforms baselines in most entity types. Many real-world documents exhibit combinatorial properties which can be leveraged as inductive biases to improve extraction accuracy, but existing datasets do not cover these documents. To facilitate future research into these types of documents, we release a new ID document dataset that covers diverse templates and languages. We also release enhanced annotations for an existing dataset.
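As a rough illustration of restricting attention to a KNN neighbourhood, the following Python sketch builds, for each entity, the set of entities it may attend to and applies a masked softmax. It is not the KNN-former implementation: the function names, the use of entity centre points with Euclidean distance, and the hard mask (rather than a learned spatial bias) are all assumptions made for this example.

```python
import math

def knn_neighbours(centers, k):
    """For each entity i, return the indices it may attend to:
    itself plus its k nearest other entities (hypothetical rule,
    using Euclidean distance between 2D centre points)."""
    allowed = []
    for i, (xi, yi) in enumerate(centers):
        others = [j for j in range(len(centers)) if j != i]
        others.sort(key=lambda j: math.hypot(centers[j][0] - xi,
                                             centers[j][1] - yi))
        allowed.append({i, *others[:k]})
    return allowed

def masked_softmax(scores, allowed):
    """Row-wise softmax over raw attention scores, assigning zero
    weight to every entity outside the allowed neighbourhood."""
    weights = []
    for i, row in enumerate(scores):
        # Subtract the largest allowed score for numerical stability.
        top = max(row[j] for j in allowed[i])
        exps = [math.exp(s - top) if j in allowed[i] else 0.0
                for j, s in enumerate(row)]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights

# Four entities on a line: two close pairs far apart from each other.
centers = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
allowed = knn_neighbours(centers, k=1)
weights = masked_softmax([[0.0] * 4 for _ in range(4)], allowed)
```

With `k=1`, each entity in this toy layout attends only to itself and its single nearest neighbour, so attention never crosses between the two distant pairs.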
CV May 10, 2023
Computational Optics for Mobile Terminals in Mass Production
Shiqi Chen, Ting Lin, Huajun Feng et al.
Correcting the optical aberrations and the manufacturing deviations of cameras is a challenging task. Due to the limitation on volume and the demand for mass production, existing mobile terminals cannot rectify optical degradation. In this work, we systematically construct the perturbed lens system model to illustrate the relationship between the deviated system parameters and the spatial frequency response (SFR) measured from photographs. To further address this issue, an optimization framework is proposed based on this model to build proxy cameras from the machining samples' SFRs. Using the proxy cameras, we synthesize data pairs, which encode the optical aberrations and the random manufacturing biases, for training the learning-based algorithms. In correcting aberration, although promising results have been shown recently with convolutional neural networks, they are hard to generalize to stochastic machining biases. Therefore, we propose a dilated Omni-dimensional dynamic convolution and implement it in post-processing to account for the manufacturing degradation. Extensive experiments which evaluate multiple samples of two representative devices demonstrate that the proposed optimization framework accurately constructs the proxy camera. The dynamic processing model is also well adapted to the manufacturing deviations of different cameras, realizing perfect computational photography. The evaluation shows that the proposed method bridges the gap between optical design, system machining, and post-processing pipeline, shedding light on the joint of image signal reception (lens and sensor) and image signal processing.
CL Feb 5, 2022
Aspect-based Sentiment Analysis through EDU-level Attentions
Ting Lin, Aixin Sun, Yequan Wang
A sentence may express sentiments on multiple aspects. When these aspects are associated with different sentiment polarities, a model's accuracy is often adversely affected. We observe that multiple aspects in such hard sentences are mostly expressed through multiple clauses, formally known as elementary discourse units (EDUs), and one EDU tends to express a single aspect with unitary sentiment towards that aspect. In this paper, we propose to consider EDU boundaries in sentence modeling, with attentions at both word and EDU levels. Specifically, we highlight sentiment-bearing words in each EDU through word-level sparse attention. Then at EDU level, we force the model to attend to the right EDU for the right aspect, by using EDU-level sparse attention and orthogonal regularization. Experiments on three benchmark datasets show that our simple EDU-Attention model outperforms state-of-the-art baselines. Because EDUs can be automatically segmented with high accuracy, our model can be applied to sentences directly without the need for manual EDU boundary annotation.
LG Dec 22, 2019
Deep Learning via Dynamical Systems: An Approximation Perspective
Qianxiao Li, Ting Lin, Zuowei Shen
We build on the dynamical systems approach to deep learning, where deep residual networks are idealized as continuous-time dynamical systems, from the approximation perspective. In particular, we establish general sufficient conditions for universal approximation using continuous-time deep residual networks, which can also be understood as approximation theories in $L^p$ using flow maps of dynamical systems. In specific cases, rates of approximation in terms of the time horizon are also established. Overall, these results reveal that composition function approximation through flow maps presents a new paradigm in approximation theory and contributes to building a useful mathematical framework to investigate deep learning.
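To make the link between residual networks and flow maps concrete, here is a minimal Python sketch in which each residual layer is one forward-Euler step of an ordinary differential equation, so a stack of layers approximates the flow map over a time horizon. The control family `w * tanh(a * x + b)`, the function names, and the parameter values are illustrative assumptions, not taken from the paper.

```python
import math

def velocity(x, w, a, b):
    """Hypothetical control family, applied coordinate-wise:
    dx/dt = w * tanh(a * x + b)."""
    return [w * math.tanh(a * xi + b) for xi in x]

def flow_map(x0, controls, horizon):
    """Forward-Euler approximation of the flow map over [0, horizon].

    Each (w, a, b) triple in `controls` plays the role of one residual
    layer, so the number of triples is the network depth."""
    dt = horizon / len(controls)
    x = list(x0)
    for w, a, b in controls:
        v = velocity(x, w, a, b)
        # Residual update: x_{k+1} = x_k + dt * f(x_k, theta_k).
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x

# With a = 0 the velocity is constant, so the flow is a pure translation
# by horizon * w * tanh(b), regardless of the number of layers.
shifted = flow_map([0.0], [(2.0, 0.0, 0.5)] * 4, horizon=1.0)
unchanged = flow_map([1.0, -3.0], [(1.0, 0.0, 0.0)] * 10, horizon=2.0)
```

Increasing the number of control triples for a fixed horizon refines the time step, which is the sense in which depth corresponds to a finer discretization of the same continuous-time system.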