Yang Qi

h-index: 7

9 papers

118 citations

Novelty: 45%

AI Score: 37

Ranked #116,912 of 205,806 authors (top 57%); #25,529 in LG (top 60%)

9 Papers

NA · Feb 15, 2016

Uniqueness of Nonnegative Tensor Approximations

Yang Qi, Pierre Comon, Lek-Heng Lim

We show that for a nonnegative tensor, a best nonnegative rank-r approximation is almost always unique, its best rank-one approximation may always be chosen to be a best nonnegative rank-one approximation, and that the set of nonnegative tensors with non-unique best rank-one approximations forms an algebraic hypersurface. We show that the last part holds true more generally for real tensors and thereby determine a polynomial equation so that a real or nonnegative tensor which does not satisfy this equation is guaranteed to have a unique best rank-one approximation. We also establish an analogue for real or nonnegative symmetric tensors. In addition, we prove a singular vector variant of the Perron--Frobenius Theorem for positive tensors and apply it to show that a best nonnegative rank-r approximation of a positive tensor can never be obtained by deflation. As an aside, we verify that the Euclidean distance (ED) discriminants of the Segre variety and the Veronese variety are hypersurfaces and give defining equations of these ED discriminants.
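As a rough illustration of a nonnegative rank-one approximation, the sketch below runs an alternating power iteration on a small 3-way nonnegative tensor. This is a generic heuristic and not the method of the paper: it is only guaranteed to reach a stationary point, not a best approximation. The function name `rank_one_power_method`, the all-ones starting point, and the example tensor are assumptions made for illustration, and the tensor is assumed to be nonzero.

```python
import math

def normalize(x):
    """Return x scaled to unit Euclidean norm, together with the original norm."""
    norm = math.sqrt(sum(v * v for v in x))
    return [v / norm for v in x], norm

def rank_one_power_method(T, sweeps=50):
    """Alternating power iteration for lam * (u ⊗ v ⊗ w) approximating a 3-way tensor T.

    T is a nested list T[i][j][k]. Starting from positive vectors, every update
    of a nonnegative T is a nonnegative combination, so the factors stay nonnegative.
    """
    n1, n2, n3 = len(T), len(T[0]), len(T[0][0])
    u, _ = normalize([1.0] * n1)
    v, _ = normalize([1.0] * n2)
    w, _ = normalize([1.0] * n3)
    lam = 0.0
    for _ in range(sweeps):
        # Contract T against two factors to update the third.
        u, _ = normalize([sum(T[i][j][k] * v[j] * w[k]
                              for j in range(n2) for k in range(n3)) for i in range(n1)])
        v, _ = normalize([sum(T[i][j][k] * u[i] * w[k]
                              for i in range(n1) for k in range(n3)) for j in range(n2)])
        w, lam = normalize([sum(T[i][j][k] * u[i] * v[j]
                                for i in range(n1) for j in range(n2)) for k in range(n3)])
    return lam, u, v, w

# Hypothetical example: an exactly rank-one nonnegative tensor a ⊗ b ⊗ c.
a, b, c = [1.0, 2.0], [3.0, 1.0], [2.0, 2.0]
T = [[[ai * bj * ck for ck in c] for bj in b] for ai in a]
lam, u, v, w = rank_one_power_method(T)
# Largest entrywise gap between T and its rank-one reconstruction.
residual = max(abs(T[i][j][k] - lam * u[i] * v[j] * w[k])
               for i in range(2) for j in range(2) for k in range(2))
```

For an exactly rank-one input such as this one, the iteration recovers the tensor up to rounding. For a general tensor the result depends on the starting point.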

AG · Apr 22, 2018

Topology of tensor ranks

Pierre Comon, Lek-Heng Lim, Yang Qi et al.

We study path-connectedness and homotopy groups of sets of tensors defined by tensor rank, border rank, multilinear rank, as well as their symmetric counterparts for symmetric tensors. We show that over $\mathbb{C}$, the set of rank-$r$ tensors and the set of symmetric rank-$r$ symmetric tensors are both path-connected if $r$ is not more than the complex generic rank; these results also extend to border rank and symmetric border rank over $\mathbb{C}$. Over $\mathbb{R}$, the set of rank-$r$ tensors is path-connected if it has the expected dimension but the corresponding result for symmetric rank-$r$ symmetric $d$-tensors depends on the order $d$: connected when $d$ is odd but not when $d$ is even. Border rank and symmetric border rank over $\mathbb{R}$ have essentially the same path-connectedness properties as rank and symmetric rank over $\mathbb{R}$. When $r$ is greater than the complex generic rank, we are unable to discern any general pattern: For example, we show that border-rank-three tensors in $\mathbb{R}^2 \otimes \mathbb{R}^2 \otimes \mathbb{R}^2$ fall into four connected components. For multilinear rank, the manifold of $d$-tensors of multilinear rank $(r_1,\dots,r_d)$ in $\mathbb{C}^{n_1} \otimes \cdots \otimes \mathbb{C}^{n_d}$ is always path-connected, and the same is true in $\mathbb{R}^{n_1} \otimes \cdots \otimes \mathbb{R}^{n_d}$ unless $n_i = r_i = \prod_{j \ne i} r_j$ for some $i\in\{1, \dots, d\}$. Beyond path-connectedness, we determine, over both $\mathbb{R}$ and $\mathbb{C}$, the fundamental and higher homotopy groups of the set of tensors of a fixed small rank, and, taking advantage of Bott periodicity, those of the manifold of tensors of a fixed multilinear rank. We also obtain analogues of these results for symmetric tensors of a fixed symmetric rank or a fixed symmetric multilinear rank.

AI · Aug 20, 2025 · Code

aiXiv: A Next-Generation Open Access Ecosystem for Scientific Discovery Generated by AI Scientists

Pengsong Zhang, Xiang Hu, Guowei Huang et al.

Recent advances in large language models (LLMs) have enabled AI agents to autonomously generate scientific proposals, conduct experiments, author papers, and perform peer reviews. Yet this flood of AI-generated research content collides with a fragmented and largely closed publication ecosystem. Traditional journals and conferences rely on human peer review, making them difficult to scale and often reluctant to accept AI-generated research content; existing preprint servers (e.g. arXiv) lack rigorous quality-control mechanisms. Consequently, a significant amount of high-quality AI-generated research lacks appropriate venues for dissemination, hindering its potential to advance scientific progress. To address these challenges, we introduce aiXiv, a next-generation open-access platform for human and AI scientists. Its multi-agent architecture allows research proposals and papers to be submitted, reviewed, and iteratively refined by both human and AI scientists. It also provides API and MCP interfaces that enable seamless integration of heterogeneous human and AI scientists, creating a scalable and extensible ecosystem for autonomous scientific discovery. Through extensive experiments, we demonstrate that aiXiv is a reliable and robust platform that significantly enhances the quality of AI-generated research proposals and papers after iterative revising and reviewing on aiXiv. Our work lays the groundwork for a next-generation open-access ecosystem for AI scientists, accelerating the publication and dissemination of high-quality AI-generated research content. Code is available at https://github.com/aixiv-org. Website is available at https://forms.gle/DxQgCtXFsJ4paMtn8.

LG · May 22, 2025 · Code

Stochastic Forward-Forward Learning through Representational Dimensionality Compression

Zhichao Zhu, Yang Qi, Hengyuan Ma et al.

The Forward-Forward (FF) learning algorithm provides a bottom-up alternative to backpropagation (BP) for training neural networks, relying on a layer-wise "goodness" function with well-designed negative samples for contrastive learning. Existing goodness functions are typically defined as the sum of squared postsynaptic activations, neglecting correlated variability between neurons. In this work, we propose a novel goodness function termed dimensionality compression that uses the effective dimensionality (ED) of fluctuating neural responses to incorporate second-order statistical structure. Our objective minimizes ED for noisy copies of individual inputs while maximizing it across the sample distribution, promoting structured representations without the need to prepare negative samples. We demonstrate that this formulation achieves competitive performance compared to other non-BP methods. Moreover, we show that noise plays a constructive role that can enhance generalization and improve inference when predictions are derived from the mean of squared output, which is equivalent to making predictions based on an energy term. Our findings contribute to the development of more biologically plausible learning algorithms and suggest a natural fit for neuromorphic computing, where stochasticity is a computational resource rather than a nuisance. The code is available at https://github.com/ZhichaoZhu/StochasticForwardForward
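The abstract does not define effective dimensionality. One common definition is the participation ratio of the response covariance C, namely (tr C)^2 / tr(C^2), and the sketch below assumes that definition. It may differ from the one used in the paper. The function name `effective_dimensionality` and the two toy data sets are hypothetical.

```python
def effective_dimensionality(samples):
    """Participation ratio (tr C)^2 / tr(C^2) of the sample covariance C.

    samples is a list of equal-length response vectors. Assumes the responses
    are not all identical, so that C is nonzero.
    """
    n = len(samples)
    d = len(samples[0])
    means = [sum(row[j] for row in samples) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in samples]
    # Covariance with 1/n normalisation; the ratio below is invariant to this scale.
    cov = [[sum(r[i] * r[j] for r in centered) / n for j in range(d)]
           for i in range(d)]
    trace = sum(cov[i][i] for i in range(d))
    # For a symmetric matrix, tr(C^2) equals the sum of squared entries.
    trace_sq = sum(cov[i][j] ** 2 for i in range(d) for j in range(d))
    return trace ** 2 / trace_sq

# Variance spread evenly over two axes versus confined to a single line.
isotropic = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
collinear = [[1.0, 2.0], [-1.0, -2.0], [2.0, 4.0], [-2.0, -4.0]]
ed_isotropic = effective_dimensionality(isotropic)
ed_collinear = effective_dimensionality(collinear)
```

Under this definition the value ranges from 1, when all variance lies along one direction, to the number of neurons, when variance is spread evenly. That matches the intuition of compressing responses to noisy copies of one input while keeping the responses across inputs spread out.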

IV · Jan 1, 2025

Multi-Center Study on Deep Learning-Assisted Detection and Classification of Fetal Central Nervous System Anomalies Using Ultrasound Imaging

Yang Qi, Jiaxin Cai, Jing Lu et al.

Prenatal ultrasound evaluates fetal growth and detects congenital abnormalities during pregnancy, but the examination of ultrasound images by radiologists requires expertise and sophisticated equipment, which would otherwise fail to improve the rate of identifying specific types of fetal central nervous system (CNS) abnormalities and result in unnecessary patient examinations. We construct a deep learning model to improve the overall accuracy of the diagnosis of fetal cranial anomalies to aid prenatal diagnosis. In our collected multi-center dataset of fetal craniocerebral anomalies covering four typical anomalies of the fetal central nervous system (CNS): anencephaly, encephalocele (including meningocele), holoprosencephaly, and rachischisis, patient-level prediction accuracy reaches 94.5%, with an AUROC value of 99.3%. In the subgroup analyses, our model is applicable to the entire gestational period, with good identification of fetal anomaly types for any gestational period. Heatmaps superimposed on the ultrasound images not only provide a visual interpretation for the algorithm but also provide an intuitive visual aid to the physician by highlighting key areas that need to be reviewed, helping the physician to quickly identify and validate key areas. Finally, the retrospective reader study demonstrates that by combining the automatic prediction of the DL system with the professional judgment of the radiologist, the diagnostic accuracy and efficiency can be effectively improved and the misdiagnosis rate can be reduced, which has an important clinical application prospect.

LG · May 30, 2023

Probabilistic computation and uncertainty quantification with emerging covariance

Hengyuan Ma, Yang Qi, Li Zhang et al.

Building robust, interpretable, and secure AI systems requires quantifying and representing uncertainty under a probabilistic perspective to mimic human cognitive abilities. However, probabilistic computation presents significant challenges for most conventional artificial neural networks, as they are essentially implemented in a deterministic manner. In this paper, we develop an efficient probabilistic computation framework by truncating the probabilistic representation of neural activation up to its mean and covariance and construct a moment neural network that encapsulates the nonlinear coupling between the mean and covariance of the underlying stochastic network. We reveal that when only the mean but not the covariance is supervised during gradient-based learning, the unsupervised covariance spontaneously emerges from its nonlinear coupling with the mean and faithfully captures the uncertainty associated with model predictions. Our findings highlight the inherent simplicity of probabilistic computation by seamlessly incorporating uncertainty into model prediction, paving the way for integrating it into large-scale AI systems.

LG · Jul 2, 2019

Best k-layer neural network approximations

Lek-Heng Lim, Mateusz Michalek, Yang Qi

We show that the empirical risk minimization (ERM) problem for neural networks has no solution in general. Given a training set $s_1, \dots, s_n \in \mathbb{R}^p$ with corresponding responses $t_1,\dots,t_n \in \mathbb{R}^q$, fitting a $k$-layer neural network $\nu_\theta: \mathbb{R}^p \to \mathbb{R}^q$ involves estimation of the weights $\theta \in \mathbb{R}^m$ via an ERM: \[ \inf_{\theta \in \mathbb{R}^m} \; \sum_{i=1}^n \lVert t_i - \nu_\theta(s_i) \rVert_2^2. \] We show that even for $k = 2$, this infimum is not attainable in general for common activations like ReLU, hyperbolic tangent, and sigmoid functions. A high-level explanation is like that for the nonexistence of best rank-$r$ approximations of higher-order tensors (the set of parameters is not a closed set), but the geometry involved for best $k$-layer neural network approximations is more subtle. In addition, we show that for smooth activations $\sigma(x)= 1/\bigl(1 + \exp(-x)\bigr)$ and $\sigma(x)=\tanh(x)$, such failure to attain an infimum can happen on a positive-measured subset of responses. For the ReLU activation $\sigma(x)=\max(0,x)$, we completely classify cases where the ERM for a best two-layer neural network approximation attains its infimum. As an aside, we obtain a precise description of the geometry of the space of two-layer neural networks with $d$ neurons in the hidden layer: it is the join locus of a line and the $d$-secant locus of a cone.
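To make the ERM objective concrete, the sketch below evaluates the sum of squared residuals for a two-layer ReLU network. The parameterisation (hidden weights `A`, hidden biases `a`, output weights `B`, output biases `b`) is an assumption chosen for illustration and may not match the paper's exact definition of a $k$-layer network. The code only evaluates the objective at given weights and says nothing about whether the infimum is attained.

```python
def relu(x):
    """ReLU activation max(0, x)."""
    return max(0.0, x)

def two_layer(s, A, a, B, b, act=relu):
    """Evaluate s -> B * act(A * s + a) + b with matrices given as nested lists."""
    hidden = [act(sum(A[i][j] * s[j] for j in range(len(s))) + a[i])
              for i in range(len(A))]
    return [sum(B[k][i] * hidden[i] for i in range(len(hidden))) + b[k]
            for k in range(len(B))]

def empirical_risk(inputs, responses, A, a, B, b, act=relu):
    """Sum over the training set of the squared Euclidean residual norm."""
    total = 0.0
    for s, t in zip(inputs, responses):
        out = two_layer(s, A, a, B, b, act)
        total += sum((ti - oi) ** 2 for ti, oi in zip(t, out))
    return total

# Hypothetical training set with p = q = 1 and a single hidden neuron.
inputs = [[1.0], [2.0]]
responses = [[2.0], [4.0]]
risk_zero_net = empirical_risk(inputs, responses, [[0.0]], [0.0], [[0.0]], [0.0])
risk_exact_fit = empirical_risk(inputs, responses, [[1.0]], [0.0], [[2.0]], [0.0])
```

With all weights zero the risk is just the sum of squared responses, while the second set of weights reproduces the responses exactly, so its risk is zero.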

NA · Sep 6, 2018

Complex best $r$-term approximations almost always exist in finite dimensions

Yang Qi, Mateusz Michałek, Lek-Heng Lim

We show that in finite-dimensional nonlinear approximations, the best $r$-term approximant of a function $f$ almost always exists over $\mathbb{C}$ but that the same is not true over $\mathbb{R}$, i.e., the infimum $\inf_{f_1,\dots,f_r \in Y} \lVert f - f_1 - \dots - f_r \rVert$ is almost always attainable by complex-valued functions $f_1,\dots, f_r$ in $Y$, a set of functions that have some desired structures. Our result extends to functions that possess special properties like symmetry or skew-symmetry under permutations of arguments. For the case where $Y$ is the set of separable functions, the problem becomes that of best rank-$r$ tensor approximations. We show that over $\mathbb{C}$, any tensor almost always has a unique best rank-$r$ approximation. This extends to other notions of tensor ranks such as symmetric rank and alternating rank, to best $r$-block-terms approximations, and to best approximations by tensor networks. When applied to sparse-plus-low-rank approximations, we obtain that for any given $r$ and $k$, a general tensor has a unique best approximation by a sum of a rank-$r$ tensor and a $k$-sparse tensor with a fixed sparsity pattern; this arises in, for example, estimation of covariance matrices of a Gaussian hidden variable model with $k$ observed variables conditionally independent given $r$ hidden variables. The existential (but not the uniqueness) part of our result also applies to best approximations by a sum of a rank-$r$ tensor and a $k$-sparse tensor with no fixed sparsity pattern, as well as to tensor completion problems.

IT · Mar 4, 2016

Identifiability of an X-rank decomposition of polynomial maps

Pierre Comon, Yang Qi, Konstantin Usevich

In this paper, we study a polynomial decomposition model that arises in problems of system identification, signal processing and machine learning. We show that this decomposition is a special case of the X-rank decomposition, a powerful novel concept in algebraic geometry that generalizes the tensor CP decomposition. We prove new results on generic/maximal rank and on identifiability of a particular polynomial decomposition model. In the paper, we try to make results and basic tools accessible to a general audience (assuming no knowledge of algebraic geometry or its prerequisites).