CL May 25
A general tensor-structured compression scheme for efficient large language models
Ying Lu, Peng-Fei Zhou, Qi-Xuan Fang et al.
Large language models (LLMs) are dominated by dense linear transformations, whose storage, memory and computational overheads hinder efficient adaptation and deployment while masking the functional impacts of structural simplification. Here we present Tensor Mixture (MixT), a general tensor-structured compression scheme that replaces targeted dense linear layers with natively executable mixtures of tensor operators. Operating directly on generic linear projections instead of model-specific components, MixT is potentially applicable across Transformer-based LLMs and other dense neural mappings. We evaluate MixT on Qwen3-8B and LLaMA2-7B under a unified recovery protocol, identifying a broad compressible regime in which MMLU accuracy is largely preserved before an abrupt transition at model-specific boundaries. This transition coincides with coordinated shifts in output entropy, prediction entropy and inter-layer geometry. At the LLaMA2-7B transition boundary, MixT reduces full-model parameters by 47.5\%, inference FLOPs by 37.1\%, training FLOPs by 52.1\% and peak inference memory by 60.4\%, demonstrating its practical potential for lower-cost LLM compression.
QM Mar 11, 2023
Intelligent diagnostic scheme for lung cancer screening with Raman spectra data by tensor network machine learning
Yu-Jia An, Sheng-Chen Bai, Lin Cheng et al.
Artificial intelligence (AI) has had a tremendous impact on the biomedical sciences, from academic research to clinical applications such as biomarker detection and diagnosis, optimization of treatment, and identification of new therapeutic targets in drug discovery. However, contemporary AI technologies, particularly deep machine learning (ML), suffer severely from non-interpretability, which can lead to incorrect predictions in an uncontrolled way. Interpretability is particularly crucial to ML for clinical diagnosis, as its users must gain the necessary sense of security and trust from firm grounds or convincing interpretations. In this work, we propose a tensor-network (TN)-ML method to reliably identify lung cancer patients and their stages by screening Raman spectra data of volatile organic compounds (VOCs) in exhaled breath, which are generally suitable as biomarkers and are considered an ideal route to non-invasive lung cancer screening. The prediction of TN-ML is based on the mutual distances of the breath samples mapped to the quantum Hilbert space. Thanks to the quantum probabilistic interpretation, the certainty of the predictions can be quantitatively characterized. The accuracy of the samples with high certainty is almost 100$\%$. The incorrectly classified samples exhibit obviously lower certainty, and can thus be identified as anomalies, which are handled by human experts to guarantee high reliability. Our work sheds light on shifting ``AI for biomedical sciences'' from the conventional non-interpretable ML schemes to interpretable human-ML interactive approaches, for the purpose of high accuracy and reliability.
AI Jun 12, 2025
Reasoning RAG via System 1 or System 2: A Survey on Reasoning Agentic Retrieval-Augmented Generation for Industry Challenges
Jintao Liang, Gang Su, Huifeng Lin et al.
Retrieval-Augmented Generation (RAG) has emerged as a powerful framework to overcome the knowledge limitations of Large Language Models (LLMs) by integrating external retrieval with language generation. While early RAG systems based on static pipelines have shown effectiveness in well-structured tasks, they struggle in real-world scenarios requiring complex reasoning, dynamic retrieval, and multi-modal integration. To address these challenges, the field has shifted toward Reasoning Agentic RAG, a paradigm that embeds decision-making and adaptive tool use directly into the retrieval process. In this paper, we present a comprehensive review of Reasoning Agentic RAG methods, categorizing them into two primary systems: predefined reasoning, which follows fixed modular pipelines to boost reasoning, and agentic reasoning, where the model autonomously orchestrates tool interaction during inference. We analyze representative techniques under both paradigms, covering architectural design, reasoning strategies, and tool coordination. Finally, we discuss key research challenges and propose future directions to advance the flexibility, robustness, and applicability of reasoning agentic RAG systems. Our collection of the relevant research is organized in a repository at https://github.com/ByebyeMonica/Reasoning-Agentic-RAG.
LG Jul 15, 2024
Transformer-based Drum-level Prediction in a Boiler Plant with Delayed Relations among Multivariates
Gang Su, Sun Yang, Zhishuai Li
The steam drum water level is a critical parameter that directly impacts the safety and efficiency of power plant operations. However, predicting the drum water level in boilers is challenging due to complex non-linear process dynamics originating from long time delays and interrelations, as well as measurement noise. This paper investigates the application of Transformer-based models for predicting drum water levels in a steam boiler plant. Leveraging the capabilities of Transformer architectures, this study aims to develop an accurate and robust predictive framework to anticipate water level fluctuations and facilitate proactive control strategies. To this end, a pipeline is proposed that includes 1) data preprocessing, 2) causal relation analysis, 3) delay inference, 4) variable augmentation, and 5) prediction. Through extensive experimentation and analysis, the effectiveness of Transformer-based approaches in steam drum water level prediction is evaluated, highlighting their potential to enhance operational stability and optimize plant performance.
QUANT-PH Nov 19, 2023
Tensor networks for interpretable and efficient quantum-inspired machine learning
Shi-Ju Ran, Gang Su
It is a critical challenge to simultaneously achieve high interpretability and efficiency with the current schemes of deep machine learning (ML). The tensor network (TN), a well-established mathematical tool originating from quantum mechanics, has shown unique advantages in developing efficient ``white-box'' ML schemes. Here, we give a brief review of the inspiring progress made in TN-based ML. On one hand, the interpretability of TN ML rests on a solid theoretical foundation based on quantum information and many-body physics. On the other hand, high efficiency can be obtained from the powerful TN representations and the advanced computational techniques developed in quantum many-body physics. With the fast development of quantum computers, TN is expected to give rise to novel schemes runnable on quantum hardware, heading towards ``quantum artificial intelligence'' in the near future.
LG Jan 10, 2020
Tangent-Space Gradient Optimization of Tensor Network for Machine Learning
Zheng-zhi Sun, Shi-ju Ran, Gang Su
Gradient-based optimization methods for deep machine learning models suffer from vanishing and exploding gradients, particularly when the computational graph becomes deep. In this work, we propose tangent-space gradient optimization (TSGO) for probabilistic models to keep the gradients from vanishing or exploding. The central idea is to guarantee orthogonality between the variational parameters and the gradients. The optimization is then implemented by rotating the parameter vector towards the direction of the gradient. We explain and test TSGO in tensor network (TN) machine learning, where the TN describes the joint probability distribution as a normalized state $\left| \psi \right\rangle$ in Hilbert space. We show that the gradient can be restricted to the tangent space of the $\left\langle \psi | \psi \right\rangle = 1$ hypersphere. Instead of relying on additional adaptive methods to control the learning rate, as in deep learning, the learning rate of TSGO is naturally determined by the rotation angle $\theta$ as $\eta = \tan \theta$. Our numerical results reveal better convergence of TSGO in comparison with off-the-shelf Adam.
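The rotation update described in this abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the function names `tsgo_step` and `overlap_grad`, the toy loss $f(w) = -w \cdot t$, and the explicit projection step are assumptions made for illustration. The sketch keeps a unit-norm parameter vector, projects the gradient onto the tangent space of the unit sphere, and takes a step of size $\eta = \tan \theta$ along the descent direction, which after renormalization rotates the vector by exactly $\theta$.

```python
import math

def tsgo_step(w, grad, theta):
    """One rotation step for a unit-norm parameter vector w (illustrative sketch).

    The gradient is projected onto the tangent space of the unit sphere at w,
    then w is moved by eta = tan(theta) along the normalized descent direction
    and renormalized, which rotates w by the angle theta.
    """
    dot = sum(wi * gi for wi, gi in zip(w, grad))
    g_tan = [gi - dot * wi for wi, gi in zip(w, grad)]  # component orthogonal to w
    g_norm = math.sqrt(sum(g * g for g in g_tan))
    if g_norm < 1e-12:  # already stationary on the sphere
        return list(w)
    eta = math.tan(theta)  # learning rate fixed by the rotation angle
    stepped = [wi - eta * g / g_norm for wi, g in zip(w, g_tan)]
    norm = math.sqrt(sum(s * s for s in stepped))  # equals 1 / cos(theta)
    return [s / norm for s in stepped]

def overlap_grad(target):
    """Gradient of the assumed toy loss f(w) = -w . target (constant in w)."""
    return [-t for t in target]

# Toy problem: rotate w from the x-axis towards a target unit vector on the y-axis.
target = [0.0, 1.0, 0.0]
w0 = [1.0, 0.0, 0.0]
w1 = tsgo_step(w0, overlap_grad(target), theta=0.1)

w = list(w0)
for _ in range(50):
    w = tsgo_step(w, overlap_grad(target), theta=0.1)
final_overlap = sum(wi * ti for wi, ti in zip(w, target))
```

With a fixed angle the vector ends up oscillating within $\theta$ of the optimum, so in practice one would presumably shrink $\theta$ over time; that schedule is left out of the sketch.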
ML Jul 24, 2019
Quantum Compressed Sensing with Unsupervised Tensor-Network Machine Learning
Shi-Ju Ran, Zheng-Zhi Sun, Shao-Ming Fei et al.
We propose tensor-network compressed sensing (TNCS) by combining the ideas of compressed sensing, tensor network (TN), and machine learning, which permits novel and efficient quantum communications of realistic data. The strategy is to use an unsupervised TN machine learning algorithm to obtain the entangled state $|Ψ\rangle$ that describes the probability distribution of the large body of classical information to be communicated. To transfer a specific piece of information with $|Ψ\rangle$, our proposal is to encode such information in the separable state with the minimal distance to the measured state $|Φ\rangle$, which is obtained by partially measuring $|Ψ\rangle$ in a designed way. To this end, a measuring protocol analogous to compressed sensing with neural-network machine learning is suggested, where the measurements are designed to minimize the uncertainty of information from the probability distribution given by $|Φ\rangle$. In this way, those who have $|Φ\rangle$ can reliably access the information by simply measuring $|Φ\rangle$. We propose q-sparsity to characterize the sparsity of quantum states and the efficiency of quantum communications by TNCS. The high q-sparsity is essentially due to the fact that the TN states that describe the probability distribution well obey the area law of entanglement entropy. Tested on realistic datasets (hand-written digits and fashion images), TNCS is shown to possess high efficiency and accuracy, where the security of communications is guaranteed by fundamental quantum principles.
LG Mar 26, 2019
Generative Tensor Network Classification Model for Supervised Machine Learning
Zheng-Zhi Sun, Cheng Peng, Ding Liu et al.
Tensor network (TN) has recently triggered extensive interest in developing machine-learning models in quantum many-body Hilbert space. Here we propose a generative TN classification (GTNC) approach for supervised learning. The strategy is to train a generative TN for each class of samples to construct the classifiers. The classification is implemented by comparing distances in the many-body Hilbert space. Numerical experiments with GTNC show impressive performance on the MNIST and Fashion-MNIST datasets. The testing accuracy is competitive with the state-of-the-art convolutional neural network and higher than that of the naive Bayes classifier (a generative classifier) and the support vector machine. Moreover, GTNC is more efficient than the existing TN models, which are in general discriminative. By investigating the distances in the many-body Hilbert space, we find that (a) the samples are naturally clustered in such a space; and (b) bounding the bond dimensions of the TNs to finite values corresponds to removing redundant information in image recognition. These two characteristics make GTNC an adaptive and universal model of excellent performance.
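The idea of classifying by distance in Hilbert space can be sketched in a heavily simplified form. The code below is not GTNC itself: instead of a trained generative TN per class, each class is represented by a single hypothetical prototype product state, and the per-pixel feature map $x \mapsto (\cos\frac{\pi x}{2}, \sin\frac{\pi x}{2})$ is an assumption (a choice common in TN machine learning, not stated in the abstract). All function names and the toy data are illustrative. For two product states the overlap factorizes over pixels, so the log-fidelity is a simple sum.

```python
import math

def local_overlap(a, b):
    """Overlap of two single-pixel states under the assumed feature map
    x -> (cos(pi*x/2), sin(pi*x/2)), for pixel values in [0, 1]."""
    return (math.cos(math.pi * a / 2) * math.cos(math.pi * b / 2)
            + math.sin(math.pi * a / 2) * math.sin(math.pi * b / 2))

def log_fidelity(sample, prototype):
    """log |<sample|prototype>|^2 for two product states.

    Logs are summed rather than multiplying overlaps directly, because the
    product underflows quickly as the number of pixels grows.
    """
    total = 0.0
    for a, b in zip(sample, prototype):
        total += 2.0 * math.log(max(local_overlap(a, b), 1e-300))
    return total

def classify(sample, prototypes):
    """Return the label whose prototype state is closest (highest fidelity)."""
    return max(prototypes, key=lambda label: log_fidelity(sample, prototypes[label]))

# Hypothetical 4-pixel "images" standing in for class states.
prototypes = {
    "dark": [0.0, 0.1, 0.0, 0.2],
    "bright": [1.0, 0.9, 0.8, 1.0],
}
label_a = classify([0.1, 0.0, 0.1, 0.1], prototypes)
label_b = classify([0.9, 1.0, 0.9, 0.7], prototypes)
```

In the approach the abstract describes, the prototype would be replaced by the generative TN state trained on each class, and the fidelity would be computed by contracting the sample's product state with that TN.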
ML Oct 13, 2017
Machine Learning by Unitary Tensor Network of Hierarchical Tree Structure
Ding Liu, Shi-Ju Ran, Peter Wittek et al.
The resemblance between the methods used in quantum many-body physics and in machine learning has drawn considerable attention. In particular, tensor networks (TNs) and deep learning architectures bear striking similarities to the extent that TNs can be used for machine learning. Previous results used one-dimensional TNs in image recognition, showing limited scalability and flexibility. In this work, we train two-dimensional hierarchical TNs to solve image recognition problems, using a training algorithm derived from the multi-scale entanglement renormalization ansatz. This approach introduces mathematical connections among quantum many-body physics, quantum information theory, and machine learning. While keeping the TN unitary in the training phase, TN states are defined, which encode classes of images into quantum many-body states. We study the quantum features of the TN states, including quantum entanglement and fidelity. We find these quantities could be properties that characterize the image classes, as well as the machine learning tasks.