Shi-Ju Ran

QUANT-PH

h-index: 19

20 papers

222 citations

Novelty: 52%

AI Score: 44

Ranked #50,461 of 194,257 authors (top 26%); #187 in QUANT-PH (top 20%)

20 Papers

2.3 QUANT-PH Mar 29, 2022

Quantum compiling with a variational instruction set for accurate and fast quantum computing

Ying Lu, Peng-Fei Zhou, Shao-Ming Fei et al.

The quantum instruction set (QIS) is defined as the quantum gates that are physically realizable by controlling the qubits in quantum hardware. Compiling quantum circuits into the product of the gates in a properly defined QIS is a fundamental step in quantum computing. We here propose the quantum variational instruction set (QuVIS) formed by flexibly designed multi-qubit gates for higher speed and accuracy of quantum computing. The controlling of qubits for realizing the gates in a QuVIS is variationally achieved using the fine-grained time optimization algorithm. Significant reductions in both the error accumulation and time cost are demonstrated in realizing the swaps of multiple qubits and quantum Fourier transformations, compared with the compiling by a standard QIS such as the quantum microinstruction set (QuMIS, formed by several one- and two-qubit gates including one-qubit rotations and controlled-NOT gates). With the same requirement on quantum hardware, the time cost for QuVIS is reduced to less than one half of that for QuMIS. Simultaneously, the error is suppressed algebraically as the depth of the compiled circuit is reduced. As a general compiling approach with high flexibility and efficiency, QuVIS can be defined for different quantum circuits and be adapted to the quantum hardware with different interactions.

6.6 QUANT-PH Jul 13, 2022

Unsupervised Recognition of Informative Features via Tensor Network Machine Learning and Quantum Entanglement Variations

Sheng-Chen Bai, Yi-Cheng Tang, Shi-Ju Ran

Given an image of a white shoe drawn on a blackboard, how are the white pixels deemed (say by human minds) to be informative for recognizing the shoe without any labeling information on the pixels? Here we investigate such a ``white shoe'' recognition problem from the perspective of tensor network (TN) machine learning and quantum entanglement. Utilizing a generative TN that captures the probability distribution of the features as quantum amplitudes, we propose an unsupervised recognition scheme of informative features with the variations of entanglement entropy (EE) caused by designed measurements. In this way, a given sample, where the values of its features are statistically meaningless, is mapped to the variations of EE that statistically characterize the gain of information. We show that the EE variations identify the features that are critical to recognize this specific sample, and the EE itself reveals the information distribution of the probabilities represented by the TN model. The signs of the variations further reveal the entanglement structures among the features. We test the validity of our scheme on a toy dataset of strip images, the MNIST dataset of hand-drawn digits, the fashion-MNIST dataset of the pictures of fashion articles, and the images of brain cells. Our scheme opens the avenue to the quantum-inspired and interpreted unsupervised learning, which can be applied to, e.g., image segmentation and object detection.

1.2 QUANT-PH Jul 21, 2023

Persistent Ballistic Entanglement Spreading with Optimal Control in Quantum Spin Chains

Ying Lu, Pei Shi, Xiao-Han Wang et al.

Entanglement propagation provides a key route to understanding quantum many-body dynamics in and out of equilibrium. The entanglement entropy (EE) usually approaches a sub-saturation value known as the Page value $\tilde{S}_{P} =\tilde{S} - dS$ (with $\tilde{S}$ the maximum of EE and $dS$ the Page correction) in, e.g., random unitary evolutions. The ballistic spreading of EE usually appears at early times and breaks down long before the Page value is reached. In this work, we uncover that the magnetic field that maximizes the EE robustly induces persistent ballistic spreading of entanglement in quantum spin chains. The linear growth of EE is demonstrated to persist until the maximal $\tilde{S}$ (along with a flat entanglement spectrum) is reached. The robustness of ballistic spreading and the enhancement of EE under such an optimal control are demonstrated, in particular when the initial state is perturbed by random pure states (RPS's). These are argued to result from the endomorphism of the time evolution under such an entanglement-enhancing optimal control for the RPS's.
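The Page value mentioned in this abstract can be checked numerically. The following is a minimal sketch, not taken from the paper: it draws Haar-random pure states of a small qubit chain and compares their half-chain EE with the maximum $\tilde{S}$. The system size, the number of samples, and all function names are illustrative assumptions. For an equal bipartition, Page's formula gives a correction $dS$ close to 1/2 (natural logarithm).

```python
import numpy as np

def random_pure_state(n_qubits, rng):
    """Haar-random pure state: normalized complex Gaussian vector (assumed setup)."""
    dim = 2 ** n_qubits
    psi = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    return psi / np.linalg.norm(psi)

def half_chain_entropy(psi, n_qubits):
    """Von Neumann entanglement entropy (natural log) across the middle cut."""
    dim_left = 2 ** (n_qubits // 2)
    # Reshape the state into a (left, right) matrix; its singular values
    # are the Schmidt coefficients of the bipartition.
    schmidt = np.linalg.svd(psi.reshape(dim_left, -1), compute_uv=False)
    probs = schmidt ** 2
    probs = probs[probs > 1e-15]
    return float(-np.sum(probs * np.log(probs)))

rng = np.random.default_rng(0)
n_qubits = 12                                  # illustrative size
s_max = (n_qubits // 2) * np.log(2)            # maximal EE, S~
s_avg = np.mean([half_chain_entropy(random_pure_state(n_qubits, rng), n_qubits)
                 for _ in range(20)])
page_correction = s_max - s_avg                # numerical estimate of dS
print(f"S_max = {s_max:.4f}, <S> = {s_avg:.4f}, dS = {page_correction:.4f}")
```

The gap between `s_max` and `s_avg` is the sub-saturation that the abstract contrasts with the persistent linear growth up to $\tilde{S}$.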

1.2 QM Mar 11, 2023

Intelligent diagnostic scheme for lung cancer screening with Raman spectra data by tensor network machine learning

Yu-Jia An, Sheng-Chen Bai, Lin Cheng et al.

Artificial intelligence (AI) has had a tremendous impact on biomedical sciences, from academic research to clinical applications, such as biomarker detection and diagnosis, optimization of treatment, and identification of new therapeutic targets in drug discovery. However, contemporary AI technologies, particularly deep machine learning (ML), suffer severely from non-interpretability, which might uncontrollably lead to incorrect predictions. Interpretability is particularly crucial to ML for clinical diagnosis, as the consumers must gain a necessary sense of security and trust from firm grounds or convincing interpretations. In this work, we propose a tensor-network (TN)-ML method to reliably predict lung cancer patients and their stages by screening Raman spectra data of volatile organic compounds (VOCs) in exhaled breath, which are generally suitable as biomarkers and are considered an ideal route to non-invasive lung cancer screening. The prediction of TN-ML is based on the mutual distances of the breath samples mapped to the quantum Hilbert space. Thanks to the quantum probabilistic interpretation, the certainty of the predictions can be quantitatively characterized. The accuracy of the samples with high certainty is almost 100$\%$. The incorrectly-classified samples exhibit obviously lower certainty, and thus can be decipherably identified as anomalies, which will be handled by human experts to guarantee high reliability. Our work sheds light on shifting the ``AI for biomedical sciences'' from the conventional non-interpretable ML schemes to the interpretable human-ML interactive approaches, for the purpose of high accuracy and reliability.

8.3 LG May 16

Confidence Geometry Reveals Trace-Level Correctness in Large Language Model Reasoning

Shuo Liu, Ding Liu, Shi-Ju Ran

Large language models (LLMs) generate not only reasoning text, but also token-level confidence trajectories that record how uncertainty evolves during inference. Whether these trajectories are relevant to reasoning correctness remains unclear. Here we show that confidence trajectories encode a content-agnostic confidence geometry associated with trace-level final-answer correctness. Using only token-level confidence values, without access to the input question, reasoning text, hidden states, or external verifiers, we find that low-dimensional representations of confidence trajectories separate correct from incorrect reasoning traces. Across GSM8K, MATH, and MMLU, this geometric separation is quantitatively linked to downstream predictability: stronger clustering of correct and incorrect traces, measured by the Davies--Bouldin index, consistently corresponds to higher correctness-discrimination AUC. We further show that correctness-related information is enriched in the tail of reasoning, suggesting that late-stage confidence dynamics carry key correctness signals. We propose NeuralConf, a lightweight estimator that learns from confidence trajectories for correctness evaluation. Under a fixed trace budget, NeuralConf-derived scores improve confidence-weighted answer aggregation over majority voting, tail confidence, and other static baselines. These results reveal that LLMs expose trace-intrinsic statistical signals of correctness through their own confidence dynamics, offering a route to improve inference using information already present within generation.

2.1 ML Aug 8, 2022

Deep Machine Learning Reconstructing Lattice Topology with Strong Thermal Fluctuations

Xiao-Han Wang, Pei Shi, Bin Xi et al.

Applying artificial intelligence to scientific problems (namely AI for science) is currently under hot debate. However, scientific problems differ greatly from conventional ones involving images, texts, etc., and new challenges emerge from the unbalanced scientific data and the complicated effects of the physical setups. In this work, we demonstrate the validity of the deep convolutional neural network (CNN) for reconstructing the lattice topology (i.e., spin connectivities) in the presence of strong thermal fluctuations and unbalanced data. Taking the kinetic Ising model with Glauber dynamics as an example, the CNN maps the time-dependent local magnetic moments (a single-node feature) evolved from a specific initial configuration (dubbed an evolution instance) to the probabilities of the presence of the possible couplings. Our scheme differs from the previous ones, which might require knowledge of the node dynamics, the responses to perturbations, or the evaluation of statistical quantities such as correlations or transfer entropy from many evolution instances. The fine tuning avoids the "barren plateau" caused by the strong thermal fluctuations at high temperatures. Accurate reconstructions can be made where the thermal fluctuations dominate over the correlations and, consequently, the statistical methods in general fail. Meanwhile, we unveil the generalization power of the CNN in dealing with the instances evolved from unlearnt initial spin configurations and those with unlearnt lattices. We raise an open question on learning with unbalanced data in the nearly "double-exponentially" large sample space.

11.3 QUANT-PH Nov 19, 2023

Tensor networks for interpretable and efficient quantum-inspired machine learning

Shi-Ju Ran, Gang Su

It is a critical challenge to simultaneously gain high interpretability and efficiency with the current schemes of deep machine learning (ML). Tensor network (TN), which is a well-established mathematical tool originating from quantum mechanics, has shown its unique advantages in developing efficient ``white-box'' ML schemes. Here, we give a brief review of the inspiring progress made in TN-based ML. On one hand, interpretability of TN ML is accommodated with the solid theoretical foundation based on quantum information and many-body physics. On the other hand, high efficiency can be rendered from the powerful TN representations and the advanced computational techniques developed in quantum many-body physics. With the fast development of quantum computers, TN is expected to conceive novel schemes runnable on quantum hardware, heading towards the ``quantum artificial intelligence'' in the forthcoming future.

2.3 QUANT-PH Oct 13, 2024

Universal scaling laws in quantum-probabilistic machine learning by tensor network towards interpreting representation and generalization powers

Sheng-Chen Bai, Shi-Ju Ran

Interpreting the representation and generalization powers has been a long-standing issue in the field of machine learning (ML) and artificial intelligence. This work contributes to uncovering the emergence of universal scaling laws in quantum-probabilistic ML. We take the generative tensor network (GTN) in the form of a matrix product state as an example and show that with an untrained GTN (such as a random TN state), the negative logarithmic likelihood (NLL) $L$ generally increases linearly with the number of features $M$, i.e., $L \simeq k M + const$. This is a consequence of the so-called ``catastrophe of orthogonality,'' which states that quantum many-body states tend to become exponentially orthogonal to each other as $M$ increases. We reveal that while gaining information through training, the linear scaling law is suppressed by a negative quadratic correction, leading to $L \simeq βM - αM^2 + const$. The scaling coefficients exhibit logarithmic relationships with the number of training samples and the number of quantum channels $χ$. The emergence of the quadratic correction term in NLL for the testing (training) set can be regarded as evidence of the generalization (representation) power of GTN. Over-parameterization can be identified by the deviation in the values of $α$ between training and testing sets while increasing $χ$. We further investigate how orthogonality in the quantum feature map relates to the satisfaction of quantum probabilistic interpretation, as well as to the representation and generalization powers of GTN. The unveiling of universal scaling laws in quantum-probabilistic ML would be a valuable step toward establishing a white-box ML scheme interpreted within the quantum probabilistic framework.
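The linear scaling $L \simeq k M + const$ for untrained states can be illustrated with a much simpler stand-in than a GTN. The sketch below is an assumption-laden toy, not the paper's setup: it replaces the matrix product state by random product states of $M$ qubits, so the fidelity between two states factorizes over sites and the negative log-fidelity becomes a sum of $M$ independent terms. All names and sample sizes are hypothetical. For Haar-random single-qubit states the per-site fidelity is uniform on $[0, 1]$, so the expected slope in this toy is 1.

```python
import numpy as np

def random_qubit_states(shape, rng):
    """Array of Haar-random single-qubit states; last axis holds the 2 amplitudes."""
    v = rng.normal(size=shape + (2,)) + 1j * rng.normal(size=shape + (2,))
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

def mean_negative_log_fidelity(num_features, num_pairs, rng):
    """Average of -ln |<a|b>|^2 over pairs of random product states of M qubits."""
    a = random_qubit_states((num_pairs, num_features), rng)
    b = random_qubit_states((num_pairs, num_features), rng)
    # The product-state fidelity factorizes: one overlap per site.
    site_fidelity = np.abs(np.sum(np.conj(a) * b, axis=-1)) ** 2
    return float(np.mean(-np.sum(np.log(site_fidelity), axis=-1)))

rng = np.random.default_rng(1)
feature_counts = np.array([10, 20, 40, 80])    # values of M (illustrative)
nll = np.array([mean_negative_log_fidelity(m, 2000, rng) for m in feature_counts])
slope, intercept = np.polyfit(feature_counts, nll, 1)
print(f"fitted slope k = {slope:.3f}, intercept = {intercept:.3f}")
```

The exponential decay of the overlap with $M$ is what the abstract calls the catastrophe of orthogonality. The quadratic correction $-αM^2$ arises only after training and is not reproduced by this untrained toy.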

1.2 QUANT-PH May 14, 2024

Universal replication of chaotic characteristics by classical and quantum machine learning

Sheng-Chen Bai, Shi-Ju Ran

Replicating chaotic characteristics of non-linear dynamics by machine learning (ML) has recently drawn wide attention. In this work, we propose that an ML model, trained to predict the state one step ahead from the several latest historic states, can accurately replicate the bifurcation diagram and the Lyapunov exponents of discrete dynamic systems. The characteristics for different values of the hyper-parameters are captured universally by a single ML model, while the previous works considered training the ML model independently by fixing the hyper-parameters to specific values. Our benchmarks on the one- and two-dimensional logistic maps show that a variational quantum circuit can reproduce the long-term characteristics with higher accuracy than the long short-term memory (a well-recognized classical ML model). Our work reveals an essential difference between the ML for the chaotic characteristics and that for standard tasks, from the perspective of the relation between performance and model complexity. Our results suggest that the quantum circuit model exhibits potential advantages in mitigating over-fitting, achieving higher accuracy and stability.
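For reference, the Lyapunov exponent that such a model is asked to replicate can be computed directly for the one-dimensional logistic map $x_{n+1} = r x_n (1 - x_n)$. This is a minimal sketch of the standard ground-truth calculation, not of the ML method itself. The initial condition, step counts, and function name are assumptions chosen for illustration. At $r = 4$ the exact value is $\ln 2$, while a negative exponent signals a periodic regime.

```python
import math

def logistic_lyapunov(r, x0=0.2, n_transient=1000, n_steps=100_000):
    """Estimate the Lyapunov exponent of x -> r*x*(1-x) as the orbit average
    of ln|f'(x)|, with f'(x) = r*(1 - 2x)."""
    x = x0
    for _ in range(n_transient):               # discard the transient
        x = r * x * (1.0 - x)
    total = 0.0
    for _ in range(n_steps):
        x = r * x * (1.0 - x)
        derivative = abs(r * (1.0 - 2.0 * x))
        total += math.log(max(derivative, 1e-300))  # guard against log(0)
    return total / n_steps

lyap_chaotic = logistic_lyapunov(4.0)          # fully chaotic regime
lyap_periodic = logistic_lyapunov(3.2)         # stable period-2 cycle
print(f"r=4.0: {lyap_chaotic:.4f} (ln 2 = {math.log(2):.4f})")
print(f"r=3.2: {lyap_periodic:.4f}")
```

Sweeping `r` over a grid and recording the sign of the exponent gives the chaotic and periodic windows of the bifurcation diagram, which a single trained model is expected to reproduce across parameter values.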

12.3 LG May 10, 2023

Compressing Neural Networks Using Tensor Networks with Exponentially Fewer Variational Parameters

Yong Qing, Ke Li, Peng-Fei Zhou et al.

A neural network (NN) designed for challenging machine learning tasks is in general a highly nonlinear mapping that contains massive variational parameters. High complexity of NN, if unbounded or unconstrained, might unpredictably cause severe issues including overfitting, loss of generalization power, and unbearable cost of hardware. In this work, we propose a general compression scheme that significantly reduces the variational parameters of NN's, regardless of their specific types (linear, convolutional, etc.), by encoding them to deep automatically differentiable tensor network (ADTN) that contains exponentially-fewer free parameters. Superior compression performance of our scheme is demonstrated on several widely-recognized NN's (FC-2, LeNet-5, AlexNet, ZFNet and VGG-16) and datasets (MNIST, CIFAR-10 and CIFAR-100). For instance, we compress two linear layers in VGG-16 with approximately $10^{7}$ parameters to two ADTN's with just 424 parameters, improving the testing accuracy on CIFAR-10 from $90.17\%$ to $91.74\%$. We argue that the deep structure of ADTN is an essential reason for the remarkable compression performance of ADTN, compared to existing compression schemes that are mainly based on tensor decompositions/factorization and shallow tensor networks. Our work suggests deep TN as an exceptionally efficient mathematical structure for representing the variational parameters of NN's, which exhibits superior compressibility over the commonly-used matrices and multi-way arrays.

1.2 QUANT-PH Jul 1, 2021

Non-parametric Semi-Supervised Learning in Many-body Hilbert Space with Rescaled Logarithmic Fidelity

Wei-Ming Li, Shi-Ju Ran

In quantum and quantum-inspired machine learning, the very first step is to embed the data in quantum space known as Hilbert space. Developing quantum kernel function (QKF), which defines the distances among the samples in the Hilbert space, belongs to the fundamental topics for machine learning. In this work, we propose the rescaled logarithmic fidelity (RLF) and non-parametric semi-supervised learning in the quantum space, which we name as RLF-NSSL. The rescaling takes advantage of the non-linearity of the kernel to tune the mutual distances of samples in the Hilbert space, and meanwhile avoids the exponentially-small fidelities between quantum many-qubit states. Being non-parametric excludes the possible effects from the variational parameters, and evidently demonstrates the advantages from the space itself. We compare RLF-NSSL with several well-known non-parametric algorithms including naive Bayes classifiers, k-nearest neighbors, and spectral clustering. Our method exhibits better accuracy particularly for the unsupervised case with no labeled samples and the few-shot cases with small numbers of labeled samples. With the visualizations by t-stochastic neighbor embedding, our results imply that the machine learning in the Hilbert space complies with the principles of maximal coding rate reduction, where the low-dimensional data exhibit within-class compressibility, between-class discrimination, and overall diversity. Our proposals can be applied to other quantum and quantum-inspired machine learning, including the methods using the parametric models such as tensor networks, quantum circuits, and quantum neural networks.

2.3 QUANT-PH Jun 6, 2021

Predicting Quantum Potentials by Deep Neural Network and Metropolis Sampling

Rui Hong, Peng-Fei Zhou, Bin Xi et al.

The hybridizations of machine learning and quantum physics have had an essential impact on the methodology in both fields. Inspired by the quantum potential neural network, we here propose to solve for the potential in the Schrödinger equation given the eigenstate, by combining Metropolis sampling with a deep neural network, which we dub the Metropolis potential neural network (MPNN). A loss function is proposed to explicitly involve the energy in the optimization for its accurate evaluation. Benchmarking on the harmonic oscillator and hydrogen atom, MPNN shows excellent accuracy and stability in predicting not just the potential that satisfies the Schrödinger equation, but also the eigen-energy. Our proposal could potentially be applied to ab-initio simulations, and to inversely solving other partial differential equations in physics and beyond.

3.3 QUANT-PH Jun 3, 2021

Preparation of Many-body Ground States by Time Evolution with Variational Microscopic Magnetic Fields and Incomplete Interactions

Ying Lu, Yue-Min Li, Peng-Fei Zhou et al.

State preparation is of fundamental importance in quantum physics, which can be realized by constructing the quantum circuit as a unitary that transforms the initial state to the target, or implementing a quantum control protocol to evolve to the target state with a designed Hamiltonian. In this work, we study the latter on quantum many-body systems by the time evolution with fixed couplings and variational magnetic fields. Specifically, we consider preparing the ground states of the Hamiltonians containing certain interactions that are missing in the Hamiltonians for the time evolution. An optimization method is proposed to optimize the magnetic fields by "fine-graining" the discretization of time, in order to gain high precision and stability. The back propagation technique is utilized to obtain the gradients of the fields against the logarithmic fidelity. Our method is tested on preparing the ground state of the Heisenberg chain with the time evolution by the XY and Ising interactions, and its performance surpasses two baseline methods that use local and global optimization strategies, respectively. Our work can be applied and generalized to other quantum models such as those defined on higher dimensional lattices. It suggests a way to reduce the complexity of the required interactions for implementing quantum control or other tasks in quantum information and computation by means of optimizing the magnetic fields.

2.3 QUANT-PH Apr 30, 2021

Automatically Differentiable Quantum Circuit for Many-qubit State Preparation

Peng-Fei Zhou, Rui Hong, Shi-Ju Ran

Constructing quantum circuits for efficient state preparation belongs to the central topics in the field of quantum information and computation. As the number of qubits grows fast, methods to derive large-scale quantum circuits are strongly desired. In this work, we propose the automatically differentiable quantum circuit (ADQC) approach to efficiently prepare arbitrary quantum many-qubit states. A key ingredient is to introduce the latent gates whose decompositions give the unitary gates that form the quantum circuit. The circuit is optimized by updating the latent gates using back propagation to minimize the distance between the evolved and target states. Taking the ground states of quantum lattice models and random matrix product states as examples, with the number of qubits where processing the full coefficients is unlikely, ADQC obtains high fidelities with small numbers of layers $N_L \sim O(1)$. Superior accuracy is reached compared with the existing state-preparation approach based on the matrix product disentangler. The parameter complexity of MPS can be significantly reduced by ADQC with the compression ratio $r \sim O(10^{-3})$. Our work sheds light on the "intelligent construction" of quantum circuits for many-qubit systems by combining with the machine learning methods.

5.8 LG Dec 22, 2020

Residual Matrix Product State for Machine Learning

Ye-Ming Meng, Jing Zhang, Peng Zhang et al.

Tensor network, which originates from quantum physics, is emerging as an efficient tool for classical and quantum machine learning. Nevertheless, there still exists a considerable accuracy gap between tensor network and the sophisticated neural network models for classical machine learning. In this work, we combine the ideas of matrix product state (MPS), the simplest tensor network structure, and residual neural network and propose the residual matrix product state (ResMPS). The ResMPS can be treated as a network where its layers map the "hidden" features to the outputs (e.g., classifications), and the variational parameters of the layers are the functions of the features of the samples (e.g., pixels of images). This is different from a neural network, where the layers map feed-forwardly the features to the output. The ResMPS can be equipped with non-linear activations and dropout layers, and outperforms the state-of-the-art tensor network models in terms of efficiency, stability, and expression power. Besides, ResMPS is interpretable from the perspective of polynomial expansion, where the factorization and exponential machines naturally emerge. Our work contributes to connecting and hybridizing neural and tensor networks, which is crucial to further enhance our understanding of the working mechanisms and improve the performance of both models.

4.3 QUANT-PH Dec 5, 2020 Code

Deep learning Local Reduced Density Matrices for Many-body Hamiltonian Estimation

Xinran Ma, Z. C. Tu, Shi-Ju Ran

Human experts cannot efficiently access the physical information of quantum many-body states by simply "reading" the coefficients, but have to rely on prior knowledge such as order parameters and quantum measurements. In this work, we demonstrate that a convolutional neural network (CNN) can learn from the coefficients of local reduced density matrices to estimate the physical parameters of the many-body Hamiltonians, such as coupling strengths and magnetic fields, provided that the states are the ground states. We propose QubismNet that consists of two main parts: the Qubism map that visualizes the ground states (or the purified reduced density matrices) as images, and a CNN that maps the images to the target physical parameters. By assuming certain constraints on the training set for the sake of balance, QubismNet exhibits impressive powers of learning and generalization on several quantum spin models. While the training samples are restricted to the states from certain ranges of the parameters, QubismNet can accurately estimate the parameters of the states beyond such training regions. For instance, our results show that QubismNet can estimate the magnetic fields near the critical point by learning from the states away from the critical vicinity. Our work illuminates a data-driven way to infer the Hamiltonians that give the designed ground states, and therefore would benefit the existing and future generalizations of quantum technologies such as Hamiltonian-based quantum simulations and state tomography.

7.2 LG Jan 10, 2020

Tangent-Space Gradient Optimization of Tensor Network for Machine Learning

Zheng-zhi Sun, Shi-ju Ran, Gang Su

The gradient-based optimization method for deep machine learning models suffers from gradient vanishing and exploding problems, particularly when the computational graph becomes deep. In this work, we propose the tangent-space gradient optimization (TSGO) for the probabilistic models to keep the gradients from vanishing or exploding. The central idea is to guarantee the orthogonality between the variational parameters and the gradients. The optimization is then implemented by rotating the parameter vector towards the direction of the gradient. We explain and test TSGO in tensor network (TN) machine learning, where the TN describes the joint probability distribution as a normalized state $\left| ψ\right\rangle $ in Hilbert space. We show that the gradient can be restricted to the tangent space of the $\left\langle ψ\right.\left| ψ\right\rangle = 1$ hyper-sphere. Instead of additional adaptive methods to control the learning rate in deep learning, the learning rate of TSGO is naturally determined by the angle $θ$ as $η= \tan θ$. Our numerical results reveal better convergence of TSGO in comparison to the off-the-shelf Adam.
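The rotation update described in this abstract can be sketched in a few lines. This is a hedged illustration rather than the authors' implementation. It assumes a real parameter vector on the unit sphere and a toy loss $-\ln (ψ \cdot t)^2$ for a hypothetical target vector $t$; the function name `tsgo_step` is made up for this example. The gradient is projected onto the tangent space, a step with $η = \tan θ$ is taken along the normalized tangent direction, and renormalizing yields an exact rotation by $θ$.

```python
import numpy as np

def tsgo_step(psi, grad, theta):
    """Rotate the unit vector psi by angle theta along the descent direction.

    The radial part of grad is projected out so the update direction is
    orthogonal to psi; a step of size eta = tan(theta) followed by
    normalization equals cos(theta)*psi - sin(theta)*direction.
    """
    tangent = grad - np.dot(psi, grad) * psi
    norm = np.linalg.norm(tangent)
    if norm < 1e-12:                           # already at a stationary point
        return psi
    direction = tangent / norm
    new_psi = psi - np.tan(theta) * direction
    return new_psi / np.linalg.norm(new_psi)

# Toy problem (an assumption for illustration): maximize overlap with a target.
rng = np.random.default_rng(2)
target = rng.normal(size=16)
target /= np.linalg.norm(target)
psi = rng.normal(size=16)
psi /= np.linalg.norm(psi)
initial_overlap = abs(np.dot(psi, target))

theta = 0.05                                   # fixed rotation angle per step
for _ in range(200):
    grad = -2.0 * target / np.dot(psi, target)  # gradient of -ln (psi.t)^2
    psi = tsgo_step(psi, grad, theta)
final_overlap = abs(np.dot(psi, target))
print(f"overlap: {initial_overlap:.3f} -> {final_overlap:.4f}")
```

Because every step is a rotation by the same angle, the update size does not depend on the gradient magnitude. With a fixed `theta` the state ends up oscillating within `theta` of the optimum, so a practical schedule would presumably shrink the angle over time.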

3.2 ML Dec 30, 2019 Code

Bayesian Tensor Network with Polynomial Complexity for Probabilistic Machine Learning

Shi-Ju Ran

It is known that describing or calculating the conditional probabilities of multiple events is exponentially expensive. In this work, Bayesian tensor network (BTN) is proposed to efficiently capture the conditional probabilities of multiple sets of events with polynomial complexity. BTN is a directed acyclic graphical model that forms a subset of TN. To test its validity for exponentially many events, BTN is applied to image recognition, where the classification is mapped to capturing the conditional probabilities in an exponentially large sample space. Competitive performance is achieved by the BTN with simple tree network structures. Analogous to the tensor network simulations of quantum systems, the validity of the simple-tree BTN implies an ``area law'' of fluctuations in image recognition problems.

13.7 LG Mar 26, 2019

Generative Tensor Network Classification Model for Supervised Machine Learning

Zheng-Zhi Sun, Cheng Peng, Ding Liu et al.

Tensor network (TN) has recently triggered extensive interest in developing machine-learning models in quantum many-body Hilbert space. Here we propose a generative TN classification (GTNC) approach for supervised learning. The strategy is to train the generative TN for each class of the samples to construct the classifiers. The classification is implemented by comparing the distance in the many-body Hilbert space. The numerical experiments by GTNC show impressive performance on the MNIST and Fashion-MNIST datasets. The testing accuracy is competitive with the state-of-the-art convolutional neural network while higher than the naive Bayes classifier (a generative classifier) and support vector machine. Moreover, GTNC is more efficient than the existing TN models that are in general discriminative. By investigating the distances in the many-body Hilbert space, we find that (a) the samples are naturally clustering in such a space; and (b) bounding the bond dimensions of the TN's to finite values corresponds to removing redundant information in the image recognition. These two characteristics make GTNC an adaptive and universal model of excellent performance.

11.6 ML Mar 24, 2018 Code

Entanglement-guided architectures of machine learning by quantum tensor network

Yuhan Liu, Xiao Zhang, Maciej Lewenstein et al.

It is a fundamental, but still elusive question whether the schemes based on quantum mechanics, in particular on quantum entanglement, can be used for classical information processing and machine learning. Even a partial answer to this question would bring important insights to both fields of machine learning and quantum mechanics. In this work, we implement simple numerical experiments, related to pattern/image classification, in which we represent the classifiers by many-qubit quantum states written in the matrix product states (MPS). Classical machine learning algorithm is applied to these quantum states to learn the classical data. We explicitly show how quantum entanglement (i.e., single-site and bipartite entanglement) can emerge in such represented images. Entanglement characterizes here the importance of data, and such information is practically used to guide the architecture of MPS, and improve its efficiency. The number of needed qubits can be reduced to less than 1/10 of the original number, which is within the access of the state-of-the-art quantum computers. We expect such numerical experiments could open new paths in characterizing classical machine learning algorithms, and at the same time shed light on the generic quantum simulations/computations of machine learning tasks.