AINov 2, 2025
How Focused Are LLMs? A Quantitative Study via Repetitive Deterministic Prediction TasksWanda Hou, Leon Zhou, Hong-Ye Hu et al.
We investigate the performance of large language models on repetitive deterministic prediction tasks and study how the sequence accuracy rate scales with output length. Each such task involves repeating the same operation n times. Examples include letter replacement in strings following a given rule, integer addition, and multiplication of string operators in many body quantum mechanics. If the model performs the task through a simple repetition algorithm, the success rate should decay exponentially with sequence length. In contrast, our experiments on leading large language models reveal a sharp double exponential drop beyond a characteristic length scale, forming an accuracy cliff that marks the transition from reliable to unstable generation. This indicates that the models fail to execute each operation independently. To explain this phenomenon, we propose a statistical physics inspired model that captures the competition between external conditioning from the prompt and internal interference among generated tokens. The model quantitatively reproduces the observed crossover and provides an interpretable link between attention induced interference and sequence level failure. Fitting the model to empirical results across multiple models and tasks yields effective parameters that characterize the intrinsic error rate and error accumulation factor for each model task pair, offering a principled framework for understanding the limits of deterministic accuracy in large language models.
QUANT-PHFeb 27, 2024
Demonstration of Robust and Efficient Quantum Property Learning with Shallow ShadowsHong-Ye Hu, Andi Gu, Swarnadeep Majumder et al.
Extracting information efficiently from quantum systems is a major component of quantum information processing tasks. Randomized measurements, or classical shadows, enable predicting many properties of arbitrary quantum states using few measurements. While random single-qubit measurements are experimentally friendly and suitable for learning low-weight Pauli observables, they perform poorly for nonlocal observables. Prepending a shallow random quantum circuit before measurements maintains this experimental friendliness, but also has favorable sample complexities for observables beyond low-weight Paulis, including high-weight Paulis and global low-rank properties such as fidelity. However, in realistic scenarios, quantum noise accumulated with each additional layer of the shallow circuit biases the results. To address these challenges, we propose the \emph{robust shallow shadows protocol}. Our protocol uses Bayesian inference to learn the experimentally relevant noise model and mitigate it in postprocessing. This mitigation introduces a bias-variance trade-off: correcting for noise-induced bias comes at the cost of a larger estimator variance. Despite this increased variance, as we demonstrate on a superconducting quantum processor, our protocol correctly recovers state properties such as expectation values, fidelity, and entanglement entropy, while maintaining a lower sample complexity compared to the random single qubit measurement scheme. We also theoretically analyze the effects of noise on sample complexity and show how the optimal choice of the shallow shadow depth varies with noise strength. This combined theoretical and experimental analysis positions the robust shallow shadow protocol as a scalable, robust, and sample-efficient protocol for characterizing quantum states on current quantum computing platforms.
QUANT-PHMay 4
Opportunities and challenges in scaling quantum error detection on hardwareYanis Le Fur, Ethan Egger, Hong-Ye Hu et al.
Quantum error detection can produce unbiased expectation values that exponentially converge to noiseless results as the code distance is increased. Despite this, its performance as an error mitigation technique is relatively understudied on quantum hardware because of its two main drawbacks: (i) the number of samples increases exponentially in the circuit depth/noise level, and (ii) the classical processing generally grows exponentially in the code distance, though exceptions exist. Additionally, the constant (but often large) overhead of embedding the code and logical operations on hardware can make accuracy worse instead of better. In this work, we seek to provide a clear picture of these opportunities and challenges for scaling quantum error detection on hardware. We do so by performing a detailed benchmarking study on real and simulated noisy quantum computers, using the repetition code and triangular color code for memory experiments and logical computations with up to $74$ physical qubits. In addition to these benchmarks, we estimate the pseudothreshold of codes to map the frontier of error detection on current and future quantum computers. Despite the challenges, our results show strong promise for scaling quantum error detection on hardware.
QUANT-PHJan 5, 2024
Digital-analog quantum learning on Rydberg atom arraysJonathan Z. Lu, Lucy Jiao, Kristina Wolinski et al.
We propose hybrid digital-analog learning algorithms on Rydberg atom arrays, combining the potentially practical utility and near-term realizability of quantum learning with the rapidly scaling architectures of neutral atoms. Our construction requires only single-qubit operations in the digital setting and global driving according to the Rydberg Hamiltonian in the analog setting. We perform a comprehensive numerical study of our algorithm on both classical and quantum data, given respectively by handwritten digit classification and unsupervised quantum phase boundary learning. We show in the two representative problems that digital-analog learning is not only feasible in the near term, but also requires shorter circuit depths and is more robust to realistic error models as compared to digital learning schemes. Our results suggest that digital-analog learning opens a promising path towards improved variational quantum learning experiments in the near term.
QUANT-PHFeb 17, 2025
Ansatz-free Hamiltonian learning with Heisenberg-limited scalingHong-Ye Hu, Muzhou Ma, Weiyuan Gong et al.
Learning the unknown interactions that govern a quantum system is crucial for quantum information processing, device benchmarking, and quantum sensing. The problem, known as Hamiltonian learning, is well understood under the assumption that interactions are local, but this assumption may not hold for arbitrary Hamiltonians. Previous methods all require high-order inverse polynomial dependency with precision, unable to surpass the standard quantum limit and reach the gold standard Heisenberg-limited scaling. Whether Heisenberg-limited Hamiltonian learning is possible without prior assumptions about the interaction structures, a challenge we term \emph{ansatz-free Hamiltonian learning}, remains an open question. In this work, we present a quantum algorithm to learn arbitrary sparse Hamiltonians without any structure constraints using only black-box queries of the system's real-time evolution and minimal digital controls to attain Heisenberg-limited scaling in estimation error. Our method is also resilient to state-preparation-and-measurement errors, enhancing its practical feasibility. We numerically demonstrate our ansatz-free protocol for learning physical Hamiltonians and validating analog quantum simulations, benchmarking our performance against the state-of-the-art Heisenberg-limited learning approach. Moreover, we establish a fundamental trade-off between total evolution time and quantum control on learning arbitrary interactions, revealing the intrinsic interplay between controllability and total evolution time complexity for any learning algorithm. These results pave the way for further exploration into Heisenberg-limited Hamiltonian learning in complex quantum systems under minimal assumptions, potentially enabling new benchmarking and verification protocols.
QUANT-PHAug 26, 2025
Universal Dynamics with Globally Controlled Analog Quantum SimulatorsHong-Ye Hu, Abigail McClain Gomez, Liyuan Chen et al.
Analog quantum simulators with global control fields have emerged as powerful platforms for exploring complex quantum phenomena. Recent breakthroughs, such as the coherent control of thousands of atoms, highlight the growing potential for quantum applications at scale. Despite these advances, a fundamental theoretical question remains unresolved: to what extent can such systems realize universal quantum dynamics under global control? Here we establish a necessary and sufficient condition for universal quantum computation using only global pulse control, proving that a broad class of analog quantum simulators is, in fact, universal. We further extend this framework to fermionic and bosonic systems, including modern platforms such as ultracold atoms in optical superlattices. Crucially, to connect the theoretical possibility with experimental reality, we introduce a new control technique into the experiment - direct quantum optimal control. This method enables the synthesis of complex effective Hamiltonians and allows us to incorporate realistic hardware constraints. To show its practical power, we experimentally engineer three-body interactions outside the blockade regime and demonstrate topological dynamics on a Rydberg atom array. Using the new control framework, we overcome key experimental challenges, including hardware limitations and atom position fluctuations in the non-blockade regime, by identifying smooth, short-duration pulses that achieve high-fidelity dynamics. Experimental measurements reveal dynamical signatures of symmetry-protected-topological edge modes, confirming both the expressivity and feasibility of our approach. Our work opens a new avenue for quantum simulation beyond native hardware Hamiltonians, enabling the engineering of effective multi-body interactions and advancing the frontier of quantum information processing with globally-controlled analog platforms.
QUANT-PHDec 25, 2024
Derandomized shallow shadows: Efficient Pauli learning with bounded-depth circuitsKatherine Van Kirk, Christian Kokail, Jonathan Kunjummen et al.
Efficiently estimating large numbers of non-commuting observables is an important subroutine of many quantum science tasks. We present the derandomized shallow shadows (DSS) algorithm for efficiently learning a large set of non-commuting observables, using shallow circuits to rotate into measurement bases. Exploiting tensor network techniques to ensure polynomial scaling of classical resources, our algorithm outputs a set of shallow measurement circuits that approximately minimizes the sample complexity of estimating a given set of Pauli strings. We numerically demonstrate systematic improvement, in comparison with state-of-the-art techniques, for energy estimation of quantum chemistry benchmarks and verification of quantum many-body systems, and we observe DSS's performance consistently improves as one allows deeper measurement circuits. These results indicate that in addition to being an efficient, low-depth, stand-alone algorithm, DSS can also benefit many larger quantum algorithms requiring estimation of multiple non-commuting observables.
QUANT-PHMay 10, 2023
Discovery of Optimal Quantum Error Correcting Codes via Reinforcement LearningVincent Paul Su, ChunJun Cao, Hong-Ye Hu et al.
The recently introduced Quantum Lego framework provides a powerful method for generating complex quantum error correcting codes (QECCs) out of simple ones. We gamify this process and unlock a new avenue for code design and discovery using reinforcement learning (RL). One benefit of RL is that we can specify \textit{arbitrary} properties of the code to be optimized. We train on two such properties, maximizing the code distance, and minimizing the probability of logical error under biased Pauli noise. For the first, we show that the trained agent identifies ways to increase code distance beyond naive concatenation, saturating the linear programming bound for CSS codes on 13 qubits. With a learning objective to minimize the logical error probability under biased Pauli noise, we find the best known CSS code at this task for $\lesssim 20$ qubits. Compared to other (locally deformed) CSS codes, including Surface, XZZX, and 2D Color codes, our $[[17,1,3]]$ code construction actually has \textit{lower} adversarial distance, yet better protects the logical information, highlighting the importance of QECC desiderata. Lastly, we comment on how this RL framework can be used in conjunction with physical quantum devices to tailor a code without explicit characterization of the noise model.
QUANT-PHOct 8, 2021
Differentiable Programming of Isometric Tensor NetworksChenhua Geng, Hong-Ye Hu, Yijian Zou
Differentiable programming is a new programming paradigm which enables large scale optimization through automatic calculation of gradients also known as auto-differentiation. This concept emerges from deep learning, and has also been generalized to tensor network optimizations. Here, we extend the differentiable programming to tensor networks with isometric constraints with applications to multiscale entanglement renormalization ansatz (MERA) and tensor network renormalization (TNR). By introducing several gradient-based optimization methods for the isometric tensor network and comparing with Evenbly-Vidal method, we show that auto-differentiation has a better performance for both stability and accuracy. We numerically tested our methods on 1D critical quantum Ising spin chain and 2D classical Ising model. We calculate the ground state energy for the 1D quantum model and internal energy for the classical model, and scaling dimensions of scaling operators and find they all agree with the theory well.
QUANT-PHFeb 19, 2021
Hamiltonian-Driven Shadow Tomography of Quantum StatesHong-Ye Hu, Yi-Zhuang You
Classical shadow tomography provides an efficient method for predicting functions of an unknown quantum state from a few measurements of the state. It relies on a unitary channel that efficiently scrambles the quantum information of the state to the measurement basis. Facing the challenge of realizing deep unitary circuits on near-term quantum devices, we explore the scenario in which the unitary channel can be shallow and is generated by a quantum chaotic Hamiltonian via time evolution. We provide an unbiased estimator of the density matrix for all ranges of the evolution time. We analyze the sample complexity of the Hamiltonian-driven shadow tomography. For Pauli observables, we find that it can be more efficient than the unitary-2-design-based shadow tomography in a sequence of intermediate time windows that range from an order-1 scrambling time to a time scale of $D^{1/6}$, given the Hilbert space dimension $D$. In particular, the efficiency of predicting diagonal Pauli observables is improved by a factor of $D$ without sacrificing the efficiency of predicting off-diagonal Pauli observables.
LGSep 30, 2020
RG-Flow: A hierarchical and explainable flow model based on renormalization group and sparse priorHong-Ye Hu, Dian Wu, Yi-Zhuang You et al.
Flow-based generative models have become an important class of unsupervised learning approaches. In this work, we incorporate the key ideas of renormalization group (RG) and sparse prior distribution to design a hierarchical flow-based generative model, RG-Flow, which can separate information at different scales of images and extract disentangled representations at each scale. We demonstrate our method on synthetic multi-scale image datasets and the CelebA dataset, showing that the disentangled representations enable semantic manipulation and style mixing of the images at different scales. To visualize the latent representations, we introduce receptive fields for flow-based models and show that the receptive fields of RG-Flow are similar to those of convolutional neural networks. In addition, we replace the widely adopted isotropic Gaussian prior distribution by the sparse Laplacian distribution to further enhance the disentanglement of representations. From a theoretical perspective, our proposed method has $O(\log L)$ complexity for inpainting of an image with edge length $L$, compared to previous generative models with $O(L^2)$ complexity.