11.3ROMay 27
S-Cheetah: A Novel Quadrupedal Robot with a 3-DOF Active Spine Learning Agile LocomotionZimu Li, Weibang Bai
The biological spine of quadrupeds enables sagittal flexion/extension, lateral bending, and axial rotation, playing a crucial role in highly agile and dexterous locomotion. While numerous studies have integrated active spinal joints into quadrupedal robots to enhance agility, most designs simplify control complexity by reducing spinal degrees of freedom (DOF), failing to achieve the spatial tri-axial rotation characteristic of biological spines. Consequently, replicating a multi-DOF biomimetic spine and effectively leveraging it to empower the agile locomotion of quadrupedal robots remains a significant research challenge. In this study, we present S-Cheetah, a quadrupedal robot featuring a 3-DOF bio-inspired serial active spine capable of biomimetic spatial tri-axial rotation. To empower the robot to fully utilize this active spine, we developed a specialized reinforcement learning framework to actively promote the engagement of the introduced spine and maximize the robot's locomotive capabilities by integrating an acceleration curriculum learning strategy with tailored reward functions, such as a gallop gait reward, a spine undulation reward, and a spine steering reward. Experimental results demonstrate that S-Cheetah can achieve a peak speed of 6.9 m/s using the rotary G2 gallop gait and an in-place turning rate of 7.2 rad/s. Besides, the system exhibits an emergent, feline-inspired aerial self-righting capability, allowing it to land stably on four feet from arbitrary orientations during free fall. Finally, through extensive evaluations across diverse locomotion tasks, we prove that the introduction of the proposed 3-DOF spine comprehensively enhances the locomotive agility of quadrupedal robots. Project website: himmy-robotics.github.io/scheetah
LGJun 5, 2023
R-Mixup: Riemannian Mixup for Biological NetworksXuan Kan, Zimu Li, Hejie Cui et al.
Biological networks are commonly used in biomedical and healthcare domains to effectively model the structure of complex biological systems with interactions linking biological entities. However, due to their characteristics of high dimensionality and low sample size, directly applying deep learning models on biological networks usually faces severe overfitting. In this work, we propose R-MIXUP, a Mixup-based data augmentation technique that suits the symmetric positive definite (SPD) property of adjacency matrices from biological networks with optimized training efficiency. The interpolation process in R-MIXUP leverages the log-Euclidean distance metrics from the Riemannian manifold, effectively addressing the swelling effect and arbitrarily incorrect label issues of vanilla Mixup. We demonstrate the effectiveness of R-MIXUP with five real-world biological network datasets on both regression and classification tasks. Besides, we derive a commonly ignored necessary condition for identifying the SPD matrices of biological networks and empirically study its influence on the model performance. The code implementation can be found in Appendix E.
QUANT-PHJul 15, 2022
Toward Super-polynomial Quantum Speedup of Equivariant Quantum Algorithms with SU($d$) SymmetryHan Zheng, Zimu Li, Sergii Strelchuk et al.
We introduce a framework of the equivariant convolutional quantum algorithms which is tailored for a number of machine-learning tasks on physical systems with arbitrary SU$(d)$ symmetries. It allows us to enhance a natural model of quantum computation -- permutational quantum computing (PQC) -- and define a more powerful model: PQC+. While PQC was shown to be efficiently classically simulatable, we exhibit a problem which can be efficiently solved on PQC+ machine, whereas no classical polynomial time algorithm is known; thus providing evidence against PQC+ being classically simulatable. We further discuss practical quantum machine learning algorithms which can be carried out in the paradigm of PQC+.
LGOct 2, 2023
Transformers are efficient hierarchical chemical graph learnersZihan Pengmei, Zimu Li, Chih-chan Tien et al.
Transformers, adapted from natural language processing, are emerging as a leading approach for graph representation learning. Contemporary graph transformers often treat nodes or edges as separate tokens. This approach leads to computational challenges for even moderately-sized graphs due to the quadratic scaling of self-attention complexity with token count. In this paper, we introduce SubFormer, a graph transformer that operates on subgraphs that aggregate information by a message-passing mechanism. This approach reduces the number of tokens and enhances learning long-range interactions. We demonstrate SubFormer on benchmarks for predicting molecular properties from chemical structures and show that it is competitive with state-of-the-art graph transformers at a fraction of the computational cost, with training times on the order of minutes on a consumer-grade graphics card. We interpret the attention weights in terms of chemical structures. We show that SubFormer exhibits limited over-smoothing and avoids over-squashing, which is prevalent in traditional graph neural networks.
LGNov 14, 2022
Unifying O(3) Equivariant Neural Networks Design with Tensor-Network FormalismZimu Li, Zihan Pengmei, Han Zheng et al.
Many learning tasks, including learning potential energy surfaces from ab initio calculations, involve global spatial symmetries and permutational symmetry between atoms or general particles. Equivariant graph neural networks are a standard approach to such problems, with one of the most successful methods employing tensor products between various tensors that transform under the spatial group. However, as the number of different tensors and the complexity of relationships between them increase, maintaining parsimony and equivariance becomes increasingly challenging. In this paper, we propose using fusion diagrams, a technique widely employed in simulating SU($2$)-symmetric quantum many-body problems, to design new equivariant components for equivariant neural networks. This results in a diagrammatic approach to constructing novel neural network architectures. When applied to particles within a given local neighborhood, the resulting components, which we term "fusion blocks," serve as universal approximators of any continuous equivariant function defined in the neighborhood. We incorporate a fusion block into pre-existing equivariant architectures (Cormorant and MACE), leading to improved performance with fewer parameters on a range of challenging chemical problems. Furthermore, we apply group-equivariant neural networks to study non-adiabatic molecular dynamics of stilbene cis-trans isomerization. Our approach, which combines tensor networks with equivariant neural networks, suggests a potentially fruitful direction for designing more expressive equivariant neural networks.
91.5CVMay 25
TriSplat: Simulation-Ready Feed-Forward 3D Scene ReconstructionWeijie Wang, Zimu Li, Jinchuan Shi et al.
Sparse-view 3D reconstruction is increasingly addressed with feed-forward splatting networks that predict explicit primitives directly from images. Yet most existing methods remain centered on Gaussian primitives and expose surfaces only indirectly: extracting a usable mesh for downstream simulation, physics reasoning, or embodied interaction still requires expensive post-hoc steps that break the feed-forward promise. This limitation is especially pronounced in pose-free settings, where scene structure and camera parameters must be estimated jointly from sparse observations. We present TriSplat, a feed-forward reconstruction network that represents scenes with oriented triangle primitives and directly exports simulation-ready mesh scenes from a single forward pass. Given input images, the network predicts local 3D point maps, triangle attributes, camera poses, and optional intrinsics. Rather than regressing triangle orientation as an unconstrained latent variable, our approach constructs geometry normals from the predicted point maps, refines them with an image-conditioned normal head, and converts them into stable local frames for triangle parameterization. A mono-normal bootstrap schedule further stabilizes early training, while opacity and blur scheduling progressively sharpens the learned surface representation for direct mesh extraction. Experiments on RealEstate10K and DL3DV show that this representation produces more geometry-faithful reconstructions than Gaussian feed-forward baselines while maintaining competitive novel-view rendering quality. Because the rendering primitives are themselves surface triangles, the output can be directly ingested by physics engines, collision detectors, and standard rendering pipelines without any conversion, making it a practical simulation-ready solution for feed-forward 3D scene reconstruction.
99.7QUANT-PHMar 26
Theory of (Co)homological Invariants on Quantum LDPC CodesZimu Li, Yuguo Shao, Fuchuan Wei et al.
With recent breakthroughs in the construction of good qLDPC codes and nearly good qLTCs, the study of (co)homological invariants of quantum code complexes, which fundamentally underlie their logical operations, has become evidently important. In this work, we establish a systematic framework for mathematically analyzing these invariants across a broad spectrum of constructions, from HGP codes to sheaf codes, by synthesizing advanced math tools. We generalize the notion of canonical logical representatives from HGP codes to the sheaf code setting, resolving a long-standing challenge in explicitly characterizing sheaf codewords. Building on this foundation, we present the first comprehensive computation of cup products within the intricate framework of sheaf codes. Given Artin's primitive root conjecture which holds under the generalized Riemann hypothesis, we prove that $\tildeÎ(N)$ independent cup products can be supported on almost good qLDPC codes and qLTCs of length N, opening the possibility of achieving linearly many parallel, nontrivial, constant-depth multi-controlled-Z gates. Moreover, by interpreting sheaf codes as covering spaces of HGP codes via graph lifts, we propose a scheme that inductively generates families of both HGP and sheaf codes in an interlaced fashion from a constant-size HGP code. Notably, the induction preserves all (co)homological invariants of the initial code. This provides a general framework for lifting invariants or logical gates from small codes to infinite code families, and enables efficient verification of such features by checking on small instances. Our theory provides a substantive methodology for studying invariants in HGP codes and extends it to sheaf codes. In doing so, we reveal deep and unexpected connections between qLDPC codes and math, thereby laying the groundwork for future advances in quantum coding, fault tolerance, and physics.
98.8QUANT-PHApr 2
Transversal non-Clifford gates on almost-good quantum LDPC and quantum locally testable codesYiming Li, Zimu Li, Zi-Wen Liu
We exhibit nontrivial transversal logical multi-controlled-$Z$ gates on $[\![N,Î(N),\tildeÎ(N)]\!]$ quantum low-density parity-check codes and $[\![N,Î(N),\tildeÎ(N)]\!]$ quantum locally testable codes with soundness $\tildeÎ(1)$, combining nearly optimal code parameters with fault-tolerant non-Clifford gates for the first time. Remarkably, our proofs are almost entirely algebraic-topological, showing that such presumably intricate logical gates naturally arise as a fundamental topological phenomenon. We develop a general framework for constructing a rich new family of homological invariant forms which we call ''cupcap gates'' that induce transversal logical multi-controlled-$Z$ and, building on insights from [Li et al., arXiv:2603.25831], covering space methods to certify their nontriviality. The claimed almost-good code results follow immediately as examples.
CVApr 25, 2025
Depth3DLane: Monocular 3D Lane Detection via Depth Prior DistillationDongxin Lyu, Han Huang, Cheng Tan et al.
Monocular 3D lane detection is challenging due to the difficulty in capturing depth information from single-camera images. A common strategy involves transforming front-view (FV) images into bird's-eye-view (BEV) space through inverse perspective mapping (IPM), facilitating lane detection using BEV features. However, IPM's flat-ground assumption and loss of contextual information lead to inaccuracies in reconstructing 3D information, especially height. In this paper, we introduce a BEV-based framework to address these limitations and improve 3D lane detection accuracy. Our approach incorporates a Hierarchical Depth-Aware Head that provides multi-scale depth features, mitigating the flat-ground assumption by enhancing spatial awareness across varying depths. Additionally, we leverage Depth Prior Distillation to transfer semantic depth knowledge from a teacher model, capturing richer structural and contextual information for complex lane structures. To further refine lane continuity and ensure smooth lane reconstruction, we introduce a Conditional Random Field module that enforces spatial coherence in lane predictions. Extensive experiments validate that our method achieves state-of-the-art performance in terms of z-axis error and outperforms other methods in the field in overall performance. The code is released at: https://anonymous.4open.science/r/Depth3DLane-DCDD.
LGApr 8, 2024
Technical Report: The Graph Spectral Token -- Enhancing Graph Transformers with Spectral InformationZihan Pengmei, Zimu Li
Graph Transformers have emerged as a powerful alternative to Message-Passing Graph Neural Networks (MP-GNNs) to address limitations such as over-squashing of information exchange. However, incorporating graph inductive bias into transformer architectures remains a significant challenge. In this report, we propose the Graph Spectral Token, a novel approach to directly encode graph spectral information, which captures the global structure of the graph, into the transformer architecture. By parameterizing the auxiliary [CLS] token and leaving other tokens representing graph nodes, our method seamlessly integrates spectral information into the learning process. We benchmark the effectiveness of our approach by enhancing two existing graph transformers, GraphTrans and SubFormer. The improved GraphTrans, dubbed GraphTrans-Spec, achieves over 10% improvements on large graph benchmark datasets while maintaining efficiency comparable to MP-GNNs. SubFormer-Spec demonstrates strong performance across various datasets.
QUANT-PHDec 14, 2021
Speeding up Learning Quantum States through Group Equivariant Convolutional Quantum AnsätzeHan Zheng, Zimu Li, Junyu Liu et al.
We develop a theoretical framework for $S_n$-equivariant convolutional quantum circuits with SU$(d)$-symmetry, building on and significantly generalizing Jordan's Permutational Quantum Computing (PQC) formalism based on Schur-Weyl duality connecting both SU$(d)$ and $S_n$ actions on qudits. In particular, we utilize the Okounkov-Vershik approach to prove Harrow's statement (Ph.D. Thesis 2005 p.160) on the equivalence between $\operatorname{SU}(d)$ and $S_n$ irrep bases and to establish the $S_n$-equivariant Convolutional Quantum Alternating Ansätze ($S_n$-CQA) using Young-Jucys-Murphy (YJM) elements. We prove that $S_n$-CQA is able to generate any unitary in any given $S_n$ irrep sector, which may serve as a universal model for a wide array of quantum machine learning problems with the presence of SU($d$) symmetry. Our method provides another way to prove the universality of Quantum Approximate Optimization Algorithm (QAOA) and verifies that 4-local SU($d$) symmetric unitaries are sufficient to build generic SU($d$) symmetric quantum circuits up to relative phase factors. We present numerical simulations to showcase the effectiveness of the ansätze to find the ground state energy of the $J_1$--$J_2$ antiferromagnetic Heisenberg model on the rectangular and Kagome lattices. Our work provides the first application of the celebrated Okounkov-Vershik's $S_n$ representation theory to quantum physics and machine learning, from which to propose quantum variational ansätze that strongly suggests to be classically intractable tailored towards a specific optimization problem.
QUANT-PHSep 12, 2021
Towards a variational Jordan-Lee-Preskill quantum algorithmJunyu Liu, Zimu Li, Han Zheng et al.
Rapid developments of quantum information technology show promising opportunities for simulating quantum field theory in near-term quantum devices. In this work, we formulate the theory of (time-dependent) variational quantum simulation of the 1+1 dimensional $λφ^4$ quantum field theory including encoding, state preparation, and time evolution, with several numerical simulation results. These algorithms could be understood as near-term variational quantum circuit (quantum neural network) analogs of the Jordan-Lee-Preskill algorithm, the basic algorithm for simulating quantum field theory using universal quantum devices. Besides, we highlight the advantages of encoding with harmonic oscillator basis based on the LSZ reduction formula and several computational efficiency such as when implementing a bosonic version of the unitary coupled cluster ansatz to prepare initial states. We also discuss how to circumvent the "spectral crowding" problem in the quantum field theory simulation and appraise our algorithm by both state and subspace fidelities.