Zeyi Tao

QUANT-PH

h-index10

4papers

732citations

Novelty50%

AI Score32

Ranked #127,338 of 194,257 authors (top 66%)#586 in QUANT-PH (top 62%)

4 Papers

5.9QUANT-PHAug 4, 2022Code

Scalable Quantum Neural Networks for Classification

Jindi Wu, Zeyi Tao, Qun Li

Many recent machine learning tasks resort to quantum computing to improve classification accuracy and training efficiency by taking advantage of quantum mechanics, known as quantum machine learning (QML). The variational quantum circuit (VQC) is frequently utilized to build a quantum neural network (QNN), which is a counterpart to the conventional neural network. Due to hardware limitations, however, current quantum devices only allow one to use few qubits to represent data and perform simple quantum computations. The limited quantum resource on a single quantum device degrades the data usage and limits the scale of the quantum circuits, preventing quantum advantage to some extent. To alleviate this constraint, we propose an approach to implementing a scalable quantum neural network (SQNN) by utilizing the quantum resource of multiple small-size quantum devices cooperatively. In an SQNN system, several quantum devices are used as quantum feature extractors, extracting local features from an input instance in parallel, and a quantum device works as a quantum predictor, performing prediction over the local features collected through classical communication channels. The quantum feature extractors in the SQNN system are independent of each other, so one can flexibly use quantum devices of varying sizes, with larger quantum devices extracting more local features. Especially, the SQNN can be performed on a single quantum device in a modular fashion. Our work is exploratory and carried out on a quantum system simulator using the TensorFlow Quantum library. The evaluation conducts a binary classification on the MNIST dataset. It shows that the SQNN model achieves a comparable classification accuracy to a regular QNN model of the same scale. Furthermore, it demonstrates that the SQNN model with more quantum resources can significantly improve classification accuracy.

4.3QUANT-PHMay 5, 2022Code

LAWS: Look Around and Warm-Start Natural Gradient Descent for Quantum Neural Networks

Zeyi Tao, Jindi Wu, Qi Xia et al.

Variational quantum algorithms (VQAs) have recently received significant attention from the research community due to their promising performance in Noisy Intermediate-Scale Quantum computers (NISQ). However, VQAs run on parameterized quantum circuits (PQC) with randomly initialized parameters are characterized by barren plateaus (BP) where the gradient vanishes exponentially in the number of qubits. In this paper, we first review quantum natural gradient (QNG), which is one of the most popular algorithms used in VQA, from the classical first-order optimization point of view. Then, we proposed a \underline{L}ook \underline{A}round \underline{W}arm-\underline{S}tart QNG (LAWS) algorithm to mitigate the widespread existing BP issues. LAWS is a combinatorial optimization strategy taking advantage of model parameter initialization and fast convergence of QNG. LAWS repeatedly reinitializes parameter search space for the next iteration parameter update. The reinitialized parameter search space is carefully chosen by sampling the gradient close to the current optimal. Moreover, we present a unified framework (WS-SGD) for integrating parameter initialization techniques into the optimizer. We provide the convergence proof of the proposed framework for both convex and non-convex objective functions based on Polyak-Lojasiewicz (PL) condition. Our experiment results show that the proposed algorithm could mitigate the BP and have better generalization ability in quantum classification problems.

2.0LGSep 20, 2023

Preconditioned Federated Learning

Zeyi Tao, Jindi Wu, Qun Li

Federated Learning (FL) is a distributed machine learning approach that enables model training in communication efficient and privacy-preserving manner. The standard optimization method in FL is Federated Averaging (FedAvg), which performs multiple local SGD steps between communication rounds. FedAvg has been considered to lack algorithm adaptivity compared to modern first-order adaptive optimizations. In this paper, we propose new communication-efficient FL algortithms based on two adaptive frameworks: local adaptivity (PreFed) and server-side adaptivity (PreFedOp). Proposed methods adopt adaptivity by using a novel covariance matrix preconditioner. Theoretically, we provide convergence guarantees for our algorithms. The empirical experiments show our methods achieve state-of-the-art performances on both i.i.d. and non-i.i.d. settings.

8.7LGJun 5

TALAN: Task-Aligned Latent Adaptation Networks for Targeted Post-Training of Large Language Models

Chengkai Zhang, Ziteng Liu, Junpu Wang et al.

Targeted post-training aims to improve reasoning, math, and code without degrading strengths. Low-rank adapters are efficient but task-global; activation interventions are input-aware but often require separate probes, vectors, or inference-time steering. We introduce TALAN (Task-Aligned Latent Adaptation Networks), a sequence-conditioned latent side path inserted into a transformer's residual stream and co-trained with a low-rank adapter in one SFT loop. TALAN compresses the active sequence into latent memory, remixes it into token-level perturbations, and writes them back through a controlled residual update. It is configured along six axes: insertion location, memory size, mixer, writeback rule, trainability scope, and gradient scale. Across four Qwen3-family backbones and four STEM/code benchmarks, TALAN improves matched LoRA and DoRA baselines. With LoRA, it yields a +1.41 point cross-model mean gain, positive on all four backbones and non-negative on all 16 model-benchmark cells. With DoRA, it yields a +1.85 point mean gain, positive on all backbones and on 13 of 16 cells. Paired seed checks support positive average effects but show nontrivial variance, so we treat them as sensitivity checks. Cost is small: <1% trainable parameters relative to the backbone and 1.01-1.02x inference overhead versus matched LoRA. A Llama-3.2-1B transfer probe is also positive under LoRA and rsLoRA across seven paired seeds, supporting a transfer beyond Qwen. Internal-state analyses suggest TALAN is a small complementary activation intervention. The matched adapter update is 80-1,700x larger than the TALAN perturbation, yet their directions have near-zero cosine; per-layer measurements show this small orthogonal perturbation propagates and amplifies through depth. TALAN offers a practical platform for studying steerable activation-level adaptation within standard adapter-based post-training.