Zhuozhuo Tu

QUANT-PH
6 papers
186 citations
Novelty: 56%
AI Score: 28

6 Papers

QUANT-PH · Jun 6, 2023
Transition Role of Entangled Data in Quantum Machine Learning

Xinbiao Wang, Yuxuan Du, Zhuozhuo Tu et al.

Entanglement serves as the resource to empower quantum computing. Recent progress has highlighted its positive impact on learning quantum dynamics, wherein the integration of entanglement into quantum operations or measurements of quantum machine learning (QML) models leads to substantial reductions in training data size, surpassing a specified prediction error threshold. However, an analytical understanding of how the entanglement degree in data affects model performance remains elusive. In this study, we address this knowledge gap by establishing a quantum no-free-lunch (NFL) theorem for learning quantum dynamics using entangled data. Contrary to previous findings, we prove that the impact of entangled data on prediction error exhibits a dual effect, depending on the number of permitted measurements. With a sufficient number of measurements, increasing the entanglement of training data consistently reduces the prediction error or decreases the required size of the training data to achieve the same prediction error. Conversely, when few measurements are allowed, employing highly entangled data could lead to an increased prediction error. The achieved results provide critical guidance for designing advanced QML protocols, especially for those tailored for execution on early-stage quantum computers with limited access to quantum resources.

QUANT-PH · May 10, 2022
Power of Quantum Generative Learning

Yuxuan Du, Zhuozhuo Tu, Bujiao Wu et al.

The intrinsic probabilistic nature of quantum mechanics invokes endeavors of designing quantum generative learning models (QGLMs). Despite the empirical achievements, the foundations and the potential advantages of QGLMs remain largely obscure. To narrow this knowledge gap, here we explore the generalization property of QGLMs, the capability to extend the model from learned to unknown data. We consider two prototypical QGLMs, quantum circuit Born machines and quantum generative adversarial networks, and explicitly give their generalization bounds. The result identifies superiorities of QGLMs over classical methods when quantum devices can directly access the target distribution and quantum kernels are employed. We further employ these generalization bounds to exhibit potential advantages in quantum state preparation and Hamiltonian learning. Numerical results of QGLMs in loading Gaussian distribution and estimating ground states of parameterized Hamiltonians accord with the theoretical analysis. Our work opens the avenue for quantitatively understanding the power of quantum generative learning models.

CR · Dec 30, 2021
Few-shot Backdoor Defense Using Shapley Estimation

Jiyang Guan, Zhuozhuo Tu, Ran He et al.

Deep neural networks have achieved impressive performance in a variety of tasks over the last decade, such as autonomous driving, face recognition, and medical diagnosis. However, prior works show that deep neural networks are easily manipulated into specific, attacker-decided behaviors in the inference stage by backdoor attacks, which inject small, malicious hidden triggers into model training and raise serious security threats. To determine the triggered neurons and protect against backdoor attacks, we exploit the Shapley value and develop a new approach called Shapley Pruning (ShapPruning) that successfully mitigates backdoor attacks from models in a data-insufficient situation (1 image per class or even free of data). Considering the interaction between neurons, ShapPruning identifies the few infected neurons (under 1% of all neurons) and manages to protect the model's structure and accuracy after pruning as many infected neurons as possible. To accelerate ShapPruning, we further propose a discarding threshold and an ε-greedy strategy that speed up Shapley estimation, making it possible to repair poisoned models in only a few minutes. Experiments demonstrate the effectiveness and robustness of our method against various attacks and tasks compared to existing methods.
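As a rough illustration of the kind of Shapley estimation the abstract refers to, the sketch below uses generic permutation sampling over a set of "neurons". It is not the ShapPruning algorithm: the discarding threshold and ε-greedy strategy are not modeled, and the names `shapley_monte_carlo`, `toy_value`, and `TOY_WEIGHTS` are hypothetical, chosen only for this example.

```python
import random


def shapley_monte_carlo(players, value_fn, num_permutations=200, seed=0):
    """Estimate Shapley values by averaging marginal contributions
    over randomly sampled orderings of the players."""
    rng = random.Random(seed)
    totals = {p: 0.0 for p in players}
    for _ in range(num_permutations):
        order = list(players)
        rng.shuffle(order)
        coalition = set()
        prev = value_fn(frozenset(coalition))
        for p in order:
            coalition.add(p)
            cur = value_fn(frozenset(coalition))
            totals[p] += cur - prev  # marginal contribution of p in this ordering
            prev = cur
    return {p: t / num_permutations for p, t in totals.items()}


# Hypothetical stand-in for "attack success when only these neurons are active".
# The game is additive, so each neuron's exact Shapley value equals its weight.
TOY_WEIGHTS = {"n0": 0.0, "n1": 0.7, "n2": 0.1}


def toy_value(coalition):
    return sum(TOY_WEIGHTS[p] for p in coalition)


estimates = shapley_monte_carlo(list(TOY_WEIGHTS), toy_value)
# Neurons with the largest estimated contribution would be pruning candidates.
ranked = sorted(estimates, key=estimates.get, reverse=True)
```

In a real setting the value function would require a forward pass through the model for each coalition, which is why cheaper estimation strategies matter.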

LG · Dec 12, 2021
Spatial-Temporal-Fusion BNN: Variational Bayesian Feature Layer

Shiye Lei, Zhuozhuo Tu, Leszek Rutkowski et al.

Bayesian neural networks (BNNs) have become a principal approach to alleviate overconfident predictions in deep learning, but they often suffer from scaling issues due to a large number of distribution parameters. In this paper, we discover that the first layer of a deep network possesses multiple disparate optima when solely retrained. This indicates a large posterior variance when the first layer is replaced by a Bayesian layer, which motivates us to design a spatial-temporal-fusion BNN (STF-BNN) for efficiently scaling BNNs to large models: (1) a neural network is first trained normally from scratch to realize fast training; and (2) the first layer is then converted to a Bayesian layer and inferred by stochastic variational inference, while the other layers are fixed. Compared to vanilla BNNs, our approach can greatly reduce the training time and the number of parameters, which helps scale BNNs efficiently. We further provide theoretical guarantees on the generalizability of STF-BNN and its capability to mitigate overconfidence. Comprehensive experiments demonstrate that STF-BNN (1) achieves state-of-the-art performance on prediction and uncertainty quantification; (2) significantly improves adversarial robustness and privacy preservation; and (3) considerably reduces training time and memory costs.

CV · Nov 18, 2020
Stretchable Cells Help DARTS Search Better

Tao Huang, Shan You, Yibo Yang et al.

Differentiable neural architecture search (DARTS) has gained much success in discovering flexible and diverse cell types. To reduce the evaluation gap, the supernet is expected to have identical layers with the target network. However, even for this consistent search, the searched cells often suffer from poor performance, especially for the supernet with fewer layers, as current DARTS methods are prone to wide and shallow cells, and this topology collapse induces sub-optimal searched cells. In this paper, we alleviate this issue by endowing the cells with explicit stretchability, so the search can be directly implemented on our stretchable cells for both operation type and topology simultaneously. Concretely, we introduce a set of topological variables and a combinatorial probabilistic distribution to explicitly model the target topology. With more diverse and complex topologies, our method adapts well for various layer numbers. Extensive experiments on CIFAR-10 and ImageNet show that our stretchable cells obtain better performance with fewer layers and parameters. For example, our method can improve DARTS by 0.28% accuracy on CIFAR-10 dataset with 45% parameters reduced or 2.9% with similar FLOPs on ImageNet dataset.

ML · Nov 13, 2018
Theoretical Analysis of Adversarial Learning: A Minimax Approach

Zhuozhuo Tu, Jingwei Zhang, Dacheng Tao

Here we propose a general theoretical method for analyzing the risk bound in the presence of adversaries. Specifically, we try to fit the adversarial learning problem into the minimax framework. We first show that the original adversarial learning problem can be reduced to a minimax statistical learning problem by introducing a transport map between distributions. Then, we prove a new risk bound for this minimax problem in terms of covering numbers under a weak version of Lipschitz condition. Our method can be applied to multi-class classification problems and commonly used loss functions such as the hinge and ramp losses. As some illustrative examples, we derive the adversarial risk bounds for SVMs, deep neural networks, and PCA, and our bounds have two data-dependent terms, which can be optimized for achieving adversarial robustness.
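For readers unfamiliar with the loss functions named in the abstract, here is a minimal sketch of the hinge and ramp losses applied to a multi-class margin. These are common textbook forms, not necessarily the exact definitions used in the paper; the function names and the `gamma` parameter are assumptions made for this example.

```python
def multiclass_margin(scores, label):
    """Score of the true class minus the best score among the other classes."""
    true_score = scores[label]
    best_other = max(s for i, s in enumerate(scores) if i != label)
    return true_score - best_other


def hinge_loss(margin):
    """Hinge loss: zero once the margin reaches 1, linear below that."""
    return max(0.0, 1.0 - margin)


def ramp_loss(margin, gamma=1.0):
    """Ramp loss: the hinge shape rescaled by gamma and clipped to [0, 1]."""
    return min(1.0, max(0.0, 1.0 - margin / gamma))


# Example: three class scores, with class 1 as the true label.
m = multiclass_margin([1.0, 3.0, 0.5], label=1)
losses = (hinge_loss(m), ramp_loss(m))
```

Unlike the hinge loss, the ramp loss is bounded, so a single badly misclassified point cannot contribute more than 1 to the empirical risk.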