Yan Zhu

h-index 18

9 papers

445 citations

Novelty 58%

AI Score 35

Ranked #104,073 of 194,257 authors (top 54%); #22,909 in LG (top 57%)

9 Papers

8.6 QUANT-PH Nov 3, 2022

Ya-Dong Wu, Yan Zhu, Ge Bai et al.

The task of testing whether two uncharacterized quantum devices behave in the same way is crucial for benchmarking near-term quantum computers and quantum simulators, but has so far remained open for continuous-variable quantum systems. In this Letter, we develop a machine learning algorithm for comparing unknown continuous variable states using limited and noisy data. The algorithm works on non-Gaussian quantum states for which similarity testing could not be achieved with previous techniques. Our approach is based on a convolutional neural network that assesses the similarity of quantum states based on a lower-dimensional state representation built from measurement data. The network can be trained offline with classically simulated data from a fiducial set of states sharing structural similarities with the states to be tested, or with experimental data generated by measurements on the fiducial states, or with a combination of simulated and experimental data. We test the performance of the model on noisy cat states and states generated by arbitrary selective number-dependent phase gates. Our network can also be applied to the problem of comparing continuous variable states across different experimental platforms, with different sets of achievable measurements, and to the problem of experimentally testing whether two states are equivalent up to Gaussian unitary transformations.

5.1 QUANT-PH Nov 3, 2023 Code

Noise-Agnostic Quantum Error Mitigation with Data Augmented Neural Models

Manwen Liao, Yan Zhu, Giulio Chiribella et al.

Quantum error mitigation, a data processing technique for recovering the statistics of target processes from their noisy version, is a crucial task for near-term quantum technologies. Most existing methods require prior knowledge of the noise model or the noise parameters. Deep neural networks have a potential to lift this requirement, but current models require training data produced by ideal processes in the absence of noise. Here we build a neural model that achieves quantum error mitigation without any prior knowledge of the noise and without training on noise-free data. To achieve this feature, we introduce a quantum augmentation technique for error mitigation. Our approach applies to quantum circuits and to the dynamics of many-body and continuous-variable quantum systems, accommodating various types of noise models. We demonstrate its effectiveness by testing it both on simulated noisy circuits and on real quantum hardware.

16.8 LG Nov 19, 2015 Code

Better Computer Go Player with Neural Network and Long-term Prediction

Yuandong Tian, Yan Zhu

Competing with top human players in the ancient game of Go has been a long-term goal of artificial intelligence. Go's high branching factor makes traditional search techniques ineffective, even on leading-edge hardware, and Go's evaluation function can change drastically with a single stone change. Recent works [Maddison et al. (2015); Clark & Storkey (2015)] show that search is not strictly necessary for machine Go players. A pure pattern-matching approach, based on a Deep Convolutional Neural Network (DCNN) that predicts the next move, can perform as well as Monte Carlo Tree Search (MCTS)-based open source Go engines such as Pachi [Baudis & Gailly (2012)] if its search budget is limited. We extend this idea in our bot named darkforest, which relies on a DCNN designed for long-term predictions. Darkforest substantially improves the win rate for pattern-matching approaches against MCTS-based approaches, even with looser search budgets. Against human players, the newest version, darkfores2, achieves a stable 3d level on the KGS Go Server as a ranked bot, a substantial improvement upon the estimated 4k-5k ranks for DCNN reported in Clark & Storkey (2015) based on games against other machine players. Adding MCTS to darkfores2 creates a much stronger player named darkfmcts3: with 5000 rollouts, it beats Pachi with 10k rollouts in all 250 games; with 75k rollouts it achieves a stable 5d level on the KGS server, on par with state-of-the-art Go AIs (e.g., Zen, DolBaram, CrazyStone) except for AlphaGo [Silver et al. (2016)]; with 110k rollouts, it won 3rd place in the January KGS Go Tournament.

9.6 CV Mar 1, 2024 Code

Spurious Feature Eraser: Stabilizing Test-Time Adaptation for Vision-Language Foundation Model

Huan Ma, Yan Zhu, Changqing Zhang et al.

Vision-language foundation models have exhibited remarkable success across a multitude of downstream tasks due to their scalability on extensive image-text paired data. However, these models also display significant limitations when applied to downstream tasks, such as fine-grained image classification, as a result of "decision shortcuts" that hinder their generalization capabilities. In this work, we find that the CLIP model possesses a rich set of features, encompassing both desired invariant causal features and undesired decision shortcuts. Moreover, the underperformance of CLIP on downstream tasks originates from its inability to effectively utilize pre-trained features in accordance with specific task requirements. To address this challenge, we propose a simple yet effective method, Spurious Feature Eraser (SEraser), to alleviate the decision shortcuts by erasing the spurious features. Specifically, we introduce a test-time prompt tuning paradigm that optimizes a learnable prompt, thereby compelling the model to exploit invariant features while disregarding decision shortcuts during the inference phase. The proposed method effectively alleviates excessive dependence on potentially misleading spurious information. A comparative analysis of the proposed method against various approaches validates its significant superiority.

17.3 LG Feb 28, 2022

RawlsGCN: Towards Rawlsian Difference Principle on Graph Convolutional Network

Jian Kang, Yan Zhu, Yinglong Xia et al.

Graph Convolutional Network (GCN) plays pivotal roles in many real-world applications. Despite the successes of GCN deployment, GCN often exhibits performance disparity with respect to node degrees, resulting in worse predictive accuracy for low-degree nodes. We formulate the problem of mitigating the degree-related performance disparity in GCN from the perspective of the Rawlsian difference principle, which originates from the theory of distributive justice. Mathematically, we aim to balance the utility between low-degree nodes and high-degree nodes while minimizing the task-specific loss. Specifically, we reveal the root cause of this degree-related unfairness by analyzing the gradients of weight matrices in GCN. Guided by the gradients of weight matrices, we further propose a pre-processing method RawlsGCN-Graph and an in-processing method RawlsGCN-Grad that achieve fair predictive accuracy in low-degree nodes without modification of the GCN architecture or introduction of additional parameters. Extensive experiments on real-world graphs demonstrate the effectiveness of our proposed RawlsGCN methods in significantly reducing degree-related bias while retaining comparable overall performance.

10.8 QUANT-PH Feb 14, 2022

Flexible learning of quantum states with generative query neural networks

Yan Zhu, Ya-Dong Wu, Ge Bai et al.

Deep neural networks are a powerful tool for the characterization of quantum states. Existing networks are typically trained with experimental data gathered from the specific quantum state that needs to be characterized. But is it possible to train a neural network offline and to make predictions about quantum states other than the ones used for the training? Here we introduce a model of network that can be trained with classically simulated data from a fiducial set of states and measurements, and can later be used to characterize quantum states that share structural similarities with the states in the fiducial set. With little guidance of quantum physics, the network builds its own data-driven representation of quantum states, and then uses it to predict the outcome statistics of quantum measurements that have not been performed yet. The state representation produced by the network can also be used for tasks beyond the prediction of outcome statistics, including clustering of quantum states and identification of different phases of matter. Our network model provides a flexible approach that can be applied to online learning scenarios, where predictions must be generated as soon as experimental data become available, and to blind learning scenarios where the learner has only access to an encrypted description of the quantum hardware.

6.5 LG Dec 14, 2021

Efficient Dynamic Graph Representation Learning at Scale

Xinshi Chen, Yan Zhu, Haowen Xu et al.

Dynamic graphs with ordered sequences of events between nodes are prevalent in real-world industrial applications such as e-commerce and social platforms. However, representation learning for dynamic graphs has posed great computational challenges due to the time and structure dependency and irregular nature of the data, preventing such models from being deployed to real-world applications. To tackle this challenge, we propose an efficient algorithm, Efficient Dynamic Graph lEarning (EDGE), which selectively expresses certain temporal dependency via training loss to improve the parallelism in computations. We show that EDGE can scale to dynamic graphs with millions of nodes and hundreds of millions of temporal events and achieve new state-of-the-art (SOTA) performance.

0.8 LG Nov 5, 2018

Representation Learning by Reconstructing Neighborhoods

Chin-Chia Michael Yeh, Yan Zhu, Evangelos E. Papalexakis et al.

Since its introduction, unsupervised representation learning has attracted a lot of attention from the research community, as it has been demonstrated to be highly effective and easy to apply in tasks such as dimension reduction, clustering, visualization, information retrieval, and semi-supervised learning. In this work, we propose a novel unsupervised representation learning framework called neighbor-encoder, in which domain knowledge can be easily incorporated into the learning process without modifying the general encoder-decoder architecture of the classic autoencoder. In contrast to an autoencoder, which reconstructs the input data itself, neighbor-encoder reconstructs the input data's neighbors. As the proposed representation learning problem is essentially a neighbor reconstruction problem, domain knowledge can be easily incorporated in the form of an appropriate definition of similarity between objects. Based on that observation, our framework can leverage any off-the-shelf similarity search algorithms or side information to find the neighbor of an input object. Applications of other algorithms (e.g., association rule mining) in our framework are also possible, given that the appropriate definition of neighbor can vary in different contexts. We have demonstrated the effectiveness of our framework in many diverse domains, including images, text, and time series, and for various data mining tasks including classification, clustering, and visualization. Experimental results show that neighbor-encoder not only outperforms autoencoder in most of the scenarios we consider, but also achieves state-of-the-art performance on text document clustering.
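
The difference between the two training objectives described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the Euclidean nearest-neighbor rule, the helper name `nearest_neighbor_indices`, and the random toy data are all assumptions chosen for illustration. The sketch only shows how the (input, target) pairs might differ, with an autoencoder targeting the input itself and a neighbor-encoder targeting the input's neighbor.

```python
import numpy as np


def nearest_neighbor_indices(X):
    """Return, for each row of X, the index of its nearest *other* row.

    Hypothetical helper: plain Euclidean distance stands in for whatever
    similarity definition or search algorithm a given domain would use.
    """
    # Pairwise squared Euclidean distances, shape (n, n).
    diffs = X[:, None, :] - X[None, :, :]
    d2 = np.sum(diffs ** 2, axis=-1)
    # Exclude each point from being its own neighbor.
    np.fill_diagonal(d2, np.inf)
    return np.argmin(d2, axis=1)


# Toy data (assumption): 6 objects with 3 features each.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))

nn_idx = nearest_neighbor_indices(X)

# Autoencoder-style pairs: the target is the input itself.
autoencoder_inputs, autoencoder_targets = X, X

# Neighbor-encoder-style pairs: the target is the input's neighbor.
inputs, targets = X, X[nn_idx]
```

Under these assumptions, swapping the neighbor definition (for example, a domain-specific similarity) changes only `nearest_neighbor_indices`, while the encoder-decoder model trained on `(inputs, targets)` would stay the same.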

28.7 CV Sep 4, 2015

Semantic Amodal Segmentation

Yan Zhu, Yuandong Tian, Dimitris Metaxas et al.

Common visual recognition tasks such as classification, object detection, and semantic segmentation are rapidly reaching maturity, and given the recent rate of progress, it is not unreasonable to conjecture that techniques for many of these problems will approach human levels of performance in the next few years. In this paper we look to the future: what is the next frontier in visual recognition? We offer one possible answer to this question. We propose a detailed image annotation that captures information beyond the visible pixels and requires complex reasoning about full scene structure. Specifically, we create an amodal segmentation of each image: the full extent of each region is marked, not just the visible pixels. Annotators outline and name all salient regions in the image and specify a partial depth order. The result is a rich scene structure, including visible and occluded portions of each region, figure-ground edge information, semantic labels, and object overlap. We create two datasets for semantic amodal segmentation. First, we label 500 images in the BSDS dataset with multiple annotators per image, allowing us to study the statistics of human annotations. We show that the proposed full scene annotation is surprisingly consistent between annotators, including for regions and edges. Second, we annotate 5000 images from COCO. This larger dataset allows us to explore a number of algorithmic ideas for amodal segmentation and depth ordering. We introduce novel metrics for these tasks, and along with our strong baselines, define concrete new challenges for the community.