QUANT-PH · May 26, 2022
Mitigating barren plateaus of variational quantum eigensolvers · Xia Liu, Geng Liu, Jiaxin Huang et al.
Variational quantum algorithms (VQAs) are expected to establish valuable applications on near-term quantum computers. However, recent works have pointed out that the performance of VQAs relies heavily on the expressibility of the ansatzes and is seriously limited by optimization issues such as barren plateaus (i.e., vanishing gradients). This work proposes the state efficient ansatz (SEA) for accurate ground state preparation with improved trainability. We show that the SEA can generate an arbitrary pure state with far fewer parameters than a universal ansatz, making it efficient for tasks like ground state estimation. Then, we prove that barren plateaus can be efficiently mitigated by the SEA and that the trainability can be further improved, at most quadratically, by flexibly adjusting the entangling capability of the SEA. Finally, we investigate a plethora of examples in ground state estimation, where we obtain significant improvements in the magnitude of the cost gradient and the convergence speed.
SE · Aug 7, 2024
RepoMasterEval: Evaluating Code Completion via Real-World Repositories · Qinyun Wu, Chao Peng, Pengfei Gao et al.
With the growing reliance on automated code completion tools in software development, the need for comprehensive evaluation benchmarks has become critical. Existing benchmarks focus mostly on code completion at the function and class level, prompting the model with text descriptions. By contrast, such descriptive prompts are commonly unavailable in real development, and code completion can occur in a wider range of situations, such as in the middle of a function or a code block. These limitations make existing evaluation benchmarks align poorly with the practical scenarios of code completion tools. In this paper, we propose RepoMasterEval, a novel benchmark for evaluating code completion models constructed from real-world repositories. Each benchmark datum is generated by masking a code snippet (ground truth) from one source code file with existing test suites. To improve the test accuracy of model-generated code, we employ mutation testing to measure the effectiveness of the test cases, and we manually craft new test cases for those test suites with a low mutation score. Our empirical evaluation on 10 state-of-the-art models shows that test augmentation is critical in improving the accuracy of the benchmark and that RepoMasterEval is able to report variance in model performance in real-world scenarios. The deployment of RepoMasterEval also revealed that the benchmark gives accurate feedback during model training and that its score correlates highly with the model's performance in practice.
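The mutation-testing step mentioned in this abstract can be illustrated with a minimal sketch. This is not the RepoMasterEval pipeline: the function under test (absolute value), the three hand-written mutants, and the names `is_killed`, `mutation_score`, `WEAK_SUITE`, and `STRONG_SUITE` are all hypothetical, chosen only to show how a mutation score is computed and why a low score signals that a test suite needs more cases.

```python
from typing import Callable, List, Tuple

# A test case is an (input, expected output) pair for the function under test.
TestCase = Tuple[int, int]

# Hypothetical mutants of an absolute-value function. Each one contains a
# deliberately injected bug that a good test suite should detect.
MUTANTS: List[Callable[[int], int]] = [
    lambda x: x,                       # negation branch removed
    lambda x: -x,                      # always negates
    lambda x: x if x >= 0 else -x + 1, # off by one on negative inputs
]

def is_killed(mutant: Callable[[int], int], suite: List[TestCase]) -> bool:
    """A mutant is killed if at least one test case fails on it."""
    return any(mutant(x) != expected for x, expected in suite)

def mutation_score(mutants: List[Callable[[int], int]],
                   suite: List[TestCase]) -> float:
    """Fraction of mutants killed by the test suite."""
    killed = sum(is_killed(m, suite) for m in mutants)
    return killed / len(mutants)

# A weak suite exercises only a positive input; the stronger suite adds a
# negative input, mimicking a manually crafted extra test case.
WEAK_SUITE: List[TestCase] = [(3, 3)]
STRONG_SUITE: List[TestCase] = [(3, 3), (-4, 4)]

if __name__ == "__main__":
    print("weak suite score:  ", mutation_score(MUTANTS, WEAK_SUITE))
    print("strong suite score:", mutation_score(MUTANTS, STRONG_SUITE))
```

In this toy setting the weak suite leaves two of the three mutants alive, and adding a single negative-input case kills all of them, which is the kind of gap the abstract's manually crafted tests are meant to close.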
SE · Jul 31, 2025
Trae Agent: An LLM-based Agent for Software Engineering with Test-time Scaling · Trae Research Team, Pengfei Gao, Zhao Tian et al. · pku
Software issue resolution is a critical challenge in software engineering and has garnered increasing attention in recent years. With the rapid advancement of large language models (LLMs), substantial progress has been made in addressing real-world software engineering tasks. Recent studies have introduced ensemble reasoning techniques to enhance the performance of LLM-based issue resolution. However, existing prompting-based methods still face limitations in effectively exploring large ensemble spaces and lack the capacity for repository-level understanding, both of which constrain their overall effectiveness. In this paper, we propose Trae Agent, the first agent-based ensemble reasoning approach for repository-level issue resolution. Trae Agent formulates our goal as an optimal solution search problem and addresses two key challenges, i.e., large ensemble spaces and repository-level understanding, through modular agents for generation, pruning, and selection. We conduct extensive experiments using three leading LLMs on the widely-adopted SWE-bench benchmark, comparing Trae Agent against four state-of-the-art ensemble reasoning techniques. Experimental results demonstrate that Trae Agent consistently achieves superior performance, with an average improvement of 10.22% over all baselines in terms of Pass@1. Trae Agent has achieved first place on the SWE-bench Verified leaderboard, with a notable Pass@1 score of 75.20%. We are pleased to release Trae Agent as an open-source project to support the research community, with all resources available at https://github.com/bytedance/trae-agent.
CL · Aug 16, 2025
Learning Wisdom from Errors: Promoting LLM's Continual Relation Learning through Exploiting Error Cases · Shaozhe Yin, Jinyu Guo, Kai Shuang et al.
Continual Relation Extraction (CRE) aims to continually learn newly emerging relations while avoiding catastrophic forgetting. Existing CRE methods mainly use memory replay and contrastive learning to mitigate catastrophic forgetting. However, these methods do not attach importance to the error cases, which can reveal the model's cognitive biases more effectively. To address this issue, we propose an instruction-based continual contrastive tuning approach for Large Language Models (LLMs) in CRE. Unlike existing CRE methods, which typically handle the training and memory data in a unified manner, this approach splits the training and memory data of each task into two parts according to the correctness of the initial responses and treats them differently through dual-task fine-tuning. In addition, leveraging the instruction-following ability of LLMs, we propose a novel instruction-based contrastive tuning strategy that lets the LLM continuously correct its current cognitive biases with the guidance of previous data in an instruction-tuning manner, which narrows the gap between old and new relations in a way better suited to LLMs. We experimentally evaluate our model on TACRED and FewRel, and the results show that our model achieves new state-of-the-art CRE performance with significant improvements, demonstrating the importance of specializing in exploiting error cases.
CR · Sep 26, 2021
Quantum Identity-Based Encryption from the Learning with Errors Problem · Wenhua Gao, Li Yang, DaoDe Zhang et al.
To prevent eavesdropping and tampering, network security protocols use a handshake with an asymmetric cipher to establish a session-specific shared key, with which further communication is encrypted using a symmetric cipher. The commonly used asymmetric algorithms include public key encryption, key exchange, and identity-based encryption (IBE). However, network security protocols based on classical identity-based encryption do not have perfect forward security. To solve this problem, we construct the first quantum IBE (QIBE) scheme based on the learning with errors problem and prove that our scheme is fully secure in the random oracle model. Moreover, we construct the quantum circuit of our QIBE scheme and estimate the quantum resources of the circuit, including the numbers of Hadamard, phase, T, and CNOT gates and the total number of qubits used, and conclude that the quantum resources required by our scheme increase linearly with the number of bits of the encrypted quantum plaintext. Our scheme exhibits the following advantages: (i) The classical key generation center (KGC) system can still be used for our QIBE scheme to generate and distribute the secret identity keys, which reduces the cost of implementing the scheme. The classical KGC can be used because the public and private keys are in the form of classical bits. (ii) Network security protocols using a handshake with our QIBE scheme can provide perfect forward security. In our scheme, the ciphertext is transmitted in the form of a quantum state that is unknown to the adversary and therefore cannot be copied and stored. Thus, in network security protocols based on our QIBE construction, the adversary cannot decrypt previous quantum ciphertexts to compromise previous session keys, even if the identity secret key is compromised.
IT · Jan 13, 2020
Approximation smooth and sparse functions by deep neural networks without saturation · Xia Liu
Constructing neural networks for function approximation is a classical and longstanding topic in approximation theory. In this paper, we aim at constructing deep neural networks (deep nets for short) with three hidden layers to approximate smooth and sparse functions. In particular, we prove that the constructed deep nets can reach the optimal approximation rate in approximating both smooth and sparse functions with controllable magnitude of free parameters. Since saturation, which describes the bottleneck of approximation, is an insurmountable problem of constructive neural networks, we also prove that deepening the neural network with only one more hidden layer can avoid the saturation. The obtained results underline the advantages of deep nets and provide theoretical explanations for deep learning.
LG · Apr 20, 2016
Greedy Criterion in Orthogonal Greedy Learning · Lin Xu, Shaobo Lin, Jinshan Zeng et al.
Orthogonal greedy learning (OGL) is a stepwise learning scheme that starts with selecting a new atom from a specified dictionary via the steepest gradient descent (SGD) and then builds the estimator through orthogonal projection. In this paper, we find that SGD is not the only greedy criterion and introduce a new greedy criterion, called the "$δ$-greedy threshold", for learning. Based on the new greedy criterion, we derive an adaptive termination rule for OGL. Our theoretical study shows that the new learning scheme can achieve the existing (almost) optimal learning rate of OGL. Extensive numerical experiments support the claim that the new scheme can achieve almost optimal generalization performance while requiring less computation than OGL.
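The two steps the abstract attributes to classical OGL (greedy atom selection, then orthogonal projection) can be sketched as follows. This is only an illustration of the baseline scheme under stated assumptions: the dictionary is a plain matrix with atoms as columns, the selection rule is the largest normalized correlation with the residual, and the number of steps is fixed in advance. The "$δ$-greedy threshold" criterion and the adaptive termination rule proposed in the paper are not implemented here, since the abstract does not define them; the name `orthogonal_greedy` is hypothetical.

```python
import numpy as np

def orthogonal_greedy(D: np.ndarray, y: np.ndarray, n_steps: int):
    """Baseline orthogonal greedy scheme.

    D: (m, n) dictionary whose columns are atoms.
    y: (m,) target vector.
    Returns the selected atom indices, their coefficients, and the residual.
    """
    selected = []
    coef = np.zeros(0)
    residual = y.astype(float).copy()
    atom_norms = np.linalg.norm(D, axis=0)
    for _ in range(n_steps):
        # Greedy step: pick the atom most correlated with the current residual.
        scores = np.abs(D.T @ residual) / atom_norms
        if selected:
            scores[selected] = -np.inf  # never pick the same atom twice
        selected.append(int(np.argmax(scores)))
        # Projection step: least-squares fit of y on all selected atoms,
        # i.e. the orthogonal projection of y onto their span.
        coef, *_ = np.linalg.lstsq(D[:, selected], y, rcond=None)
        residual = y - D[:, selected] @ coef
    return selected, coef, residual

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Orthonormal atoms keep the toy example easy to reason about.
    D, _ = np.linalg.qr(rng.standard_normal((30, 10)))
    y = 2.0 * D[:, 3] - 1.5 * D[:, 7]
    idx, coef, res = orthogonal_greedy(D, y, n_steps=2)
    print("selected atoms:", idx)
    print("residual norm: ", np.linalg.norm(res))
```

A fixed `n_steps` is the simplest stopping rule; the adaptive termination rule described in the abstract would replace it.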
LG · Jan 24, 2014
Is Extreme Learning Machine Feasible? A Theoretical Assessment (Part II) · Shaobo Lin, Xia Liu, Jian Fang et al.
An extreme learning machine (ELM) can be regarded as a two-stage feed-forward neural network (FNN) learning system that randomly assigns the connections with and within hidden neurons in the first stage and tunes the connections with output neurons in the second stage. Therefore, ELM training is essentially a linear learning problem, which significantly reduces the computational burden. Numerous applications show that this reduction in computational burden does not degrade the generalization capability. It has, however, remained open whether this holds in theory. The aim of our work is to study the theoretical feasibility of ELM by analyzing its pros and cons. In the previous part on this topic, we pointed out that, via appropriate selection of the activation function, ELM does not degrade the generalization capability in the expectation sense. In this paper, we launch the study in a different direction and show that the randomness of ELM also leads to certain negative consequences. On the one hand, we find that the randomness causes an additional uncertainty problem for ELM, both in approximation and in learning. On the other hand, we theoretically justify that there also exists an activation function such that the corresponding ELM degrades the generalization capability. In particular, we prove that the generalization capability of ELM with Gaussian kernel is essentially worse than that of FNN with Gaussian kernel. To facilitate the use of ELM, we also provide a remedy for this degradation. We find that the well-developed coefficient regularization technique can essentially improve the generalization capability. The obtained results reveal the essential characteristics of ELM and give theoretical guidance on how to use ELM.
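The two-stage structure described in this abstract can be sketched in a few lines. This is a minimal illustration, not the construction analyzed in the paper: the tanh activation, the standard normal distribution of the random weights, and the ridge ($\ell_2$) penalty used as a stand-in for "coefficient regularization" are all assumptions, and the names `elm_fit` and `elm_predict` are hypothetical.

```python
import numpy as np

def elm_fit(X: np.ndarray, y: np.ndarray, n_hidden: int = 50,
            reg: float = 1e-3, seed: int = 0):
    """Fit a single-hidden-layer ELM.

    Stage 1: draw hidden-layer weights and biases at random and freeze them.
    Stage 2: solve a regularized linear least-squares problem for the
             output weights only.
    """
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))  # random, never trained
    b = rng.standard_normal(n_hidden)                # random, never trained
    H = np.tanh(X @ W + b)                           # hidden-layer features
    # Ridge-regularized normal equations (assumed form of the penalty).
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ y)
    return W, b, beta

def elm_predict(X: np.ndarray, W: np.ndarray, b: np.ndarray,
                beta: np.ndarray) -> np.ndarray:
    """Apply the frozen random features and the learned output weights."""
    return np.tanh(X @ W + b) @ beta

if __name__ == "__main__":
    X = np.linspace(-3.0, 3.0, 200).reshape(-1, 1)
    y = np.sin(X).ravel()
    W, b, beta = elm_fit(X, y)
    mse = float(np.mean((elm_predict(X, W, b, beta) - y) ** 2))
    print("training MSE:", mse)
```

Because only `beta` is learned, training reduces to one linear solve, which is the computational saving the abstract refers to. Changing `seed` changes the random features and hence the fit, which is a small-scale view of the uncertainty issue the paper studies.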