Xuefei Wu

IT
h-index: 7
3 papers
4 citations
Novelty: 48%
AI Score: 41

3 Papers

83.7 IT Apr 13
Generalized Roth-Lempel Codes: NMDS Characterization, Hermitian Self-Orthogonality, and Quantum Constructions

Qi Liu, Xuefei Wu, Haiyan Zhou

In their seminal 1989 work (IEEE Trans. Inf. Theory 35(3):655-657), Roth and Lempel constructed a well-known family of non-Reed-Solomon maximum distance separable (MDS) codes. For decades, this family of codes has attracted extensive research attention due to its algebraic structure, low-complexity decoding, and broad applications in cryptography and data storage. Most recently, in 2025, the generalized Roth-Lempel (GRL) framework unified Roth-Lempel codes and their extensions under a flexible algebraic structure. However, explicit criteria for the near-MDS (NMDS) property of GRL codes have not been established, and no systematic construction of Hermitian self-orthogonal GRL codes has been reported, limiting their deployment in classical and quantum error correction. In this work, we make three contributions to address these gaps. First, we give explicit necessary and sufficient conditions for the NMDS property of the two most widely used subclasses of GRL codes. Second, we construct four new families of Hermitian self-orthogonal codes from GRL codes. Two of these families are NMDS, with parameters not covered by existing Hermitian self-orthogonal NMDS codes. Third, based on the proposed Hermitian self-orthogonal GRL codes, we construct four families of quantum GRL codes, including two infinite families of quantum NMDS codes that attain the quantum Singleton bound minus one. Many of the resulting quantum error-correcting codes are new or improve on the parameters of known ones. This work bridges the gap between classical GRL code families and quantum error-correction applications.
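For reference, the parameter conditions named in this abstract can be written out as below. These are the standard textbook definitions and the standard Hermitian construction, given as background rather than taken from the paper itself; $n$, $k$, $d$ denote length, dimension, and minimum distance, $d^{\perp}$ is the minimum distance of the dual code, and $C^{\perp_H}$ is the Hermitian dual of $C$.

$$
\begin{aligned}
&\text{Singleton bound, } [n,k,d]_q: && d \le n-k+1 \quad (\text{MDS: } d = n-k+1),\\
&\text{NMDS:} && d = n-k \ \text{ and } \ d^{\perp} = k,\\
&\text{Quantum Singleton bound, } [[n,k,d]]_q: && 2d \le n-k+2,\\
&\text{One below the quantum bound:} && 2d = n-k,\\
&\text{Hermitian construction:} && C \subseteq C^{\perp_H},\ C \text{ an } [n,k]_{q^2} \text{ code} \ \Rightarrow\ [[\,n,\ n-2k,\ \ge d(C^{\perp_H})\,]]_q .
\end{aligned}
$$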

LG Jul 25, 2025
Learning Individual Intrinsic Reward in Multi-Agent Reinforcement Learning via Incorporating Generalized Human Expertise

Xuefei Wu, Xiao Yin, Yuanyang Zhu et al.

Efficient exploration in multi-agent reinforcement learning (MARL) is a challenging problem when receiving only a team reward, especially in environments with sparse rewards. A powerful method to mitigate this issue involves crafting dense individual rewards to guide the agents toward efficient exploration. However, individual rewards generally rely on manually engineered shaping-reward functions that lack high-order intelligence, so agents trained with them learn and generalize less effectively than humans in complex problems. To tackle these issues, we combine the above two paradigms and propose a novel framework, LIGHT (Learning Individual Intrinsic reward via Incorporating Generalized Human experTise), which can integrate human knowledge into MARL algorithms in an end-to-end manner. LIGHT guides each agent to avoid unnecessary exploration by considering both individual action distribution and human expertise preference distribution. Then, LIGHT designs individual intrinsic rewards for each agent based on actionable representational transformation relevant to Q-learning so that the agents align their action preferences with the human expertise while maximizing the joint action value. Experimental results demonstrate that our method outperforms representative baselines and offers better knowledge reusability across different sparse-reward tasks in challenging scenarios.

MA Oct 23, 2025
High-order Interactions Modeling for Interpretable Multi-Agent Q-Learning

Qinyu Xu, Yuanyang Zhu, Xuefei Wu et al.

The ability to model interactions among agents is crucial for effective coordination and understanding their cooperation mechanisms in multi-agent reinforcement learning (MARL). However, previous efforts to model high-order interactions have been primarily hindered by combinatorial explosion or the opaque nature of their black-box network structures. In this paper, we propose a novel value decomposition framework, called Continued Fraction Q-Learning (QCoFr), which can flexibly capture arbitrary-order agent interactions with only linear complexity $\mathcal{O}(n)$ in the number of agents, thus avoiding the combinatorial explosion when modeling rich cooperation. Furthermore, we introduce the variational information bottleneck to extract latent information for estimating credits. This latent information helps agents filter out noisy interactions, thereby significantly enhancing both cooperation and interpretability. Extensive experiments demonstrate that QCoFr not only consistently achieves better performance but also provides interpretability that aligns with our theoretical analysis.
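The abstract does not spell out QCoFr's architecture, so the following is only a generic illustration of why a continued fraction built from linear terms can be evaluated cheaply. It assumes, hypothetically, that each rung of the fraction is a linear function of the per-agent values; the names `linear_term`, `continued_fraction`, and `mix` are invented for this sketch and are not the paper's API. Each rung costs one pass over the n agents, while the nested reciprocals make the result a nonlinear (rational) function of all agents' values.

```python
from typing import Sequence, Tuple


def linear_term(weights: Sequence[float], bias: float,
                agent_qs: Sequence[float]) -> float:
    """One rung of the fraction: a linear function of the per-agent values.

    Costs O(n) for n agents.
    """
    return bias + sum(w * q for w, q in zip(weights, agent_qs))


def continued_fraction(terms: Sequence[float], eps: float = 1e-6) -> float:
    """Evaluate t0 + 1/(t1 + 1/(t2 + ...)) from the innermost term outward."""
    value = terms[-1]
    for t in reversed(terms[:-1]):
        # Clamp tiny denominators to avoid division by zero
        # (a safeguard chosen for this sketch, not taken from the paper).
        if abs(value) < eps:
            value = eps if value >= 0 else -eps
        value = t + 1.0 / value
    return value


def mix(agent_qs: Sequence[float],
        layers: Sequence[Tuple[Sequence[float], float]]) -> float:
    """Combine per-agent values through a continued fraction of linear rungs.

    `layers` holds one (weights, bias) pair per rung (hypothetical layout).
    """
    terms = [linear_term(w, b, agent_qs) for w, b in layers]
    return continued_fraction(terms)


if __name__ == "__main__":
    qs = [1.0, 2.0]
    layers = [([1.0, 1.0], 0.0), ([0.5, 0.5], 0.5)]
    print(mix(qs, layers))
```

With a fixed number of rungs, the total work grows linearly in the number of agents, which is the kind of scaling the abstract refers to; how the actual method parameterizes and trains the rungs is left to the paper.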