Rebing Wu

h-index3
2papers

2 Papers

QUANT-PHMay 24, 2019
A gradient algorithm for Hamiltonian identification of open quantum systems

Shibei Xue, Rebing Wu, Dewei Li et al.

In this paper, we present a gradient algorithm for identifying unknown parameters in an open quantum system from the measurements of time traces of local observables. The open system dynamics is described by a general Markovian master equation based on which the Hamiltonian identification problem can be formulated as minimizing the distance between the real time traces of the observables and those predicted by the master equation. The unknown parameters can then be learned with a gradient descent algorithm from the measurement data. We verify the effectiveness of our algorithm in a circuit QED system described by a Jaynes-Cumming model whose Hamiltonian identification has been rarely considered. We also show that our gradient algorithm can learn the spectrum of a non-Markovian environment based on an augmented system model.

QUANT-PHNov 15, 2025
Reinforcement Learning for Charging Optimization of Inhomogeneous Dicke Quantum Batteries

Xiaobin Song, Siyuan Bai, Da-Wei Wang et al.

Charging optimization is a key challenge to the implementation of quantum batteries, particularly under inhomogeneity and partial observability. This paper employs reinforcement learning to optimize piecewise-constant charging policies for an inhomogeneous Dicke battery. We systematically compare policies across four observability regimes, from full-state access to experimentally accessible observables (energies of individual two-level systems (TLSs), first-order averages, and second-order correlations). Simulation results demonstrate that full observability yields near-optimal ergotropy with low variability, while under partial observability, access to only single-TLS energies or energies plus first-order averages lags behind the fully observed baseline. However, augmenting partial observations with second-order correlations recovers most of the gap, reaching 94%-98% of the full-state baseline. The learned schedules are nonmyopic, trading temporary plateaus or declines for superior terminal outcomes. These findings highlight a practical route to effective fast-charging protocols under realistic information constraints.