Model-Free Learning for the Linear Quadratic Regulator over Rate-Limited Channels
This work addresses communication constraints in networked control systems and connects model-free control with compressed optimization. The contribution is incremental, however, since it builds on existing policy gradient and quantization methods.
The paper studies model-free linear quadratic regulator (LQR) control when policy gradients are transmitted over a rate-limited channel. It proposes an adaptive quantization algorithm that converges exponentially fast to the optimal policy and, once the channel bit-rate exceeds a finite threshold, matches the convergence exponent of the unquantized setting.
Consider a linear quadratic regulator (LQR) problem being solved in a model-free manner using the policy gradient approach. If the gradient of the quadratic cost is being transmitted across a rate-limited channel, both the convergence and the rate of convergence of the resulting controller may be affected by the bit-rate permitted by the channel. We first pose this problem in a communication-constrained optimization framework and propose a new adaptive quantization algorithm titled Adaptively Quantized Gradient Descent (AQGD). This algorithm guarantees exponentially fast convergence to the globally optimal policy, with no deterioration of the exponent relative to the unquantized setting, above a certain finite threshold bit-rate allowed by the communication channel. We then propose a variant of AQGD that provides similar performance guarantees when applied to solve the model-free LQR problem. Our approach reveals the benefits of adaptive quantization in preserving fast linear convergence rates, and, as such, may be of independent interest to the literature on compressed optimization. Our work also marks a first step towards a more general bridge between the fields of model-free control design and networked control systems.
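To make the idea of adaptive quantization concrete, the following is a minimal sketch on a toy strongly convex quadratic. It is not the paper's AQGD algorithm, whose details are not given in this summary. The sketch assumes a uniform per-coordinate quantizer, an encoder and decoder that share a running gradient estimate, and a quantization range that both sides recompute from shared information so that it shrinks as the iterates converge. The names quantize, quantized_gd, g_hat, and radius are hypothetical and chosen only for illustration.

```python
import numpy as np


def quantize(v, radius, bits):
    """Uniform midpoint quantizer on [-radius, radius], 2**bits cells per coordinate."""
    if radius <= 0.0:
        return np.zeros_like(v)
    levels = 2 ** bits
    width = 2.0 * radius / levels
    idx = np.floor((np.clip(v, -radius, radius) + radius) / width)
    idx = np.clip(idx, 0, levels - 1)  # v == radius belongs to the last cell
    return -radius + (idx + 0.5) * width


def quantized_gd(A, b, x0, bits, steps):
    """Gradient descent on f(x) = 0.5 x^T A x - b^T x using quantized innovations.

    Illustrative assumption: only the quantized difference between the true
    gradient and the shared estimate g_hat crosses the channel each step.
    """
    L = np.linalg.eigvalsh(A).max()      # smoothness constant of f
    alpha = 1.0 / L                      # step size
    d = x0.size
    x = x0.astype(float).copy()
    g_hat = np.zeros(d)                  # estimate kept by encoder and decoder
    radius = np.linalg.norm(A @ x - b)   # assumed known to both sides at start
    for _ in range(steps):
        grad = A @ x - b
        innovation = grad - g_hat        # what the encoder actually quantizes
        g_hat = g_hat + quantize(innovation, radius, bits)
        x = x - alpha * g_hat
        # Adaptive range: bounds the next innovation, since
        # ||grad_next - grad|| <= ||g_hat|| when alpha = 1/L, and the current
        # quantization error is at most sqrt(d) * radius / 2**bits.
        radius = np.linalg.norm(g_hat) + np.sqrt(d) * radius / 2 ** bits
    return x


A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])
x_star = np.linalg.solve(A, b)
x_quant = quantized_gd(A, b, np.array([5.0, -3.0]), bits=8, steps=60)
print(np.linalg.norm(x_quant - x_star))
```

Because the range is recomputed from quantities both sides already hold, no extra bits are spent on communicating it, and the quantization error shrinks at the same geometric pace as the gradient. In this toy setting, 8 bits per coordinate keep the contraction factor close to the unquantized value of 1 - mu/L, which loosely mirrors the threshold behavior described in the abstract but does not reproduce the paper's exact guarantee.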