A Linearly Relaxed Approximate Linear Program for Markov Decision Processes
For researchers working on large-scale MDPs, this offers a principled way to reduce constraint complexity while maintaining performance guarantees.
The paper addresses the intractable number of constraints in approximate linear programming for large Markov Decision Processes by proposing a linearly relaxed ALP with a tractable number of constraints, and provides a novel performance bound for this relaxation.
Approximate linear programming (ALP) and its variants have been widely applied to Markov Decision Processes (MDPs) with a large number of states. A serious limitation of ALP is that it has an intractable number of constraints, as a result of which constraint approximations are of interest. In this paper, we define a linearly relaxed approximation linear program (LRALP) that has a tractable number of constraints, obtained as positive linear combinations of the original constraints of the ALP. The main contribution is a novel performance bound for LRALP.