Comments on the Du-Kakade-Wang-Yang Lower Bounds
This work clarifies theoretical inconsistencies in reinforcement learning theory, which is important for researchers in machine learning and AI, but it is incremental as it synthesizes existing results.
The paper addresses conflicting results on the tractability of reinforcement learning with misspecified representations, comparing lower bounds on sample complexity by Du et al. with tractability findings based on the eluder dimension to reconcile interpretations.
Du, Kakade, Wang, and Yang recently established intriguing lower bounds on sample complexity, which suggest that reinforcement learning with a misspecified representation is intractable. Another line of work, which centers around a statistic called the eluder dimension, establishes tractability of problems similar to those considered in the Du-Kakade-Wang-Yang paper. We compare these results and reconcile interpretations.