Rulebook: bringing co-routines to reinforcement learning environments
Rulebook addresses a bottleneck for RL researchers and developers by simplifying environment implementation, though the contribution is incremental because it builds on existing co-routine concepts.
The paper tackles the high development cost and performance overhead of implementing reinforcement learning environments. It proposes Rulebook, a domain-specific language that automatically generates state machines, enabling larger and more sophisticated environments without performance penalties.
Reinforcement learning (RL) algorithms learn from external systems, so they require digital environments (e.g., simulators) with very simple interfaces, which in turn significantly constrain how such environments can be implemented. In particular, these environments are implemented either as separate processes or as state machines, leading to synchronization and communication overheads in the first case and to unstructured programming in the second. We propose a new domain-specific, co-routine-based, compiled language, called Rulebook, designed to automatically generate the state machine required to interact with machine learning (ML) algorithms and similar applications, with no performance overhead. Rulebook allows users to express programs without needing to be aware of the specific interface required by the ML components. By decoupling the execution model of the program from its syntactic encoding, and thereby removing the need for manual state management, Rulebook makes it possible to create larger and more sophisticated environments at a lower development cost.
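To illustrate the contrast between a hand-written state machine and a co-routine, the following minimal sketch uses Python generators. It is not Rulebook syntax and does not reflect how Rulebook compiles programs. The toy game (pick two numbers whose sum must hit a target) and every name in it (`StateMachineEnv`, `coroutine_env`, `run_state_machine`, `run_coroutine`) are assumptions made for illustration.

```python
from typing import Generator, Sequence


class StateMachineEnv:
    """Hand-written state machine: control flow is encoded in `self.stage`."""

    def __init__(self, target: int = 5) -> None:
        self.target = target
        self.stage = 0  # 0: awaiting first pick, 1: awaiting second pick, 2: done
        self.first = 0
        self.reward = 0.0

    def step(self, action: int) -> bool:
        """Apply one action; return True once the episode has ended."""
        if self.stage == 0:
            self.first = action
            self.stage = 1
            return False
        if self.stage == 1:
            self.reward = 1.0 if self.first + action == self.target else 0.0
            self.stage = 2
            return True
        raise RuntimeError("episode already finished")


def coroutine_env(target: int = 5) -> Generator[str, int, float]:
    """Same rules as straight-line code; each `yield` suspends until an action arrives."""
    first = yield "pick first"
    second = yield "pick second"
    return 1.0 if first + second == target else 0.0


def run_state_machine(actions: Sequence[int], target: int = 5) -> float:
    """Drive the explicit state machine with a fixed action sequence."""
    env = StateMachineEnv(target)
    for action in actions:
        if env.step(action):
            return env.reward
    raise RuntimeError("episode did not finish")


def run_coroutine(actions: Sequence[int], target: int = 5) -> float:
    """Drive the generator with the same action sequence."""
    env = coroutine_env(target)
    next(env)  # advance to the first yield so the generator can receive actions
    try:
        for action in actions:
            env.send(action)
    except StopIteration as stop:
        return stop.value  # the generator's return value is the final reward
    raise RuntimeError("episode did not finish")
```

In the generator version, the position in the game is held implicitly by the suspended function rather than by a manually maintained `stage` field, which is the kind of manual state management the abstract refers to. Both drivers expose the same step-by-step interaction to the caller.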