Soft Actor-Critic for Discrete Action Settings
This work addresses a practical limitation for reinforcement learning practitioners working with discrete action environments, though the contribution is incremental because it adapts an existing algorithm.
The paper tackles the problem of applying the Soft Actor-Critic algorithm, which was originally designed for continuous actions, to discrete action settings. It shows that the derived version, even without hyperparameter tuning, is competitive with tuned model-free state-of-the-art methods on a selection of Atari games.
Soft Actor-Critic is a state-of-the-art reinforcement learning algorithm for continuous action settings that is not applicable to discrete action settings. Many important settings involve discrete actions, however, and so here we derive an alternative version of the Soft Actor-Critic algorithm that is applicable to discrete action settings. We then show that, even without any hyperparameter tuning, it is competitive with the tuned model-free state-of-the-art on a selection of games from the Atari suite.
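The abstract does not spell out the derivation, so the following is only a minimal sketch of one ingredient that a discrete-action variant can exploit: with a finite action set, the policy can be a categorical distribution, and the entropy-regularized (soft) state value can be computed as an exact sum over actions rather than estimated from sampled actions. The function names, the temperature symbol alpha, and the example numbers are illustrative assumptions, not taken from the paper.

```python
import math


def softmax(logits):
    """Turn raw policy logits into action probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_state_value(logits, q_values, alpha):
    """Exact soft value: sum_a pi(a|s) * (Q(s,a) - alpha * log pi(a|s)).

    logits   -- hypothetical policy outputs, one per discrete action
    q_values -- hypothetical critic outputs Q(s, a), one per discrete action
    alpha    -- entropy temperature (assumed fixed here)
    """
    probs = softmax(logits)
    return sum(
        p * (q - alpha * math.log(p))
        for p, q in zip(probs, q_values)
        if p > 0.0  # skip actions whose probability underflowed to zero
    )


def policy_loss(logits, q_values, alpha):
    """Per-state policy objective to minimize, with the Q-values held fixed:
    sum_a pi(a|s) * (alpha * log pi(a|s) - Q(s,a))."""
    return -soft_state_value(logits, q_values, alpha)


if __name__ == "__main__":
    # Uniform policy over two actions with made-up Q-values.
    print(soft_state_value([0.0, 0.0], [1.0, 3.0], alpha=0.5))
```

For the uniform two-action policy in the usage lines, the value is the mean Q-value (2.0) plus alpha times the policy entropy (0.5 · ln 2), which makes the role of the entropy bonus easy to check by hand.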