LGAISep 17, 2021

Soft Actor-Critic With Integer Actions

arXiv:2109.08512v216 citations
Originality Incremental advance
AI Analysis

This addresses a challenging high-dimensional integer action problem in industry applications like robotics and power systems, offering an incremental improvement over existing methods.

The paper tackled reinforcement learning with integer actions by integrating Soft Actor-Critic with an integer reparameterization, eliminating the need for one-hot encoding and reducing dimensionality. Experiments showed it performed comparably to continuous action SAC on robot control and outperformed Proximal Policy Optimization on power distribution tasks.

Reinforcement learning is well-studied under discrete actions. Integer actions setting is popular in the industry yet still challenging due to its high dimensionality. To this end, we study reinforcement learning under integer actions by incorporating the Soft Actor-Critic (SAC) algorithm with an integer reparameterization. Our key observation for integer actions is that their discrete structure can be simplified using their comparability property. Hence, the proposed integer reparameterization does not need one-hot encoding and is of low dimensionality. Experiments show that the proposed SAC under integer actions is as good as the continuous action version on robot control tasks and outperforms Proximal Policy Optimization on power distribution systems control tasks.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes