AI · May 27, 2022

GALOIS: Boosting Deep Reinforcement Learning via Generalizable Logic Synthesis

arXiv:2205.13728v1 · 23 citations · h-index: 24
Originality: Incremental advance
AI Analysis

This work addresses the need for more interpretable and generalizable AI in complex decision-making, though it appears incremental because it combines existing paradigms.

The paper tackles the lack of high-order intelligence, such as logic deduction, in deep reinforcement learning. It proposes GALOIS, a framework that synthesizes hierarchical, strict cause-effect logic programs, and reports superior asymptotic performance, generalizability, and knowledge reusability across various decision-making tasks.

Despite achieving superior performance in human-level control problems, deep reinforcement learning (DRL), unlike humans, lacks high-order intelligence (e.g., logic deduction and reuse), and thus learns and generalizes less effectively than humans in complex problems. Previous works attempt to directly synthesize a white-box logic program as the DRL policy, manifesting logic-driven behaviors. However, most synthesis methods are built on either imperative or declarative programming, and each has a distinct limitation. The former ignores cause-effect logic during synthesis, resulting in low generalizability across tasks. The latter is strictly proof-based and thus fails to synthesize programs with complex hierarchical logic. In this paper, we combine the two paradigms and propose a novel Generalizable Logic Synthesis (GALOIS) framework to synthesize hierarchical and strict cause-effect logic programs. GALOIS leverages the program sketch and defines a new sketch-based hybrid program language for guiding the synthesis. Based on that, GALOIS proposes a sketch-based program synthesis method to automatically generate white-box programs with generalizable and interpretable cause-effect logic. Extensive evaluations on various decision-making tasks with complex logic demonstrate the superiority of GALOIS over mainstream baselines regarding asymptotic performance, generalizability, and knowledge reusability across different environments.
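
To make the general idea of a program sketch concrete, here is a toy illustration in Python. It is not the GALOIS language or its synthesis method, which the abstract does not specify. The key-and-door domain, the predicate and action names, and the brute-force enumeration are all assumptions chosen for illustration. The sketch fixes the hierarchical control structure of the policy and leaves holes (`??`) for a search procedure to fill so that the program agrees with a few demonstrations.

```python
from itertools import product

# Hypothetical toy domain (an assumption, not from the paper): an agent needs
# a key before it can open a door. Each example pairs a state with the
# desired action.
EXAMPLES = [
    ({"has_key": False, "at_door": False}, "goto_key"),
    ({"has_key": False, "at_door": True}, "goto_key"),
    ({"has_key": True, "at_door": False}, "goto_door"),
    ({"has_key": True, "at_door": True}, "open_door"),
]

PREDICATES = ["has_key", "at_door"]
ACTIONS = ["goto_key", "goto_door", "open_door"]


def run_sketch(holes, state):
    """Evaluate the sketch with its holes filled in.

    Sketch: if ??p1 then (if ??p2 then ??a1 else ??a2) else ??a3
    """
    p1, p2, a1, a2, a3 = holes
    if state[p1]:
        return a1 if state[p2] else a2
    return a3


def synthesize(examples):
    """Enumerate hole assignments and return the first one that is
    consistent with every example, or None if no assignment fits."""
    for holes in product(PREDICATES, PREDICATES, ACTIONS, ACTIONS, ACTIONS):
        if all(run_sketch(holes, state) == action for state, action in examples):
            return holes
    return None


program = synthesize(EXAMPLES)
print(program)
```

The result is a white-box program that can be read directly: if the agent has the key, it opens the door when it is at the door and otherwise walks to it, and without the key it fetches the key first. Exhaustive enumeration only works because this search space has 108 candidates. A practical synthesizer would need a guided search.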

