LG · Apr 25, 2017

Scalable Planning with Tensorflow for Hybrid Nonlinear Domains

arXiv:1704.07511v3 · 35 citations
Originality: Incremental advance
AI Analysis

This paper addresses scalable planning in complex hybrid domains for AI and robotics applications. It is an incremental advance that applies existing deep learning tools to a new domain.

The paper tackles planning in hybrid nonlinear domains with high-dimensional state and action spaces using Tensorflow and RMSProp gradient descent. It shows that this approach is competitive with MILP on piecewise linear domains and outperforms interior point methods on nonlinear domains. Scalability is demonstrated by converging to a strong plan on a large-scale problem with 576,000 continuous action parameters in 4 minutes.

Given recent deep learning results that demonstrate the ability to effectively optimize high-dimensional non-convex functions with gradient descent optimization on GPUs, we ask in this paper whether symbolic gradient optimization tools such as Tensorflow can be effective for planning in hybrid (mixed discrete and continuous) nonlinear domains with high dimensional state and action spaces? To this end, we demonstrate that hybrid planning with Tensorflow and RMSProp gradient descent is competitive with mixed integer linear program (MILP) based optimization on piecewise linear planning domains (where we can compute optimal solutions) and substantially outperforms state-of-the-art interior point methods for nonlinear planning domains. Furthermore, we remark that Tensorflow is highly scalable, converging to a strong plan on a large-scale concurrent domain with a total of 576,000 continuous action parameters distributed over a horizon of 96 time steps and 100 parallel instances in only 4 minutes. We provide a number of insights that clarify such strong performance including observations that despite long horizons, RMSProp avoids both the vanishing and exploding gradient problems. Together these results suggest a new frontier for highly scalable planning in nonlinear hybrid domains by leveraging GPUs and the power of recent advances in gradient descent with highly optimized toolkits like Tensorflow.
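The core idea in the abstract, treating the whole action sequence as parameters and improving it by RMSProp gradient descent through the transition model, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the paper's code: the toy 1D navigation domain, the function names `rollout`, `cost_and_grad`, and `plan`, and all hyperparameters are hypothetical, and a hand-derived gradient in plain Python stands in for Tensorflow's symbolic differentiation.

```python
import math

def rollout(actions, s0):
    """Apply the toy transition s_{t+1} = s_t + a_t; return states s_1..s_H."""
    states, s = [], s0
    for a in actions:
        s = s + a
        states.append(s)
    return states

def cost_and_grad(actions, s0, goal):
    """Sum of squared distances to the goal and its gradient per action."""
    states = rollout(actions, s0)
    cost = sum((s - goal) ** 2 for s in states)
    grad = [0.0] * len(actions)
    running = 0.0
    # a_t shifts every later state equally, so its gradient is a suffix sum
    # of the per-state derivatives 2 * (s_k - goal).
    for t in reversed(range(len(actions))):
        running += 2.0 * (states[t] - goal)
        grad[t] = running
    return cost, grad

def plan(s0, goal, horizon, steps=1000, lr=0.02, decay=0.9, eps=1e-8, a_max=1.0):
    """Optimize an open-loop plan with RMSProp (hypothetical hyperparameters)."""
    actions = [0.0] * horizon
    mean_sq = [0.0] * horizon  # running average of squared gradients
    for _ in range(steps):
        _, grad = cost_and_grad(actions, s0, goal)
        for t in range(horizon):
            mean_sq[t] = decay * mean_sq[t] + (1.0 - decay) * grad[t] ** 2
            actions[t] -= lr * grad[t] / (math.sqrt(mean_sq[t]) + eps)
            # Keep each action inside its bounds by clipping after the update.
            actions[t] = max(-a_max, min(a_max, actions[t]))
    return actions

if __name__ == "__main__":
    best = plan(s0=0.0, goal=3.0, horizon=10)
    final_cost, _ = cost_and_grad(best, 0.0, 3.0)
    print("actions:", [round(a, 2) for a in best])
    print("cost:", round(final_cost, 3))
```

Because RMSProp divides each gradient by a running estimate of its magnitude, the update size stays roughly constant across time steps even though the raw gradients of early actions are much larger than those of late actions. This sketch only hints at that normalizing effect on a short horizon; it does not reproduce the paper's domains or results.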

