Edward A. Lee

h-index78

4papers

44citations

Novelty56%

AI Score34

Ranked #114,903 of 194,257 authors (top 59%)#539 in DC (top 55%)

4 Papers

2.3DCDec 7, 2023Code

Efficient Parallel Reinforcement Learning Framework using the Reactor Model

Jacky Kwok, Marten Lohstroh, Edward A. Lee

Parallel Reinforcement Learning (RL) frameworks are essential for mapping RL workloads to multiple computational resources, allowing for faster generation of samples, estimation of values, and policy improvement. These computational paradigms require a seamless integration of training, serving, and simulation workloads. Existing frameworks, such as Ray, are not managing this orchestration efficiently, especially in RL tasks that demand intensive input/output and synchronization between actors on a single node. In this study, we have proposed a solution implementing the reactor model, which enforces a set of actors to have a fixed communication pattern. This allows the scheduler to eliminate work needed for synchronization, such as acquiring and releasing locks for each actor or sending and processing coordination-related messages. Our framework, Lingua Franca (LF), a coordination language based on the reactor model, also supports true parallelism in Python and provides a unified interface that allows users to automatically generate dataflow graphs for RL tasks. In comparison to Ray on a single-node multi-core compute platform, LF achieves 1.21x and 11.62x higher simulation throughput in OpenAI Gym and Atari environments, reduces the average training time of synchronized parallel Q-learning by 31.2%, and accelerates multi-agent RL inference by 5.12x.

2.2RODec 2, 2024

HPRM: High-Performance Robotic Middleware for Intelligent Autonomous Systems

Jacky Kwok, Shulu Li, Marten Lohstroh et al.

The rise of intelligent autonomous systems, especially in robotics and autonomous agents, has created a critical need for robust communication middleware that can ensure real-time processing of extensive sensor data. Current robotics middleware like Robot Operating System (ROS) 2 faces challenges with nondeterminism and high communication latency when dealing with large data across multiple subscribers on a multi-core compute platform. To address these issues, we present High-Performance Robotic Middleware (HPRM), built on top of the deterministic coordination language Lingua Franca (LF). HPRM employs optimizations including an in-memory object store for efficient zero-copy transfer of large payloads, adaptive serialization to minimize serialization overhead, and an eager protocol with real-time sockets to reduce handshake latency. Benchmarks show HPRM achieves up to 173x lower latency than ROS2 when broadcasting large messages to multiple nodes. We then demonstrate the benefits of HPRM by integrating it with the CARLA simulator and running reinforcement learning agents along with object detection workloads. In the CARLA autonomous driving application, HPRM attains 91.1% lower latency than ROS2. The deterministic coordination semantics of HPRM, combined with its optimized IPC mechanisms, enable efficient and predictable real-time communication for intelligent autonomous systems.

15.5LOJul 20, 2018Code

Learning Heuristics for Quantified Boolean Formulas through Deep Reinforcement Learning

Gil Lederman, Markus N. Rabe, Edward A. Lee et al.

We demonstrate how to learn efficient heuristics for automated reasoning algorithms for quantified Boolean formulas through deep reinforcement learning. We focus on a backtracking search algorithm, which can already solve formulas of impressive size - up to hundreds of thousands of variables. The main challenge is to find a representation of these formulas that lends itself to making predictions in a scalable way. For a family of challenging problems, we learned a heuristic that solves significantly more formulas compared to the existing handwritten heuristics.

4.1SEJul 14, 2013

Numerical LTL Synthesis for Cyber-Physical Systems

Chih-Hong Cheng, Edward A. Lee

Cyber-physical systems (CPS) are systems that interact with the physical world via sensors and actuators. In such a system, the reading of a sensor represents measures of a physical quantity, and sensor values are often reals ranged over bounded intervals. The implementation of control laws is based on nonlinear numerical computations over the received sensor values. Synthesizing controllers fulfilling features within CPS brings a huge challenge to the research community in formal methods, as most of the works in automatic controller synthesis (LTL synthesis) are restricted to specifications having a few discrete inputs within the Boolean domain. In this report, we present a novel approach that addresses the above challenge to synthesize controllers for CPS. Our core methodology, called numerical LTL synthesis, extends LTL synthesis by using inputs or outputs in real numbers and by allowing predicates of polynomial constraints to be defined within an LTL formula as specification. The synthesis algorithm is based on an interplay between an LTL synthesis engine which handles the pseudo-Boolean structure, together with a nonlinear constraint validity checker which tests the (in)feasibility of a (counter-)strategy. The methodology is integrated within the CPS research framework Ptolemy II via the development of an LTL synthesis module G4LTL and a validity checker JBernstein. Although we only target the theory of nonlinear real arithmetic, the use of pseudo-Boolean synthesis framework also allows an easy extension to embed a richer set of theories, making the technique applicable to a much broader audience.