Learning to generate Reliable Broadcast Algorithms
This work addresses the challenge of automating the development of algorithms for distributed systems, a process that is typically manual and complex, and offers an approach that could streamline it for researchers and engineers.
The paper tackles the manual development of fault-tolerant distributed algorithms by introducing a reinforcement learning agent that automatically generates correct and efficient Reliable Broadcast algorithms, matching the performance of algorithms from the literature after only 12,000 learning episodes.
Modern distributed systems are supported by fault-tolerant algorithms, such as Reliable Broadcast and Consensus, that ensure the correct operation of the system even when some of its nodes fail. However, the development of distributed algorithms is a manual and complex process, resulting in scientific papers that usually present a single algorithm or variations of existing ones. To automate the process of developing such algorithms, this work presents an intelligent agent that uses Reinforcement Learning to generate correct and efficient fault-tolerant distributed algorithms. We show that our approach can generate correct fault-tolerant Reliable Broadcast algorithms with the same performance as others available in the literature, in only 12,000 learning episodes.
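For readers unfamiliar with the target problem, the sketch below illustrates what a Reliable Broadcast algorithm has to achieve. It is not an algorithm produced by the agent described here. It is the classic textbook "eager" scheme under an assumed crash-fault model, in which every node relays a message the first time it sees it, so that all correct nodes deliver the message even if the sender crashes partway through sending. The names `Node`, `eager_reliable_broadcast`, and `sender_reaches` are hypothetical and exist only for this illustration, and the network is simulated with a simple in-memory queue.

```python
from collections import deque


class Node:
    """A simulated process (hypothetical helper for this sketch)."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.crashed = False
        self.delivered = []  # payloads handed to the application
        self.seen = set()    # messages already handled (deduplication)


def eager_reliable_broadcast(nodes, sender_id, payload, sender_reaches=None):
    """Simulate one eager reliable broadcast over an in-memory network.

    If `sender_reaches` is given, the sender contacts only those peers
    and then crashes, which models a failure in the middle of a send.
    """
    network = deque()  # pending (destination, message) pairs
    msg = (sender_id, payload)

    sender = nodes[sender_id]
    sender.seen.add(msg)
    sender.delivered.append(payload)

    targets = [n for n in nodes if n != sender_id]
    if sender_reaches is not None:
        targets = [t for t in targets if t in sender_reaches]
        sender.crashed = True
    for t in targets:
        network.append((t, msg))

    while network:
        dest, m = network.popleft()
        node = nodes[dest]
        if node.crashed or m in node.seen:
            continue  # crashed nodes do nothing; duplicates are ignored
        node.seen.add(m)
        # Relay to every other node before delivering, so the message
        # survives even if the original sender has crashed.
        for peer in nodes:
            if peer != dest:
                network.append((peer, m))
        node.delivered.append(m[1])


# Usage: node 0 reaches only node 1 before crashing.
nodes = {i: Node(i) for i in range(4)}
eager_reliable_broadcast(nodes, sender_id=0, payload="hello", sender_reaches={1})
correct = [n for n in nodes.values() if not n.crashed]
```

In this run node 1 relays the message to nodes 2 and 3, so every correct node delivers it exactly once. The cost is a quadratic number of messages per broadcast, which is the kind of correctness versus efficiency trade-off that an automated generator would need to weigh.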