NE DC LG | Aug 27, 2020

CLAN: Continuous Learning using Asynchronous Neuroevolution on Commodity Edge Devices

Parth Mannan, Ananda Samajdar, Tushar Krishna

arXiv:2008.11881v1 | 4.43 citations

Originality: Incremental advance

AI Analysis

This work addresses bandwidth, privacy, and connectivity issues for autonomous agents on edge devices, though it is an incremental improvement on existing distributed learning methods.

The paper tackles the problem of enabling continuous learning on edge devices without cloud interaction. It builds a prototype distributed system of Raspberry Pis running NeuroEvolutionary learning, and proposes algorithmic modifications that reduce communication during the learning phase by up to 3.6x, helping the system match the performance of higher-end devices at scale.

Recent advancements in machine learning algorithms, especially the development of Deep Neural Networks (DNNs), have transformed the landscape of Artificial Intelligence (AI). With every passing day, deep learning based methods are applied to solve new problems with exceptional results. The portal to the real world is the edge. The true impact of AI can only be fully realized if we can have AI agents continuously interacting with the real world and solving everyday problems. Unfortunately, the high compute and memory requirements of DNNs act as a huge barrier towards this vision. Today we circumvent this problem by deploying special-purpose inference hardware on the edge while procuring trained models from the cloud. This approach, however, relies on constant interaction with the cloud for transmitting all the data, training on massive GPU clusters, and downloading updated models. This is challenging given the bandwidth, privacy, and constant-connectivity concerns that autonomous agents may face. In this paper we evaluate techniques for enabling adaptive intelligence on edge devices with zero interaction with any high-end cloud/server. We build a prototype distributed system of Raspberry Pis communicating via WiFi and running NeuroEvolutionary (NE) learning and inference. We evaluate the performance of such a collaborative system and detail the compute/communication characteristics of different arrangements of the system that trade off parallelism against communication. Using insights from our analysis, we also propose algorithmic modifications that reduce communication by up to 3.6x during the learning phase, to enhance scalability even further and match the performance of higher-end computing devices at scale. We believe that these insights will enable algorithm-hardware co-design efforts for enabling continuous learning on the edge.
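To make the parallelism-versus-communication trade-off in the abstract concrete, here is a minimal sketch of a generic evolutionary loop in which a coordinator farms fitness evaluations out to simulated workers and tallies the traffic. It is not the paper's algorithm or its system: the function names, the toy fitness function, the round-robin assignment, and the "floats sent" cost model are all assumptions made for illustration.

```python
import random


def fitness(genome, target):
    # Toy objective (assumption): negative squared distance to a target
    # weight vector. Higher is better.
    return -sum((g - t) ** 2 for g, t in zip(genome, target))


def mutate(genome, rng, sigma=0.1):
    # Gaussian perturbation of every weight.
    return [g + rng.gauss(0.0, sigma) for g in genome]


def evolve(num_workers=4, pop_size=16, genome_len=8, generations=30, seed=0):
    rng = random.Random(seed)
    target = [rng.uniform(-1, 1) for _ in range(genome_len)]
    population = [
        [rng.uniform(-1, 1) for _ in range(genome_len)] for _ in range(pop_size)
    ]
    traffic = [0] * num_workers  # floats exchanged with each simulated worker
    best_history = []

    for _ in range(generations):
        scored = []
        for i, genome in enumerate(population):
            worker = i % num_workers  # round-robin assignment (assumption)
            # Hypothetical cost model: the genome goes out, one fitness
            # value comes back.
            traffic[worker] += genome_len + 1
            scored.append((fitness(genome, target), genome))

        scored.sort(key=lambda pair: pair[0], reverse=True)
        best_history.append(scored[0][0])

        # Elitism: keep the top quarter unchanged, refill with mutated copies.
        parents = [genome for _, genome in scored[: pop_size // 4]]
        children = [
            mutate(rng.choice(parents), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children

    return best_history, traffic


if __name__ == "__main__":
    history, traffic = evolve()
    print("best fitness, first vs. last generation:", history[0], history[-1])
    print("floats exchanged per worker:", traffic)
```

In this toy model every genome crosses the network once per generation, so traffic grows with population size, genome length, and generation count; the communication reductions the abstract refers to target overheads of this kind, though the paper's actual techniques are not reproduced here.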
