Experimenting with Emerging RISC-V Systems for Decentralised Machine Learning
This work addresses the limited experimentation capabilities available to DML researchers and developers, particularly for novel processors and network topologies. The contribution is incremental in that it builds on existing tools and methods.
The authors tackle the lack of flexibility and portability in tools for Decentralised Machine Learning (DML) by developing a domain-specific language that maps DML schemes to the FastFlow library. This enables experimentation on x86-64, ARM, and RISC-V platforms. Results include performance and energy-efficiency characterisations and what is, to the authors' knowledge, the first publicly available RISC-V port of PyTorch.
Decentralised Machine Learning (DML) enables collaborative machine learning without centralising input data. Federated Learning (FL) and Edge Inference are examples of DML. While tools for DML (especially FL) are starting to flourish, many are not flexible and portable enough to experiment with novel processors (e.g., RISC-V), non-fully connected network topologies, and asynchronous collaboration schemes. We overcome these limitations via a domain-specific language that allows us to map DML schemes to an underlying middleware, namely the FastFlow parallel programming library. We experiment with it by generating different working DML schemes on x86-64 and ARM platforms and on an emerging RISC-V one. We characterise the performance and energy efficiency of the presented schemes and systems. As a byproduct, we introduce a RISC-V port of the PyTorch framework, the first publicly available to our knowledge.
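
To make the notion of a DML scheme concrete, the following is a minimal sketch of the aggregation step in a master-worker Federated Learning round, written in plain Python. It is not the domain-specific language or the FastFlow mapping described above; the function name `federated_average` and its parameters are hypothetical, and the weighting by local dataset size is an assumption borrowed from the common federated-averaging formulation.

```python
from typing import List, Sequence


def federated_average(
    client_weights: Sequence[Sequence[float]],
    client_sizes: Sequence[int],
) -> List[float]:
    """Aggregate per-client parameter vectors into one global vector.

    Each client trains on its own private data and sends only its model
    parameters; the aggregator never sees the input data. Clients are
    weighted by the number of local samples (an assumption of this sketch).
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one size per client and at least one client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")
    dim = len(client_weights[0])
    if any(len(w) != dim for w in client_weights):
        raise ValueError("all parameter vectors must have the same length")

    # Weighted mean, computed coordinate by coordinate.
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


if __name__ == "__main__":
    # Two hypothetical clients holding 1 and 3 local samples respectively.
    print(federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
```

In a non-fully connected or asynchronous scheme, the same averaging would be applied by each node over whichever neighbours' parameters it has received, rather than by a single central aggregator.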