Chunkflow: Distributed Hybrid Cloud Processing of Large 3D Images by Convolutional Nets
This work addresses the problem of processing teravoxel- to petavoxel-scale images in biomedical research. The contribution is incremental: it builds on existing ConvNet methods and focuses on distributing their processing.
The authors address the challenge of applying 3D Convolutional Networks to large 3D biomedical images by introducing Chunkflow, a distributed software framework that runs on local and cloud GPUs and CPUs. Its fault-tolerant design reduces cost by allowing the use of cheap, unstable cloud instances.
It is now common to process volumetric biomedical images using 3D Convolutional Networks (ConvNets). This can be challenging for the teravoxel and even petavoxel images that are being acquired today by light or electron microscopy. Here we introduce Chunkflow, a software framework for distributing ConvNet processing over local and cloud GPUs and CPUs. The image volume is divided into overlapping chunks, each chunk is processed by a ConvNet, and the results are blended together to yield the output image. The frontend submits ConvNet tasks to a cloud queue. The tasks are executed by local and cloud GPUs and CPUs. Thanks to the fault-tolerant architecture of Chunkflow, cost can be greatly reduced by utilizing cheap unstable cloud instances. Chunkflow currently supports PyTorch for GPUs and PZnet for CPUs. To illustrate its usage, a large 3D brain image from serial section electron microscopy was processed by a 3D ConvNet with a U-Net style architecture. Chunkflow provides some chunk operations for general use, and the operations can be composed flexibly in a command line interface.
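The chunk-and-blend scheme described above (divide the volume into overlapping chunks, process each chunk, blend the results) can be illustrated with a minimal NumPy sketch. This is not Chunkflow's API: the function names `chunk_starts`, `bump_weight`, and `process_in_chunks` are hypothetical, the triangular blending weight is an assumed choice, and `process_chunk` is a stand-in for ConvNet inference that is assumed to return an array of the same shape as its input.

```python
import itertools
import numpy as np

def chunk_starts(size, chunk, overlap):
    """Start offsets along one axis so that overlapping chunks cover [0, size)."""
    assert 0 <= overlap < chunk, "overlap must be smaller than the chunk size"
    if chunk >= size:
        return [0]
    stride = chunk - overlap
    starts = list(range(0, size - chunk, stride))
    starts.append(size - chunk)  # last chunk sits flush with the end of the axis
    return starts

def bump_weight(shape):
    """Separable triangular weight: largest at the chunk center, smallest at its faces.

    The triangular shape is an assumption made for this sketch.
    """
    w = np.ones(shape, dtype=np.float64)
    for axis, n in enumerate(shape):
        ramp = np.minimum(np.arange(1, n + 1), np.arange(n, 0, -1)).astype(np.float64)
        view = [1] * len(shape)
        view[axis] = n
        w *= ramp.reshape(view)
    return w

def process_in_chunks(volume, process_chunk, chunk_shape, overlap):
    """Apply process_chunk to overlapping chunks and blend by weighted averaging."""
    output = np.zeros(volume.shape, dtype=np.float64)
    weight_sum = np.zeros(volume.shape, dtype=np.float64)
    per_axis = [chunk_starts(s, c, o)
                for s, c, o in zip(volume.shape, chunk_shape, overlap)]
    for corner in itertools.product(*per_axis):
        sl = tuple(slice(a, a + c) for a, c in zip(corner, chunk_shape))
        chunk = volume[sl]
        result = process_chunk(chunk)   # stand-in for ConvNet inference
        w = bump_weight(chunk.shape)
        output[sl] += w * result        # accumulate weighted results
        weight_sum[sl] += w             # accumulate weights for normalization
    # Every voxel is covered by at least one chunk with positive weight.
    return output / weight_sum

# Usage: with an identity "network", blending reproduces the input volume.
rng = np.random.default_rng(0)
volume = rng.random((16, 20, 24))
blended = process_in_chunks(volume, lambda c: c,
                            chunk_shape=(8, 8, 8), overlap=(2, 2, 2))
```

Because each chunk is processed independently, the body of the loop is the natural unit of work to submit as a task to a queue; in this sketch the loop simply runs serially in one process.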