QM · CV · Apr 1, 2016

Large-Scale Electron Microscopy Image Segmentation in Spark

arXiv:1604.00385v1 · 15 citations
Originality: Incremental advance
AI Analysis

This work helps connectomics researchers automate neuronal tracing by offering a scalable way to process terabytes of image data. The advance is incremental: it builds on existing segmentation methods and focuses on robustness and efficiency.

The paper tackles the segmentation of large-scale electron microscopy (EM) images for connectomics. It proposes a strategy for datasets that exceed the capacity of a single machine, implements it as a Spark application that minimizes disk I/O, and demonstrates its effectiveness and scalability on a few large EM datasets.

The emerging field of connectomics aims to unlock the mysteries of the brain by understanding the connectivity between neurons. To map this connectivity, we acquire thousands of electron microscopy (EM) images with nanometer-scale resolution. After aligning these images, the resulting dataset has the potential to reveal the shapes of neurons and the synaptic connections between them. However, imaging the brain of even a tiny organism like the fruit fly yields terabytes of data. It can take years of manual effort to examine such image volumes and trace their neuronal connections. One solution is to apply image segmentation algorithms to help automate the tracing tasks. In this paper, we propose a novel strategy to apply such segmentation on very large datasets that exceed the capacity of a single machine. Our solution is robust to potential segmentation errors which could otherwise severely compromise the quality of the overall segmentation, for example those due to poor classifier generalizability or anomalies in the image dataset. We implement our algorithms in a Spark application which minimizes disk I/O, and apply them to a few large EM datasets, revealing both their effectiveness and scalability. We hope this work will encourage external contributions to EM segmentation by providing 1) a flexible plugin architecture that deploys easily on different cluster environments and 2) an in-memory representation of segmentation that could be conducive to new advances.

Code Implementations: 1 repo
