DC · CV · DB · Oct 24, 2016

Savu: A Python-based, MPI Framework for Simultaneous Processing of Multiple, N-dimensional, Large Tomography Datasets

arXiv:1610.08015v1 · 67 citations
Originality: Incremental advance
AI Analysis

Savu addresses data-processing bottlenecks for scientists at synchrotron facilities with a flexible, portable pipeline. The advance is incremental, since it builds on existing modular and parallel processing concepts.

The paper tackles the challenge of processing large, multi-dimensional tomography datasets at synchrotron facilities. It introduces Savu, a Python-based MPI framework that processes data in serial on a PC or in parallel across a cluster. Savu has been successfully deployed at Diamond Light Source, where over 3000 users per year generate vast volumes of data.

Diamond Light Source (DLS), the UK synchrotron facility, attracts scientists from across the world to perform ground-breaking x-ray experiments. With over 3000 scientific users per year, vast amounts of data are collected across the experimental beamlines, with the highest volume of data collected during tomographic imaging experiments. A growing interest in tomography as an imaging technique has led to an expansion in the range of experiments performed, as well as growth in the size of the data per experiment. Savu is a portable, flexible, scientific processing pipeline capable of processing multiple, n-dimensional datasets in serial on a PC, or in parallel across a cluster. Developed at DLS and successfully deployed across the beamlines, it uses a modular plugin format to enable experiment-specific processing and utilises parallel HDF5 to remove RAM restrictions. The Savu design, described throughout this paper, focuses on easy integration of existing and new functionality, flexibility, and ease of use for users and developers alike.
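
The modular plugin idea in the abstract can be illustrated with a small, purely hypothetical sketch. It is not Savu's API: the names `Plugin`, `DarkFieldSubtract`, `Scale`, `chunks` and `run_chain` are invented here, plain Python lists stand in for HDF5 datasets, and the `rank` and `n_procs` arguments only imitate how chunks of frames might be shared out among MPI processes. It also assumes every plugin treats frames independently, which is what lets the same chain run in serial or be split across workers.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterator, List, Sequence

Frame = List[float]  # toy stand-in for one slice of an n-dimensional dataset


class Plugin(ABC):
    """Hypothetical plugin interface: one processing step applied per frame."""

    @abstractmethod
    def process_frame(self, frame: Frame) -> Frame:
        ...


class DarkFieldSubtract(Plugin):
    """Subtract a constant offset from every value in a frame."""

    def __init__(self, offset: float) -> None:
        self.offset = offset

    def process_frame(self, frame: Frame) -> Frame:
        return [v - self.offset for v in frame]


class Scale(Plugin):
    """Multiply every value in a frame by a constant factor."""

    def __init__(self, factor: float) -> None:
        self.factor = factor

    def process_frame(self, frame: Frame) -> Frame:
        return [v * self.factor for v in frame]


def chunks(n_frames: int, chunk_size: int) -> Iterator[range]:
    """Yield consecutive ranges of frame indices, chunk_size frames at a time."""
    for start in range(0, n_frames, chunk_size):
        yield range(start, min(start + chunk_size, n_frames))


def run_chain(
    data: Sequence[Frame],
    plugins: Sequence[Plugin],
    chunk_size: int = 2,
    rank: int = 0,
    n_procs: int = 1,
) -> Dict[int, Frame]:
    """Run the plugin chain over the chunks assigned to this (simulated) rank.

    Chunks are dealt out round-robin, so rank r handles chunks r, r + n_procs, ...
    With the defaults (rank=0, n_procs=1) this is plain serial processing.
    """
    result: Dict[int, Frame] = {}
    for i, chunk in enumerate(chunks(len(data), chunk_size)):
        if i % n_procs != rank:
            continue  # this chunk belongs to another worker
        for idx in chunk:
            frame = data[idx]
            for plugin in plugins:  # apply the processing steps in order
                frame = plugin.process_frame(frame)
            result[idx] = frame
    return result


if __name__ == "__main__":
    data = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    chain = [DarkFieldSubtract(1.0), Scale(2.0)]
    print(run_chain(data, chain))
```

In this sketch, swapping or reordering the entries of `chain` changes the processing without touching `run_chain`, which is the kind of experiment-specific flexibility a plugin format is meant to give. A real deployment would read and write chunks through parallel HDF5 rather than hold whole datasets in memory.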

