IVLG · Mar 2, 2020

Warwick Electron Microscopy Datasets

arXiv:2003.01113v4 · 14 citations · Has Code
AI Analysis

This paper provides standardized datasets for the electron microscopy community. Its contribution is incremental: it focuses on data release and on enhancements to dataset visualization.

The authors released three large electron microscopy datasets (19,769 scanning transmission electron micrographs, 17,266 transmission electron micrographs, and 98,340 simulated exit wavefunctions) to standardize benchmarks. To visualize the datasets, they trained variational autoencoders with improvements that include encoding normalization and an image gradient loss.

Large, carefully partitioned datasets are essential to train neural networks and standardize performance benchmarks. As a result, we have set up new repositories to make our electron microscopy datasets available to the wider community. There are three main datasets containing 19769 scanning transmission electron micrographs, 17266 transmission electron micrographs, and 98340 simulated exit wavefunctions, and multiple variants of each dataset for different applications. To visualize image datasets, we trained variational autoencoders to encode data as 64-dimensional multivariate normal distributions, which we cluster in two dimensions by t-distributed stochastic neighbor embedding. In addition, we have improved dataset visualization with variational autoencoders by introducing encoding normalization and regularization, adding an image gradient loss, and extending t-distributed stochastic neighbor embedding to account for encoded standard deviations. Our datasets, source code, pretrained models, and interactive visualizations are openly available at https://github.com/Jeffrey-Ede/datasets.
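As a rough illustration of the ideas named in the abstract, the following is a minimal NumPy sketch of three ingredients: sampling from a 64-dimensional diagonal normal encoding, the standard KL term against a unit normal, and a simple image gradient loss. The function names, the 32x32 image size, and the use of forward finite differences are assumptions made for illustration; the authors' repository may define the gradient loss and encoding normalization differently.

```python
import numpy as np

LATENT_DIM = 64  # encoding size stated in the abstract


def sample_encoding(mu, sigma, rng):
    """Draw one sample from N(mu, diag(sigma^2)) via reparameterization."""
    eps = rng.standard_normal(mu.shape)
    return mu + sigma * eps


def kl_to_standard_normal(mu, sigma):
    """KL(N(mu, diag(sigma^2)) || N(0, I)), summed over latent dimensions."""
    return 0.5 * np.sum(mu**2 + sigma**2 - 1.0 - 2.0 * np.log(sigma), axis=-1)


def gradient_loss(image, reconstruction):
    """Mean squared difference between finite-difference image gradients.

    Assumption: forward differences along each axis. The paper's gradient
    loss may use a different operator.
    """
    dy_err = np.diff(image, axis=0) - np.diff(reconstruction, axis=0)
    dx_err = np.diff(image, axis=1) - np.diff(reconstruction, axis=1)
    return np.mean(dy_err**2) + np.mean(dx_err**2)


rng = np.random.default_rng(0)

# Hypothetical encoder output for one image: mean and standard deviation.
mu = np.zeros(LATENT_DIM)
sigma = np.ones(LATENT_DIM)
z = sample_encoding(mu, sigma, rng)

# Arbitrary 32x32 stand-in for a micrograph; not taken from the datasets.
image = rng.random((32, 32))
```

A gradient loss of this kind ignores constant intensity offsets and penalizes mismatched edges, which is one plausible reason to add it alongside a pixelwise reconstruction loss.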

Code Implementations: 1 repo
