Dark solitons in Bose-Einstein condensates: a dataset for many-body physics research
The dataset gives data scientists and physicists a resource for developing analysis tools and advancing research in nonlinear many-body physics and cold-atom experiments. The contribution is incremental, since it consists primarily of a new dataset.
The authors created a dataset of over 16,000 experimental images of Bose-Einstein condensates with solitonic excitations to facilitate machine learning applications in many-body physics, with about 33% manually labeled and the rest automatically labeled using a physics-informed ML framework.
We establish a dataset of over $1.6\times10^4$ experimental images of Bose--Einstein condensates containing solitonic excitations to enable machine learning (ML) for many-body physics research. About $33~\%$ of this dataset has manually assigned and carefully curated labels. The remainder is automatically labeled using SolDet -- an implementation of a physics-informed ML data analysis framework -- consisting of a convolutional-neural-network-based classifier and object detector (OD) as well as a statistically motivated physics-informed classifier and a quality metric. This technical note constitutes the definitive reference for the dataset, providing an opportunity for the data science community to develop more sophisticated analysis tools, to further understand nonlinear many-body physics, and even to advance cold atom experiments.
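The figures quoted above imply a rough split between manually and automatically labeled images. The short sketch below works through that arithmetic. It assumes the stated lower bound of $1.6\times10^4$ images and treats "about $33~\%$" as exactly 0.33, so the resulting counts are estimates, not the dataset's actual totals. The variable names are illustrative only.

```python
# Rough split implied by the abstract's figures. Both inputs are approximate:
# the text says "over 1.6e4" images and "about 33 %" manually labeled.
total_images = 16_000      # assumed: the stated lower bound on the image count
manual_fraction = 0.33     # assumed: "about 33 %" taken as exactly 0.33

manual = round(total_images * manual_fraction)  # estimated manually labeled images
automatic = total_images - manual               # estimated automatically labeled images

print(f"manually labeled (estimate):      {manual}")
print(f"automatically labeled (estimate): {automatic}")
```

Under these assumptions, roughly one third of the images carry curated labels and the remaining two thirds carry labels assigned by SolDet.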