LG DATA-AN Aug 8, 2025

Comparative study of machine learning and statistical methods for automatic identification and quantification in γ-ray spectrometry

Dinh Triem Phan, Jérôme Bobin, Cheick Thiam, Christophe Bobin

arXiv:2508.08306v14.11 citations h-index: 3 Has Code

Originality Synthesis-oriented

AI Analysis

This work addresses the lack of benchmarks for evaluating methods in γ-ray spectrometry, providing an open-source benchmark for researchers in nuclear science and radiation detection, though it is incremental as it compares existing methods rather than introducing new ones.

The authors compare state-of-the-art end-to-end machine learning with a full-spectrum statistical unmixing approach for automatic identification and quantification in γ-ray spectrometry. The statistical method outperformed machine learning in identification across all three scenarios, but its performance degraded when spectral signatures were poorly modeled. Machine learning offered a viable alternative under uncertain measurement conditions.

During the last decade, a large number of different numerical methods have been proposed to tackle automatic identification and quantification in γ-ray spectrometry. However, the lack of common benchmarks, including datasets, code and comparison metrics, makes their evaluation and comparison hard. In that context, we propose an open-source benchmark that comprises simulated datasets of various γ-spectrometry settings, code for different analysis approaches and evaluation metrics. This allows us to compare the state-of-the-art end-to-end machine learning with a statistical unmixing approach using the full spectrum. Three scenarios have been investigated: (1) spectral signatures are assumed to be known; (2) spectral signatures are deformed due to physical phenomena such as Compton scattering and attenuation; and (3) spectral signatures are shifted (e.g., due to temperature variation). A large dataset of 200,000 simulated spectra containing nine radionuclides with an experimental natural background is used for each scenario, with multiple radionuclides present in the spectrum. Regarding identification performance, the statistical approach consistently outperforms the machine learning approaches across all three scenarios for all comparison metrics. However, the performance of the statistical approach can be significantly impacted when spectral signatures are not modeled correctly. Consequently, the full-spectrum statistical approach is most effective with known or well-modeled spectral signatures, while end-to-end machine learning is a good alternative for radionuclide identification when measurement conditions are uncertain. Concerning the quantification task, the statistical approach provides accurate estimates of radionuclide counting, while the machine learning methods deliver less satisfactory results.
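As an illustration only, full-spectrum statistical unmixing with known spectral signatures (scenario 1) can be sketched as a Poisson maximum-likelihood fit in which the measured spectrum is a non-negative mix of per-radionuclide signatures plus background. The abstract does not specify the estimator, so everything here is an assumption made for the example and not the benchmark's code: the multiplicative MLEM-style update, the Gaussian toy signatures, the flat background, and the hypothetical names `poisson_unmix` and `gaussian_peak`.

```python
import numpy as np

def poisson_unmix(y, X, n_iter=500, eps=1e-12):
    """Estimate non-negative counts a for y ~ Poisson(X @ a).

    Uses multiplicative MLEM-style updates (an assumed estimator,
    not necessarily the one used in the paper).
    y: measured spectrum, shape (n_channels,)
    X: signatures as columns, shape (n_channels, n_sources)
    """
    a = np.full(X.shape[1], y.sum() / X.shape[1])  # flat positive start
    col_sums = X.sum(axis=0)
    for _ in range(n_iter):
        model = X @ a + eps                 # expected counts per channel
        a *= (X.T @ (y / model)) / (col_sums + eps)
    return a

rng = np.random.default_rng(0)
channels = np.arange(256)

def gaussian_peak(center, width=4.0):
    """Toy single-peak signature, normalized to unit area (hypothetical)."""
    s = np.exp(-0.5 * ((channels - center) / width) ** 2)
    return s / s.sum()

# Three toy "radionuclide" signatures plus a flat background column.
X = np.column_stack([
    gaussian_peak(60),
    gaussian_peak(130),
    gaussian_peak(200),
    np.full(channels.size, 1.0 / channels.size),
])

# Second source is absent; last entry is the background level.
true_counts = np.array([5000.0, 0.0, 2000.0, 3000.0])
y = rng.poisson(X @ true_counts)

est = poisson_unmix(y, X)
print(np.round(est, 1))
```

In this toy setting, identification could then be done by thresholding the estimated counts, and quantification reads the counts off directly. A signature that is deformed or shifted relative to the columns of `X` (scenarios 2 and 3) would bias such a fit, which is consistent with the sensitivity to mismodeled signatures reported in the abstract.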
