QM | LG | DATA-AN | MED-PH | Oct 18, 2022

Combination of Raman spectroscopy and chemometrics: A review of recent studies published in the Spectrochimica Acta, Part A: Molecular and Biomolecular Spectroscopy Journal

arXiv:2210.10051v1 | 8 citations | h-index: 15
Originality: Synthesis-oriented
AI Analysis

This review identifies critical methodological issues in papers from a single spectroscopy journal. The contribution is incremental, but it matters to researchers in analytical chemistry and spectroscopy who want more reliable results.

This review analyzed 57 papers applying Raman spectroscopy with chemometrics, finding that about 70% likely contain unsupported or invalid data due to methodological flaws such as insufficient sample sizes and lack of proper validation.

Raman spectroscopy is a promising technique for noninvasive analysis of samples in many fields of application because it can probe the fingerprint of a sample at the molecular level. Chemometric methods are now widely used to better understand the recorded spectral fingerprints of samples and the differences in their chemical composition. This review considers manuscripts published in Spectrochimica Acta, Part A: Molecular and Biomolecular Spectroscopy that report applying Raman spectroscopy in combination with chemometrics to study samples and their changes caused by different factors. In 57 reviewed manuscripts, we analyzed the application of chemometric algorithms, statistical modeling parameters, the use of cross validation, sample sizes, and the performance of the proposed classification and regression models. We summarized the best strategies for creating classification models and highlighted some common drawbacks in the application of chemometric techniques. According to our estimates, about 70% of the papers are likely to contain unsupported or invalid data due to insufficient description of the methods used or drawbacks of the proposed classification models. These drawbacks include: (1) insufficient experimental sample size for classification/regression to achieve significant and reliable results; (2) lack of cross validation (or a test set) for verification of the classifier/regression performance; (3) incorrect division of the spectral data into the training and the test/validation sets; (4) improper selection of the number of principal components (PCs) used to reduce the dimension of the analyzed spectral data.
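Drawbacks (2) and (3) can be illustrated with a small sketch. The abstract does not say what "incorrect division" means in the reviewed papers. The code below assumes one common failure mode: replicate spectra recorded from the same physical sample end up on both sides of a split. The function name `grouped_k_fold`, the `sample_ids` list, and the toy data are hypothetical and are not taken from the review. The sketch uses only the Python standard library and keeps all spectra of a sample in the same fold.

```python
from collections import defaultdict


def grouped_k_fold(sample_ids, k):
    """Yield (train_idx, test_idx); spectra sharing a sample ID stay together."""
    # Collect spectrum indices per physical sample (one hypothetical ID per spectrum).
    by_sample = defaultdict(list)
    for idx, sid in enumerate(sample_ids):
        by_sample[sid].append(idx)
    if k < 2 or k > len(by_sample):
        raise ValueError("k must be between 2 and the number of distinct samples")

    # Place the largest samples first, each into the currently smallest fold,
    # so that fold sizes stay roughly balanced.
    folds = [[] for _ in range(k)]
    for sid in sorted(by_sample, key=lambda s: (-len(by_sample[s]), str(s))):
        smallest = min(range(k), key=lambda f: len(folds[f]))
        folds[smallest].extend(by_sample[sid])

    # Each fold serves once as the held-out set; the rest form the training set.
    for f in range(k):
        test_idx = sorted(folds[f])
        train_idx = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train_idx, test_idx


# Hypothetical data: 4 samples with 3 replicate spectra each.
sample_ids = ["s1", "s1", "s1", "s2", "s2", "s2",
              "s3", "s3", "s3", "s4", "s4", "s4"]
splits = list(grouped_k_fold(sample_ids, k=2))
for train_idx, test_idx in splits:
    train_samples = {sample_ids[i] for i in train_idx}
    test_samples = {sample_ids[i] for i in test_idx}
    print(sorted(train_samples), sorted(test_samples))
```

Under the same assumption, any step fitted to the data, such as choosing the number of PCs, would be fitted on `train_idx` only within each split, so that the held-out spectra do not influence the model they are later used to evaluate.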

