Analyzing Near-Infrared Hyperspectral Imaging for Protein Content Regression and Grain Variety Classification Using Bulk References and Varying Grain-to-Background Ratios
This work offers researchers in agricultural analysis incremental improvements in bias mitigation and robustness for hyperspectral imaging applications.
The study tackled protein content regression and grain variety classification using NIR-HSI images. It found that adjusting for biases introduced by bulk reference data improved mean protein predictions, that higher grain-to-background ratios increased accuracy, and that including lower-ratio images in calibration enhanced model robustness.
Building on previous work, we assess the use of near-infrared hyperspectral imaging (NIR-HSI) images for calibrating models on two datasets, focusing on protein content regression and grain variety classification. Limited reference data for protein content are expanded by subsampling and associating each subsample with the reference value of the bulk sample. However, this method introduces significant biases due to skewed, leptokurtic prediction distributions, affecting both partial least squares regression (PLS-R) and deep convolutional neural network (CNN) models. We propose adjustments to mitigate these biases, improving predictions of the mean protein reference. Additionally, we investigate the impact of grain-to-background ratios on both tasks. Higher ratios yield more accurate predictions, but including lower-ratio images in calibration makes the models more robust to such images.
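As an illustration of the two ideas above, the following minimal Python sketch draws random crops from a hyperspectral cube, labels each crop with the bulk sample's protein value, and estimates how much of each crop is grain. It is not the procedure used in this work. The function names, the crop size, the intensity threshold, the synthetic cube, and the definition of the ratio as a simple pixel fraction are all assumptions made for the example.

```python
import numpy as np


def subsample_with_bulk_reference(cube, bulk_protein, crop_size, n_crops, rng):
    """Draw random spatial crops from a (height, width, bands) cube.

    Every crop inherits the bulk sample's protein value as its label,
    even though the grains visible in a crop may differ from the bulk mean.
    (Hypothetical helper, not taken from the paper.)
    """
    height, width, _ = cube.shape
    crops, labels = [], []
    for _ in range(n_crops):
        # Upper bound is exclusive, so the crop always fits inside the cube.
        top = rng.integers(0, height - crop_size + 1)
        left = rng.integers(0, width - crop_size + 1)
        crops.append(cube[top:top + crop_size, left:left + crop_size, :])
        labels.append(bulk_protein)
    return np.stack(crops), np.array(labels)


def grain_fraction(crop, threshold):
    """Fraction of pixels whose band-averaged intensity exceeds a threshold.

    Assumption: grain pixels are brighter than background pixels. The
    paper's definition of grain-to-background ratio may differ.
    """
    mean_intensity = crop.mean(axis=-1)
    return float((mean_intensity > threshold).mean())


rng = np.random.default_rng(0)
cube = rng.random((64, 64, 10))  # synthetic stand-in for an NIR-HSI image
crops, labels = subsample_with_bulk_reference(
    cube, bulk_protein=12.5, crop_size=16, n_crops=8, rng=rng
)
fractions = [grain_fraction(crop, threshold=0.5) for crop in crops]
```

In a sketch like this, `fractions` could be used to group crops by grain coverage, so that calibration sets with different mixes of high-ratio and low-ratio images can be compared.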