LG · May 30

A Comparative Analysis of Machine Learning Algorithms for Multi-Task Prediction of the Parameters of the Pectin Hydrolysis--Extraction Process

arXiv:2606.00821 · 1.0
Predicted impact: top 90% in LG · last 90 days · Originality: Synthesis-oriented
AI Analysis

For the pectin production industry, this work provides a production-ready predictive model that reduces the need for physical experiments, though the approach is incremental as it applies existing methods to a specific domain.

This study compared 11 machine learning algorithms for multi-task prediction of pectin hydrolysis-extraction parameters, finding CatBoost achieved the highest average R-squared of approximately 0.946 after hyperparameter optimization, with raw material type being the most important feature.
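The "average R-squared" figure can be read as a per-output coefficient of determination averaged over the four outputs. The sketch below assumes an unweighted mean across targets, which the summary does not state explicitly; the function names and the numbers are made up for illustration and are not taken from the paper.

```python
from statistics import mean


def r_squared(y_true, y_pred):
    """Coefficient of determination for a single target."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def average_r_squared(true_by_target, pred_by_target):
    """Per-target R^2 plus their unweighted mean (assumed averaging rule)."""
    scores = {
        name: r_squared(true_by_target[name], pred_by_target[name])
        for name in true_by_target
    }
    return scores, mean(list(scores.values()))


# Illustrative values only; two of the four outputs are shown for brevity.
true_by_target = {
    "pectin_yield": [10.0, 12.0, 14.0, 16.0],
    "degree_of_esterification": [60.0, 65.0, 70.0, 75.0],
}
pred_by_target = {
    "pectin_yield": [10.0, 12.0, 14.0, 16.0],              # perfect fit
    "degree_of_esterification": [62.5, 62.5, 72.5, 72.5],  # imperfect fit
}

per_target, avg_r2 = average_r_squared(true_by_target, pred_by_target)
print(per_target, avg_r2)
```

An unweighted mean treats every output as equally important regardless of its units or variance, which is one reason per-target scores are worth reporting next to the average.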

This study addresses the challenge of controlling a complex, multi-parameter technological process -- pectin hydrolysis--extraction -- using machine learning methods. The experimental foundation is a unique database comprising 1,000 laboratory experiments conducted under controlled conditions on seven types of plant raw material with four variable process factors (temperature 85--130 °C, pressure 0.9--2.2 atm, holding time 3--10 min, pH 1.5--2.0). Four output characteristics were recorded: pectin yield, galacturonic acid content, molecular weight, and degree of esterification. To solve the multi-task regression problem, 11 algorithms were trained and compared: regularised linear models, ensemble methods (Random Forest, Gradient Boosting, XGBoost, CatBoost, Extra Trees), k-nearest neighbours, support vector regression, and a multilayer perceptron. The best results were demonstrated by CatBoost (average R-squared approximately 0.946 after hyperparameter optimisation). Feature importance analysis revealed the dominant role of the raw material type (63.6% of total importance), followed by temperature and holding time. The developed pipeline was exported in a production-ready format and deployed as an interactive web interface. The findings demonstrate that ensemble methods combined with rigorous statistical analysis and interpretable AI significantly reduce the need for physical experiments and form the basis for intelligent pectin production control.
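Because the model is fitted on a bounded experimental region, a query outside the stated factor ranges is an extrapolation. The following minimal sketch records the four ranges quoted in the abstract and flags out-of-range inputs before any prediction is requested. The class, field, and function names are hypothetical, and the raw material label is a placeholder because the abstract does not name the seven types.

```python
from dataclasses import dataclass

# Inclusive factor ranges as quoted in the abstract: (low, high).
FACTOR_RANGES = {
    "temperature_c": (85.0, 130.0),
    "pressure_atm": (0.9, 2.2),
    "holding_time_min": (3.0, 10.0),
    "ph": (1.5, 2.0),
}


@dataclass(frozen=True)
class ProcessConditions:
    """One candidate operating point (hypothetical container)."""
    raw_material: str        # placeholder label; real type names are not given
    temperature_c: float
    pressure_atm: float
    holding_time_min: float
    ph: float


def out_of_range_factors(conditions):
    """Return the names of factors lying outside the studied ranges."""
    return [
        name
        for name, (low, high) in FACTOR_RANGES.items()
        if not low <= getattr(conditions, name) <= high
    ]


inside = ProcessConditions("material_a", 100.0, 1.5, 5.0, 1.8)
outside = ProcessConditions("material_a", 140.0, 1.5, 5.0, 2.5)

print(out_of_range_factors(inside))
print(out_of_range_factors(outside))
```

A check of this kind would sit in front of whatever predictor is deployed; it does not depend on the choice of algorithm.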

