Denoising ESG: quantifying data uncertainty from missing data with Machine Learning and prediction intervals
This work addresses data uncertainty in ESG ratings, a concern for investors and analysts, though its contribution is incremental: it applies existing methods to a specific domain.
The paper tackles the problem of inconsistent ESG ratings caused by missing data by applying established machine learning techniques for imputation and quantifying the resulting uncertainty through prediction intervals. The results indicate that probabilistic models improve the reliability of ESG ratings by giving a clearer picture of the scores and by making explicit the risks that stem from incomplete data.
Environmental, Social, and Governance (ESG) datasets are frequently plagued by significant data gaps, and differing imputation methods lead to inconsistent ESG ratings. This paper explores the application of established machine learning techniques for imputing missing data in a real-world ESG dataset, emphasizing the quantification of uncertainty through prediction intervals. By employing multiple imputation strategies, the study assesses the robustness of imputation methods and quantifies the uncertainty associated with missing data. The findings highlight the importance of probabilistic machine learning models in providing a better understanding of ESG scores, thereby addressing the inherent risk of erroneous ratings due to incomplete data. The approach aims to improve imputation practice and, in turn, the reliability of ESG ratings.
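To make the idea of imputing a missing value together with a prediction interval concrete, here is a minimal sketch in Python. It is not the paper's method: it swaps in a simple bootstrap around a one-feature linear regression, and the data, the function names (fit_line, impute_with_interval), and the parameters (n_draws, alpha) are all assumptions made for illustration. Each bootstrap draw acts as one plausible imputation, and the spread of the draws gives an approximate interval for the missing value.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def impute_with_interval(xs, ys, x_new, n_draws=500, alpha=0.1, seed=0):
    """Bootstrap imputation of the missing y at x_new (hypothetical helper).

    Each draw refits the line on a resample of the complete rows and adds a
    resampled residual, so the draws reflect both model and noise uncertainty.
    Returns the mean imputation, an approximate (1 - alpha) interval taken
    from the empirical percentiles, and the sorted draws.
    """
    rng = random.Random(seed)
    pairs = list(zip(xs, ys))
    draws = []
    while len(draws) < n_draws:
        sample = [rng.choice(pairs) for _ in pairs]
        sx = [p[0] for p in sample]
        sy = [p[1] for p in sample]
        if len(set(sx)) < 2:
            continue  # degenerate resample: slope is undefined, draw again
        a, b = fit_line(sx, sy)
        residuals = [y - (a + b * x) for x, y in sample]
        draws.append(a + b * x_new + rng.choice(residuals))
    draws.sort()
    lo = draws[int((alpha / 2) * n_draws)]
    hi = draws[int((1 - alpha / 2) * n_draws) - 1]
    return statistics.fmean(draws), (lo, hi), draws


# Synthetic stand-in data (an assumption, not real ESG data):
# x is a fully observed feature, y is an indicator with y = 2 + 0.5 * x + noise.
data_rng = random.Random(42)
xs = [data_rng.uniform(0, 10) for _ in range(40)]
ys = [2.0 + 0.5 * x + data_rng.gauss(0, 1) for x in xs]

# Impute the indicator for a company whose y is missing but whose x is 5.0.
point, (lo, hi), draws = impute_with_interval(xs, ys, x_new=5.0)
print(f"imputed value: {point:.2f}, 90% interval: [{lo:.2f}, {hi:.2f}]")
```

Reporting the interval alongside the imputed value is the point of the exercise: a wide interval signals that any rating built on that imputed value should be treated with caution. A fuller treatment would use a richer model and check the empirical coverage of the intervals on held-out data.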