ExioML: Eco-economic dataset for Machine Learning in Global Sectoral Sustainability
ExioML provides a foundational dataset that lowers barriers and fosters collaboration between machine learning and ecological economics researchers, supporting climate action and sustainable investment decisions.
The paper addresses the lack of machine learning datasets for sustainability analysis by introducing ExioML, the first benchmark dataset for ecological economics, and demonstrates its usability through a greenhouse gas emission regression task in which deep and ensemble models achieved low mean squared error.
Environmentally Extended Multi-Regional Input-Output (EE-MRIO) analysis is the predominant framework in Ecological Economics for assessing the environmental impact of economic activities. This paper introduces ExioML, the first Machine Learning benchmark dataset designed for sustainability analysis, which aims to lower barriers and foster collaboration between Machine Learning and Ecological Economics research. To evaluate sectoral sustainability and demonstrate the usability of the dataset, we conducted a greenhouse gas emission regression task. Using the Factor Accounting table, which combines categorical and numerical features, we compared the performance of traditional shallow models with that of deep learning models. Our findings show that ExioML enables deep and ensemble models to achieve low mean squared error, establishing a baseline for future Machine Learning research. Through ExioML, we aim to build a foundational dataset that supports a range of Machine Learning applications and promotes climate action and sustainable investment decisions.
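For readers less familiar with input-output analysis, the following is a minimal sketch of the textbook calculation that underlies EE-MRIO: total output is obtained from final demand through the Leontief inverse, and emissions follow by multiplying output by an emission intensity per sector. It is an illustration only and does not describe how ExioML itself was built. The function names, the two-sector economy, and all numbers are hypothetical.

```python
def leontief_inverse_2x2(A):
    """Return (I - A)^-1 for a 2x2 technical-coefficient matrix A."""
    a, b = 1.0 - A[0][0], -A[0][1]
    c, d = -A[1][0], 1.0 - A[1][1]
    det = a * d - b * c
    if det == 0:
        raise ValueError("I - A is singular; no Leontief inverse exists")
    # Closed-form inverse of a 2x2 matrix.
    return [[d / det, -b / det], [-c / det, a / det]]


def emission_footprint(A, intensity, final_demand):
    """Emissions per sector: intensity_i * x_i, where x = (I - A)^-1 y."""
    L = leontief_inverse_2x2(A)
    # Total output needed to satisfy final demand, including intermediate use.
    x = [L[i][0] * final_demand[0] + L[i][1] * final_demand[1] for i in range(2)]
    return [intensity[i] * x[i] for i in range(2)]


if __name__ == "__main__":
    # Hypothetical toy economy (values chosen only for illustration).
    A = [[0.2, 0.3], [0.1, 0.4]]   # inputs required per unit of output
    intensity = [0.5, 2.0]         # emissions per unit of output
    final_demand = [90.0, 45.0]    # final consumption by sector
    print(emission_footprint(A, intensity, final_demand))
    # approximately [75.0, 200.0]
```

In this toy case the total outputs are 150 and 100, which can be checked against the balance x = Ax + y. A regression task such as the one described above would instead learn sectoral emissions from observed features rather than computing them from a known coefficient matrix.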