Leveraging Language Models to Detect Greenwashing
This work addresses the issue of unregulated greenwashing in corporate communications and offers a proof-of-concept tool for stakeholders, though the contribution is incremental because it builds on existing models such as ClimateBERT.
The paper tackles the problem of detecting greenwashing in corporate sustainability reports by introducing a language-model-based methodology; its best model achieves an average accuracy of 86.34% and an F1 score of 0.67 on a test set of sustainability reports.
In recent years, the repercussions of climate change have increasingly captured public interest. Consequently, corporations are emphasizing their environmental efforts in sustainability reports to bolster their public image. Yet the absence of stringent regulations governing the review of such reports leaves room for greenwashing. In this study, we introduce a novel preliminary methodology for training a language model on generated labels for greenwashing risk. Our primary contributions are a preliminary mathematical formulation to quantify greenwashing risk, a ClimateBERT model fine-tuned for this problem, and a comparative analysis of results. On a test set of sustainability reports, our best model achieved an average accuracy of 86.34% and an F1 score of 0.67, suggesting that our proof-of-concept methodology is a promising direction of exploration for this task.
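The reported accuracy (86.34%) and F1 score (0.67) measure different things, and the sketch below illustrates how the two can diverge. It assumes a binary labeling (1 = flagged as greenwashing risk, 0 = not flagged), which the abstract does not state, and it uses hypothetical toy labels rather than the paper's data. The function name `accuracy_and_f1` is introduced here for illustration only.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1 of the `positive` class) for two equal-length label lists."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    denom = 2 * tp + fp + fn
    # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when there are no positives at all.
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Hypothetical toy labels (NOT the paper's data): 3 positive and 7 negative examples.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

acc, f1 = accuracy_and_f1(y_true, y_pred)
print(f"accuracy={acc:.2f}, F1={f1:.2f}")
```

In this toy case the classifier gets most examples right because negatives dominate, yet it misses two of the three positives, so F1 is noticeably lower than accuracy. Whether a similar effect explains the gap in the reported results cannot be determined from the abstract alone.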