TweetDrought: A Deep-Learning Drought Impacts Recognizer based on Twitter Data
This work addresses the need to assess social, economic, and environmental impacts in drought monitoring, but the contribution is incremental: it applies existing NLP methods to a new data source.
The paper tackles the problem of understanding drought impacts by developing a deep-learning recognizer from Twitter data. It reports a macro-F1 score of 0.89 on a news-based test set and 0.58 on California tweets, where validation relied on keyword-based labels that the authors acknowledge are limited.
A better understanding of drought impacts is becoming increasingly vital under a warming climate. Traditional drought indices mainly describe biophysical variables rather than impacts on social, economic, and environmental systems. We used natural language processing and transfer learning based on Bidirectional Encoder Representations from Transformers (BERT): the model was fine-tuned on data from the news-based Drought Impact Report (DIR) and then applied to recognize seven types of drought impacts in filtered Twitter data from the United States. Our model achieved a satisfactory macro-F1 score of 0.89 on the DIR test set. The model was then applied to California tweets and validated against keyword-based labels, yielding a macro-F1 score of 0.58. Because keyword labels are limited, we also spot-checked tweets with disputed labels; in this check, 83.5% of BERT labels were correct compared to the keyword labels. Overall, the fine-tuned BERT-based recognizer provided proper predictions and valuable information on drought impacts. The interpretation and analysis of the model were consistent with experiential domain expertise.
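For readers unfamiliar with the metric, the macro-F1 scores reported above can be illustrated with a minimal sketch. It assumes a multi-label setup in which each tweet carries one binary indicator per impact type; the abstract does not spell out the label format, so this is an assumption. The function names `f1_for_class` and `macro_f1` and the three-class toy data are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def f1_for_class(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 for one impact type, given binary indicators (1 = impact present)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    # Convention assumed here: F1 is 0 when the class is neither present nor predicted.
    return 2 * tp / denom if denom else 0.0


def macro_f1(true_rows: Sequence[Sequence[int]],
             pred_rows: Sequence[Sequence[int]]) -> float:
    """Unweighted mean of per-class F1, so rare impact types count as much as common ones."""
    n_classes = len(true_rows[0])
    scores = [
        f1_for_class([row[c] for row in true_rows], [row[c] for row in pred_rows])
        for c in range(n_classes)
    ]
    return sum(scores) / n_classes


# Toy data: four documents, three hypothetical impact types (the paper uses seven).
true_rows = [[1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
pred_rows = [[1, 0, 0], [0, 1, 0], [1, 0, 0], [0, 0, 0]]
print(round(macro_f1(true_rows, pred_rows), 4))
```

Because the average is unweighted, a single impact type that the model never predicts drags the score down sharply. This is one plausible reason a score can drop when the reference labels are noisy, as with the keyword-based labels.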