Multimodal Learning with Augmentation Techniques for Natural Disaster Assessment
This work addresses data scarcity for researchers building disaster assessment systems from social media, though it appears incremental in that it applies existing augmentation methods to a specific domain.
This paper tackles class imbalance and limited samples in natural disaster assessment datasets by exploring augmentation techniques on the CrisisMMD multimodal dataset, and finds that selected augmentations improve classification performance for underrepresented classes.
Natural disaster assessment relies on accurate and rapid access to information, and social media has emerged as a valuable real-time source. However, existing datasets suffer from class imbalance and limited samples, which makes effective model development challenging. This paper explores augmentation techniques to address these issues on the CrisisMMD multimodal dataset. For visual data, we apply diffusion-based methods, namely Real Guidance and DiffuseMix. For text data, we explore back-translation, paraphrasing with transformers, and image caption-based augmentation. We evaluate these techniques across unimodal, multimodal, and multi-view learning setups. Results show that selected augmentations improve classification performance, particularly for underrepresented classes, while multi-view learning shows potential but requires further refinement. This study highlights effective augmentation strategies for building more robust disaster assessment systems.
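As a rough illustration of how augmentation can be targeted at underrepresented classes, the sketch below computes how many synthetic samples each class would need to match the largest class, then fills that gap with augmented copies. It is a minimal sketch under stated assumptions, not the paper's pipeline. The class names, the toy texts, the balance-to-majority target, and `placeholder_augment` are all hypothetical. In practice the placeholder would be replaced by a real augmenter such as back-translation or paraphrasing.

```python
from collections import Counter
from itertools import cycle
from typing import Callable, Dict, List, Sequence, Tuple


def augmentation_budget(labels: Sequence[str]) -> Dict[str, int]:
    """Number of synthetic samples each class needs to match the largest class."""
    counts = Counter(labels)
    target = max(counts.values())
    return {label: target - n for label, n in counts.items()}


def balance_with_augmentation(
    texts: Sequence[str],
    labels: Sequence[str],
    augment: Callable[[str], str],
) -> Tuple[List[str], List[str]]:
    """Append augmented copies of minority-class texts until classes are balanced."""
    out_texts, out_labels = list(texts), list(labels)
    for label, deficit in augmentation_budget(labels).items():
        if deficit == 0:
            continue
        # Cycle through the originals of this class so every sample is reused evenly.
        pool = cycle([t for t, y in zip(texts, labels) if y == label])
        for _ in range(deficit):
            out_texts.append(augment(next(pool)))
            out_labels.append(label)
    return out_texts, out_labels


def placeholder_augment(text: str) -> str:
    # Hypothetical stand-in for a real text augmenter (e.g. back-translation).
    return text + " [aug]"


# Toy data with hypothetical class names; not taken from CrisisMMD.
texts = ["road flooded", "minor cracks", "small leak", "bridge collapsed",
         "no damage here", "all clear"]
labels = ["mild", "mild", "mild", "severe", "none", "none"]

budget = augmentation_budget(labels)
balanced_texts, balanced_labels = balance_with_augmentation(
    texts, labels, placeholder_augment
)
print(budget)
print(Counter(balanced_labels))
```

Under these assumptions, augmentation would be applied to the training split only, so that validation and test sets keep the original class distribution and contain no synthetic samples.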