A Survey on Safe Multi-Modal Learning System
This survey addresses safety concerns for multimodal learning systems in critical domains like healthcare, but its contribution is incremental: it offers a survey and taxonomy rather than a novel method.
The authors tackled the lack of systematic research on the safety of multimodal learning systems (MMLS) by creating the first taxonomy to categorize and assess MMLS safety across four pillars: robustness, alignment, monitoring, and controllability. They reviewed existing methodologies and benchmarks to identify limitations and proposed future research directions.
In the rapidly evolving landscape of artificial intelligence, multimodal learning systems (MMLS) have gained traction for their ability to process and integrate information from diverse modality inputs. Their expanding use in vital sectors such as healthcare has made safety assurance a critical concern. However, the absence of systematic research into their safety is a significant barrier to progress in this field. To bridge the gap, we present the first taxonomy that systematically categorizes and assesses MMLS safety. This taxonomy is structured around four fundamental pillars that are critical to ensuring the safety of MMLS: robustness, alignment, monitoring, and controllability. Leveraging this taxonomy, we review existing methodologies, benchmarks, and the current state of research, while also pinpointing the principal limitations and gaps in knowledge. Finally, we discuss unique challenges in MMLS safety. In illuminating these challenges, we aim to pave the way for future research, proposing potential directions that could lead to significant advancements in the safety protocols of MMLS.