CV · LG · Mar 15, 2025

Exploration of VLMs for Driver Monitoring Systems Applications

arXiv:2503.12281v1 · 3 citations · h-index: 13
Originality: Synthesis-oriented
AI Analysis

This work addresses the need for advanced AI in automotive safety systems, but it appears incremental as it applies existing VLMs to a new domain without claiming major breakthroughs.

This paper tackles the problem of applying Vision-Language Models (VLMs) to Driver Monitoring Systems (DMS), where there is a notable gap in scientific literature, by implementing VLMs on the Driver Monitoring Dataset to evaluate performance and discuss real-world advantages and challenges.

In recent years, we have witnessed significant progress in emerging deep learning models, particularly Large Language Models (LLMs) and Vision-Language Models (VLMs). These models have demonstrated promising results, indicating a new era of Artificial Intelligence (AI) that surpasses previous methodologies. Their extensive knowledge and zero-shot capabilities suggest a paradigm shift in developing deep learning solutions, moving from data capturing and algorithm training to just writing appropriate prompts. While the application of these technologies has been explored across various industries, including automotive, there is a notable gap in the scientific literature regarding their use in Driver Monitoring Systems (DMS). This paper presents our initial approach to implementing VLMs in this domain, utilising the Driver Monitoring Dataset to evaluate their performance and discussing their advantages and challenges when implemented in real-world scenarios.

