CV | AI | MM | Mar 11, 2025

EgoBlind: Towards Egocentric Visual Assistance for the Blind

arXiv:2503.08221v4 | 19 citations | h-index: 17 | Has Code
Originality: Synthesis-oriented
AI Analysis

This work addresses the need for effective AI assistants that enhance independence for blind and visually impaired people. Its contribution is incremental: it focuses on dataset creation and benchmarking rather than novel model development.

The authors evaluate multimodal large language models (MLLMs) on egocentric visual assistance for blind individuals by introducing EgoBlind, a dataset of 1,392 first-person videos and 5,311 questions. Across the 16 models tested, the best performers reach an accuracy near 60%, well below human performance of 87.4%.

We present EgoBlind, the first egocentric VideoQA dataset collected from blind individuals to evaluate the assistive capabilities of contemporary multimodal large language models (MLLMs). EgoBlind comprises 1,392 first-person videos from the daily lives of blind and visually impaired individuals. It also features 5,311 questions directly posed or verified by the blind to reflect their in-situation needs for visual assistance. Each question has an average of 3 manually annotated reference answers to reduce subjectiveness. Using EgoBlind, we comprehensively evaluate 16 advanced MLLMs and find that all models struggle. The best performers achieve an accuracy near 60%, which is far behind human performance of 87.4%. To guide future advancements, we identify and summarize major limitations of existing MLLMs in egocentric visual assistance for the blind and explore heuristic solutions for improvement. With these efforts, we hope that EgoBlind will serve as a foundation for developing effective AI assistants to enhance the independence of the blind and visually impaired. Data and code are available at https://github.com/doc-doc/EgoBlind.
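To make the idea of scoring against several reference answers concrete, here is a minimal sketch in Python. It is not the paper's evaluation protocol: the `QAItem` structure, the `normalize` helper, and the rule "correct if the prediction matches any reference after normalization" are all assumptions for illustration, and the two sample questions are invented, not taken from EgoBlind.

```python
from dataclasses import dataclass


@dataclass
class QAItem:
    # Hypothetical record layout; the real dataset format may differ.
    question: str
    references: list[str]  # several human-written reference answers
    prediction: str        # the model's answer


def normalize(text: str) -> str:
    # Lowercase, drop punctuation, and collapse whitespace.
    kept = "".join(ch for ch in text.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def is_correct(item: QAItem) -> bool:
    # ASSUMPTION: a prediction counts as correct if it equals any
    # reference answer after normalization. This is a stand-in rule,
    # not necessarily the metric used by the EgoBlind authors.
    pred = normalize(item.prediction)
    return any(pred == normalize(ref) for ref in item.references)


def accuracy(items: list[QAItem]) -> float:
    # Percentage of items judged correct; 0.0 for an empty list.
    if not items:
        return 0.0
    return 100.0 * sum(is_correct(it) for it in items) / len(items)


# Invented examples, for illustration only.
items = [
    QAItem("Is the traffic light green?",
           ["Yes.", "Yes, it is green", "yes"], "Yes"),
    QAItem("What is in front of me?",
           ["A staircase", "Stairs going down", "stairs"], "A door"),
]
print(accuracy(items))  # 50.0
```

Having several references per question means a correct answer phrased differently from one annotator can still match another, which is one way multiple annotations can reduce the effect of subjective wording. Exact matching is deliberately strict here; open-ended answers generally need a softer comparison.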

Code Implementations: 1 repo
