Toward Safe, Trustworthy and Realistic Augmented Reality User Experience
This work addresses safety and trustworthiness issues for AR users, but its contribution is incremental: it builds on existing methods and proposes future directions without presenting new results.
The research tackles the problem of unsafe and untrustworthy augmented reality (AR) content by developing two systems, ViDDAR and VIM-Sense, that detect task-detrimental attacks using vision-language models and multimodal reasoning.
As augmented reality (AR) becomes increasingly integrated into everyday life, ensuring the safety and trustworthiness of its virtual content is critical. Our research addresses the risks of task-detrimental AR content, particularly content that obstructs critical information or subtly manipulates user perception. We developed two systems, ViDDAR and VIM-Sense, to detect such attacks using vision-language models (VLMs) and multimodal reasoning modules. Building on this foundation, we propose three future directions: automated, perceptually aligned quality assessment of virtual content; detection of multimodal attacks; and adaptation of VLMs for efficient, user-centered deployment on AR devices. Overall, our work aims to establish a scalable, human-aligned framework for safeguarding AR experiences. We seek feedback on perceptual modeling, multimodal AR content implementation, and lightweight model adaptation.