Trust in Vision-Language Models: Insights from a Participatory User Workshop
This paper addresses the challenge of user trust in VLMs for developers and researchers, but its contribution is incremental, as it presents preliminary results from a pilot workshop.
The paper tackles the problem of understanding how user trust in Vision-Language Models builds and evolves by conducting a participatory workshop with prospective users. The resulting preliminary insights are meant to inform future studies on trust metrics and engagement strategies.
With the growing deployment of Vision-Language Models (VLMs), pre-trained on large image-text and video-text datasets, it is critical to equip users with the tools to discern when to trust these systems. However, examining how user trust in VLMs builds and evolves remains an open problem. This problem is exacerbated by the increasing reliance on AI models as judges for experimental validation, to bypass the cost and implications of running participatory design studies directly with users. Following a user-centred approach, this paper presents preliminary results from a workshop with prospective VLM users. Insights from this pilot workshop inform future studies aimed at contextualising trust metrics and strategies for participants' engagement to fit the case of user-VLM interaction.