HC Oct 17, 2024
An AI Guide to Enhance Accessibility of Social Virtual Reality for Blind People
Jazmin Collins, Kaylah Myranda Nicholson, Yusuf Khadir et al.
The rapid growth of virtual reality (VR) has led to increased use of social VR platforms for interaction. However, these platforms lack adequate features to support blind and low vision (BLV) users, posing significant challenges in navigation, visual interpretation, and social interaction. One promising approach to these challenges is employing human guides in VR. However, this approach is limited by the scarcity of humans available to serve as guides and by users' inability to customize the guidance they receive from a human guide. We introduce an AI-powered guide to address these limitations. The AI guide features six personas, each offering unique behaviors and appearances to meet diverse user needs, along with visual interpretation and navigation assistance. We aim to use this AI guide in the future to help us understand BLV users' preferences for guide forms and functionalities.
HC Mar 10
Understanding the Use of a Large Language Model-Powered Guide to Make Virtual Reality Accessible for Blind and Low Vision People
Jazmin Collins, Sharon Y Lin, Tianqi Liu et al.
As social virtual reality (VR) grows more popular, addressing accessibility for blind and low vision (BLV) users is increasingly critical. Researchers have proposed an AI "sighted guide" to help users navigate VR and answer their questions, but it has not been studied with users. To address this gap, we developed a large language model (LLM)-powered guide and studied its use with 16 BLV participants in virtual environments with confederates posing as other users. We found that when alone, participants treated the guide as a tool, but treated it companionably around others, giving it nicknames, rationalizing its mistakes with its appearance, and encouraging confederate-guide interaction. Our work furthers understanding of guides as a versatile method for VR accessibility and presents design recommendations for future guides.
HC Feb 13
How Multimodal Large Language Models Support Access to Visual Information: A Diary Study With Blind and Low Vision People
Ricardo E. Gonzalez Penuela, Crescentia Jung, Sharon Y Lin et al.
Multimodal large language models (MLLMs) are changing how Blind and Low Vision (BLV) people access visual information. Unlike traditional visual interpretation tools that only provide descriptions, MLLM-enabled applications offer conversational assistance, where users can ask questions to obtain goal-relevant details. However, evidence about their performance in the real world and implications for BLV people's daily lives remains limited. To address this, we conducted a two-week diary study, where we captured 20 BLV participants' use of an MLLM-enabled visual interpretation application. Although participants rated the visual interpretations of the application as "trustworthy" (mean=3.76 out of 5, max=extremely trustworthy) and "somewhat satisfying" (mean=4.13 out of 5, max=very satisfying), the AI often produced incorrect answers (22.2%) or abstained (10.8%) from responding to users' requests. Our findings show that while MLLMs can improve visual interpretations' descriptive accuracy, supporting everyday use also depends on the "visual assistant" skill: behaviors for providing goal-directed, reliable assistance. We conclude by proposing the "visual assistant" skill and guidelines to help MLLM-enabled visual interpretation applications better support BLV people's access to visual information.
HC Mar 22, 2024
Investigating Use Cases of AI-Powered Scene Description Applications for Blind and Low Vision People
Ricardo Gonzalez, Jazmin Collins, Shiri Azenkot et al.
"Scene description" applications that describe visual content in a photo are useful daily tools for blind and low vision (BLV) people. Researchers have studied their use, but they have only explored those that leverage remote sighted assistants; little is known about applications that use AI to generate their descriptions. Thus, to investigate their use cases, we conducted a two-week diary study where 16 BLV participants used an AI-powered scene description application we designed. Through their diary entries and follow-up interviews, users shared their information goals and assessments of the visual descriptions they received. We analyzed the entries and found frequent use cases, such as identifying visual features of known objects, and surprising ones, such as avoiding contact with dangerous objects. We also found users scored the descriptions relatively low on average, 2.76 out of 5 (SD=1.49) for satisfaction and 2.43 out of 4 (SD=1.16) for trust, showing that descriptions still need significant improvements to deliver satisfying and trustworthy experiences. We discuss future opportunities for AI as it becomes a more powerful accessibility tool for BLV users.
HC Mar 7, 2025
Towards Understanding the Use of MLLM-Enabled Applications for Visual Interpretation by Blind and Low Vision People
Ricardo E. Gonzalez Penuela, Ruiying Hu, Sharon Lin et al.
Blind and Low Vision (BLV) people have adopted AI-powered visual interpretation applications to address their daily needs. While these applications have been helpful, prior work has found that users remain unsatisfied by their frequent errors. Recently, multimodal large language models (MLLMs) have been integrated into visual interpretation applications, and they show promise for more descriptive visual interpretations. However, it is still unknown how this advancement has changed people's use of these applications. To address this gap, we conducted a two-week diary study in which 20 BLV people used an MLLM-enabled visual interpretation application we developed, and we collected 553 entries. In this paper, we report a preliminary analysis of 60 diary entries from 6 participants. We found that participants considered the application's visual interpretations trustworthy (mean 3.75 out of 5) and satisfying (mean 4.15 out of 5). Moreover, participants trusted our application in high-stakes scenarios, such as receiving medical dosage advice. We discuss our plan to complete our analysis to inform the design of future MLLM-enabled visual interpretation systems.
HC Feb 24, 2022
Tactile Materials in Practice: Understanding the Experiences of Teachers of the Visually Impaired
Mahika Phutane, Julie Wright, Brenda Veronica Castro et al.
Teachers of the visually impaired (TVIs) regularly present tactile materials (tactile graphics, 3D models, and real objects) to students with vision impairments. Researchers have been increasingly interested in designing tools to support the use of tactile materials, but we still lack an in-depth understanding of how tactile materials are created and used in practice today. To address this gap, we conducted interviews with 21 TVIs and a 3-week diary study with eight of them. We found that tactile materials were regularly used for academic as well as non-academic concepts like tactile literacy, motor ability, and spatial awareness. Real objects and 3D models served as "stepping stones" to tactile graphics, and our participants preferred to teach with 3D models, despite finding them difficult to create, obtain, and modify. Use of certain materials also carried social implications; participants selected materials that fostered student independence and allowed classroom inclusion. We contribute design considerations, encouraging future work on tactile materials to enable student and TVI co-creation, facilitate rapid prototyping, and promote movement and spatial awareness. To support future research in this area, our paper provides a fundamental understanding of current practices. We bridge these practices to established pedagogical approaches and highlight opportunities for growth regarding this important genre of educational materials.
HC Nov 1, 2021
Understanding the Use of Voice Assistants by Older Adults
Margot Hanley, Shiri Azenkot
Older adults are using voice-based technologies in a variety of contexts and are uniquely positioned to benefit from smart speakers' hands-free, voice-based interface. To better understand how older adults engage with and learn to use smart speakers, we conducted qualitative, semi-structured interviews with four older adults who own smart speakers. Emerging findings indicate that older adults benefit from smart speakers as both an assistive and a social technology. Findings also suggest that older adults adopt new technologies successfully when they learn them in a formal, communal environment.
CY May 26, 2021
Computer Vision and Conflicting Values: Describing People with Automated Alt Text
Margot Hanley, Solon Barocas, Karen Levy et al.
Scholars have recently drawn attention to a range of controversial issues posed by the use of computer vision for automatically generating descriptions of people in images. Despite these concerns, automated image description has become an important tool to ensure equitable access to information for blind and low vision people. In this paper, we investigate the ethical dilemmas faced by companies that have adopted the use of computer vision for producing alt text: textual descriptions of images for blind and low vision people. We use Facebook's automatic alt text tool as our primary case study. First, we analyze the policies that Facebook has adopted with respect to identity categories, such as race, gender, and age, and the company's decisions about whether to present these terms in alt text. We then describe an alternative -- and manual -- approach practiced in the museum community, focusing on how museums determine what to include in alt text descriptions of cultural artifacts. We compare these policies, using notable points of contrast to develop an analytic framework that characterizes the particular apprehensions behind these policy choices. We conclude by considering two strategies that seem to sidestep some of these concerns, finding that there are no easy ways to avoid the normative dilemmas posed by the use of computer vision to automate alt text.
CY Aug 16, 2019
Fairness Issues in AI Systems that Augment Sensory Abilities
Leah Findlater, Steven Goodman, Yuhang Zhao et al.
Systems that augment sensory abilities are increasingly employing AI and machine learning (ML) approaches, with applications ranging from object recognition and scene description tools for blind users to sound awareness tools for d/Deaf users. However, unlike many other AI-enabled technologies, these systems provide information that is already available to non-disabled people. In this paper, we discuss unique AI fairness challenges that arise in this context, including accessibility issues with data and models, ethical implications in deciding what sensory information to convey to the user, and privacy concerns both for the primary user and for others.
HC May 3, 2018
The Effect of Computer-Generated Descriptions on Photo-Sharing Experiences of People with Visual Impairments
Yuhang Zhao, Shaomei Wu, Lindsay Reynolds et al.
Like sighted people, visually impaired people want to share photographs on social networking services, but find it difficult to identify and select photos from their albums. We aimed to address this problem by incorporating state-of-the-art computer-generated descriptions into Facebook's photo-sharing feature. We interviewed 12 visually impaired participants to understand their photo-sharing experiences and designed a photo description feature for the Facebook mobile application. We evaluated this feature with six participants in a seven-day diary study. We found that participants used the descriptions to recall and organize their photos, but they hesitated to upload photos without a sighted person's input. In addition to basic information about photo content, participants wanted to know more details about salient objects and people, and whether the photos reflected their personal aesthetics. We discuss these findings through the lens of self-disclosure and self-presentation theories and propose new computer vision research directions that will better support visual content sharing by visually impaired people.