CV · HC · Jun 21, 2022

Towards Optimizing OCR for Accessibility

arXiv:2206.10254v2 · 1 citation · h-index: 56
Originality: Synthesis-oriented
AI Analysis

This paper addresses accessibility for blind and print-disabled individuals by improving the experience of listening to OCR-derived content. It appears incremental, since it builds on existing OCR and text-to-speech methods.

The paper tackles the problem that OCR and text-to-speech software ignores visual cues such as structure and emphasis, which hinders accessibility for blind and print-disabled individuals. It finds that preserving even one or two visual cues in aural form significantly enhances the listening experience.

Visual cues such as structure, emphasis, and icons play an important role in efficient information foraging by sighted individuals and make for a pleasurable reading experience. Blind, low-vision and other print-disabled individuals miss out on these cues since current OCR and text-to-speech software ignore them, resulting in a tedious reading experience. We identify four semantic goals for an enjoyable listening experience, and identify syntactic visual cues that help make progress towards these goals. Empirically, we find that preserving even one or two visual cues in aural form significantly enhances the experience for listening to print content.
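To make "preserving a visual cue in aural form" concrete, here is a minimal sketch, assuming OCR output has already been split into spans labeled with a visual cue. It is not the paper's method: the `Span` class, the cue labels, and the `to_ssml` function are hypothetical names chosen for illustration. The sketch maps headings to pauses plus a slower speaking rate and maps emphasized text to vocal stress, using standard SSML elements (`speak`, `break`, `prosody`, `emphasis`).

```python
from dataclasses import dataclass
from xml.sax.saxutils import escape


@dataclass
class Span:
    """One run of OCR text plus its detected visual cue (hypothetical schema)."""
    text: str
    cue: str = "plain"  # assumed labels: "plain", "heading", "emphasis"


def to_ssml(spans):
    """Render spans as an SSML string, mapping each visual cue to an aural one."""
    parts = []
    for span in spans:
        text = escape(span.text)  # escape &, <, > so the SSML stays well-formed
        if span.cue == "heading":
            # Structure: pause before and after, and slow the heading down.
            parts.append(
                '<break time="500ms"/>'
                f'<prosody rate="slow">{text}</prosody>'
                '<break time="500ms"/>'
            )
        elif span.cue == "emphasis":
            # Emphasis (bold or italic in print) becomes vocal stress.
            parts.append(f'<emphasis level="strong">{text}</emphasis>')
        else:
            parts.append(text)
    return "<speak>" + " ".join(parts) + "</speak>"


spans = [
    Span("Safety notes", "heading"),
    Span("Unplug the device"),
    Span("before", "emphasis"),
    Span("cleaning & drying it."),
]
ssml = to_ssml(spans)
print(ssml)
```

The resulting string could be passed to any text-to-speech engine that accepts SSML. Which cues matter most, and how each should sound, is the empirical question the paper studies, so the pause lengths and rates above are placeholders.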
