An Evaluation of OCR on Egocentric Data
This work addresses the challenge of text recognition in egocentric vision, with applications such as assistive technology, but its contribution is incremental because it builds on existing pre-trained models.
The paper tackles OCR performance on egocentric data, specifically EPIC-KITCHENS images. It shows that existing methods struggle with rotated text and introduces a rotate-and-merge procedure that halves the normalized edit distance error.
In this paper, we evaluate state-of-the-art OCR methods on egocentric data. We annotate text in EPIC-KITCHENS images and demonstrate that existing OCR methods struggle with rotated text, which is frequently observed on objects being handled. We introduce a simple rotate-and-merge procedure that can be applied to pre-trained OCR models and halves the normalized edit distance error. This suggests that future OCR attempts should incorporate rotation into model design and training procedures.
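The abstract does not spell out how rotate-and-merge or the error metric are implemented, so the following is only a minimal sketch under stated assumptions: the image is rotated in 90-degree steps, a hypothetical `recognize` callable (standing in for any pre-trained OCR model) returns a `(text, confidence)` pair per image, the merge step keeps the highest-confidence reading, and the edit distance is normalized by the length of the longer string. All function names here are illustrative and not taken from the paper.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]  # toy grayscale image: rows of pixel values
Recognizer = Callable[[Image], Tuple[str, float]]  # hypothetical OCR interface


def rotate90(image: Image) -> Image:
    """Rotate a 2D grid 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def rotate_and_merge(image: Image, recognize: Recognizer,
                     num_rotations: int = 4) -> Tuple[str, float]:
    """Run OCR on each 90-degree rotation and keep the most confident result.

    Assumption: merging by maximum confidence; the paper may merge differently.
    """
    best: Tuple[str, float] = ("", float("-inf"))
    current = image
    for _ in range(num_rotations):
        text, confidence = recognize(current)
        if confidence > best[1]:
            best = (text, confidence)
        current = rotate90(current)
    return best


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via row-by-row dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution or match
        prev = curr
    return prev[-1]


def normalized_edit_distance(pred: str, truth: str) -> float:
    """Edit distance divided by the longer string's length (assumed normalization)."""
    longest = max(len(pred), len(truth))
    return edit_distance(pred, truth) / longest if longest else 0.0
```

In practice `recognize` would wrap a real OCR model, and the merged text would be scored against the annotated ground truth with `normalized_edit_distance`, where 0.0 means an exact match and 1.0 means nothing was recovered.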