CV · May 16, 2022

CONSENT: Context Sensitive Transformer for Bold Words Classification

arXiv:2205.07683v1 · 4 citations · h-index: 8
Originality: Incremental advance
AI Analysis

This paper addresses a specific problem in document analysis and image processing, relevant to tasks such as text formatting recognition. The advance is incremental: it adapts transformer methods to a new domain.

The paper tackles context-dependent object classification, specifically bold-word detection in images with varying fonts, languages, and capture conditions, and reports state-of-the-art results. It also demonstrates extensibility by applying the same framework to determining the winner of a rock-paper-scissors game, with competitive performance.
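To make the rock-paper-scissors target concrete, here is a minimal sketch of the labeling rule such a model would have to learn from pose pairs. The pose names, the `winner` function, and the 0/1/2 label encoding are assumptions for illustration; the summary does not say how the paper encodes labels or whether draws are included.

```python
# Hypothetical lookup: which pose each pose defeats.
BEATS = {"rock": "scissors", "scissors": "paper", "paper": "rock"}

def winner(first, second):
    """Return 1 if the first pose wins, 2 if the second wins, 0 for a draw.

    The 0/1/2 encoding is an assumption, not the paper's label scheme.
    """
    if first == second:
        return 0
    return 1 if BEATS[first] == second else 2

# The same pose leads to different labels depending on its partner,
# so neither picture can be classified in isolation.
print(winner("rock", "scissors"))
print(winner("rock", "paper"))
```

The rule itself is trivial once the poses are known. The learning problem is harder because the model sees only images and must combine information from both pictures to produce a single label.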

We present CONSENT, a simple yet effective CONtext SENsitive Transformer framework for context-dependent object classification within a fully trainable end-to-end deep learning pipeline. We exemplify the proposed framework on the task of bold-word detection, achieving state-of-the-art results. Given an image containing text of unknown font types (e.g. Arial, Calibri, Helvetica) and unknown language, taken under various degrees of illumination, angle distortion, and scale variation, we extract all the words and learn a context-dependent binary classification (i.e. bold versus non-bold) using an end-to-end transformer-based neural network ensemble. To demonstrate the extensibility of our framework, we show results competitive with the state of the art for the game of rock-paper-scissors by training the model to determine the winner given a sequence of two pictures depicting hand poses.
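The idea of context sensitivity can be illustrated with a toy self-attention step, the core operation of a transformer. This is a sketch under stated assumptions, not the authors' architecture. The two-dimensional word features (stroke thickness, word height) are hypothetical, and the query, key, and value projections are left as the identity rather than learned.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(features):
    """Single-head scaled dot-product self-attention.

    Assumption: identity projections (query = key = value = input),
    so no weights are learned in this sketch.
    """
    dim = len(features[0])
    scale = math.sqrt(dim)
    outputs = []
    for query in features:
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in features]
        weights = softmax(scores)
        mixed = [sum(w * value[i] for w, value in zip(weights, features))
                 for i in range(dim)]
        outputs.append(mixed)
    return outputs

# Hypothetical features per word: (stroke thickness, word height).
word = [0.6, 1.0]
light_page = [word, [0.3, 1.0], [0.3, 1.0]]  # neighbours are thinner
heavy_page = [word, [0.9, 1.0], [0.9, 1.0]]  # neighbours are thicker

out_light = self_attention(light_page)[0]
out_heavy = self_attention(heavy_page)[0]
print(out_light, out_heavy)
```

The same input word produces a different representation on each page, because its output mixes in the features of the surrounding words. A classification head (not shown) could compare a word's own features with this context-mixed vector to decide whether the word is bold relative to its neighbours, which a per-word classifier with a fixed threshold cannot do.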

