CV · Mar 19, 2024

Total Disentanglement of Font Images into Style and Character Class Features

arXiv:2403.12784v2 · 1 citation · Has Code · ICDAR
AI Analysis

This paper addresses the challenge of separating features in font images for applications such as recognition and generation. It presents a novel method rather than an incremental improvement.

The paper tackles the problem of disentangling font images into separate style and character class features. It achieves this with very high accuracy and offers experimental evidence bearing on the long-standing open question of whether 'A'-ness exists.

In this paper, we demonstrate a total disentanglement of font images. Total disentanglement is a neural network-based method for decomposing each font image nonlinearly and completely into its style and content (i.e., character class) features. It uses a simple but careful training procedure to extract the common style feature from all 'A'-'Z' images in the same font and the common content feature from all 'A' (or another class) images in different fonts. These disentangled features guarantee the reconstruction of the original font image. Various experiments have been conducted to understand the performance of total disentanglement. First, it is demonstrated that total disentanglement is achievable with very high accuracy; this is experimental proof of the long-standing open question, "Does 'A'-ness exist?" (Hofstadter, 1985). Second, it is demonstrated that the disentangled features produced by total disentanglement apply to a variety of tasks, including font recognition, character recognition, and one-shot font image generation. Code is available here: https://github.com/uchidalab/total_disentanglement
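
The "common style" and "common content" constraints described in the abstract can be illustrated with a small sketch. This is not the authors' implementation (see the linked repository for that); it only assumes that each font image has already been encoded into a style vector and a content vector, and it measures how far those vectors are from being shared within a font or within a character class. The function names `spread` and `consistency_losses` and the toy data are hypothetical.

```python
import random


def spread(vectors):
    """Mean squared deviation of equal-length vectors from their mean vector."""
    dim = len(vectors[0])
    mean = [sum(v[k] for v in vectors) / len(vectors) for k in range(dim)]
    total = sum((v[k] - mean[k]) ** 2 for v in vectors for k in range(dim))
    return total / (len(vectors) * dim)


def consistency_losses(style, content):
    """style[f][c] and content[f][c] are feature vectors for font f, class c.

    Returns (style_loss, content_loss). Both are zero when every character of
    a font shares one style vector and every font shares one content vector
    per class, which is the ideal the abstract describes.
    """
    n_fonts, n_classes = len(style), len(style[0])
    # Style should not vary across the classes of a single font.
    style_loss = sum(spread(style[f]) for f in range(n_fonts)) / n_fonts
    # Content should not vary across fonts for a single class.
    content_loss = sum(
        spread([content[f][c] for f in range(n_fonts)]) for c in range(n_classes)
    ) / n_classes
    return style_loss, content_loss


# Toy data (assumed sizes): 4 fonts, 26 classes, 8-dimensional features.
random.seed(0)
n_fonts, n_classes, dim = 4, 26, 8
font_codes = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_fonts)]
class_codes = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_classes)]

# Perfectly disentangled features: style depends only on font, content only on class.
ideal_style = [[font_codes[f] for _ in range(n_classes)] for f in range(n_fonts)]
ideal_content = [[class_codes[c] for c in range(n_classes)] for _ in range(n_fonts)]

# Style features that leak per-character noise.
noisy_style = [
    [[x + random.gauss(0, 0.1) for x in font_codes[f]] for _ in range(n_classes)]
    for f in range(n_fonts)
]

ideal_losses = consistency_losses(ideal_style, ideal_content)
noisy_losses = consistency_losses(noisy_style, ideal_content)
```

In a training setup, terms of this kind would typically be combined with a reconstruction loss so that the two features together still recover the original image; how the paper actually weights or formulates them is not stated in the abstract.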

