CV, IR | Oct 24, 2014

Detecting Figures and Part Labels in Patents: Competition-Based Development of Image Processing Algorithms

arXiv:1410.6751v3 | 13 citations
Originality: Synthesis-oriented
AI Analysis

This paper addresses the need of researchers and practitioners for automated image processing of patent documents, but its contribution is incremental because it builds on existing methods in a competition setting.

The paper tackled the problem of detecting figures and part labels in U.S. patent drawing pages through a competition, resulting in top systems achieving f-measures of 88.57% for figure detection and 70.98% for part label detection.

We report the findings of a month-long online competition in which participants developed algorithms for augmenting the digital version of patent documents published by the United States Patent and Trademark Office (USPTO). The goal was to detect figures and part labels in U.S. patent drawing pages. The challenge drew 232 teams of two, of which 70 teams (30%) submitted solutions. Collectively, teams submitted 1,797 solutions that were compiled on the competition servers. Participants reported spending an average of 63 hours developing their solutions, resulting in a total of 5,591 hours of development time. A manually labeled dataset of 306 patents was used for training, online system tests, and evaluation. The design and performance of the top-5 systems are presented, along with a system developed after the competition which illustrates that winning teams produced near state-of-the-art results under strict time and computation constraints. For the 1st place system, the harmonic mean of recall and precision (f-measure) was 88.57% for figure region detection, 78.81% for figure regions with correctly recognized figure titles, and 70.98% for part label detection and character recognition. Data and software from the competition are available through the online UCI Machine Learning repository to inspire follow-on work by the image processing community.
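The abstract defines the f-measure as the harmonic mean of recall and precision. A minimal sketch of that calculation follows. The function names and the counts in the usage lines are hypothetical, chosen only for illustration; they are not taken from the competition's scoring code or results.

```python
def precision_recall(true_pos: int, false_pos: int, false_neg: int) -> tuple[float, float]:
    """Compute precision and recall from detection counts.

    The counts are hypothetical inputs; how a detection is matched to
    ground truth is defined by the competition and is not modelled here.
    """
    detected = true_pos + false_pos   # everything the system reported
    expected = true_pos + false_neg   # everything in the ground truth
    precision = true_pos / detected if detected else 0.0
    recall = true_pos / expected if expected else 0.0
    return precision, recall


def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative counts only (not competition data).
p, r = precision_recall(true_pos=8, false_pos=2, false_neg=8)
score = f_measure(p, r)
print(f"precision={p:.2f} recall={r:.2f} f-measure={score:.4f}")
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a system cannot reach a high f-measure by maximizing only precision or only recall.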

