CV, LG | Feb 19, 2022

Image-to-Graph Transformers for Chemical Structure Recognition

arXiv:2202.09580v1 | 16 citations
Originality: Incremental advance
AI Analysis

This work addresses a domain-specific problem for chemists and researchers by providing a more robust tool for extracting chemical structures from images, though it is incremental as it builds on existing methods with specific enhancements.

The paper tackles the problem of recognizing molecular structures from images in chemical literature, which is challenging due to abbreviations and style variations, and presents a deep learning model that transforms images directly into graphs, achieving relative improvements of 17.1% and 12.8% on benchmark datasets and collected images, respectively.

For several decades, chemical knowledge has been published in written text, and there have been many attempts to make it accessible, for example by transforming such natural-language text into a structured format. Although the discovered chemical itself, commonly represented as an image, is the most important part, correctly recognizing the molecular structure from images in the literature remains a hard problem, since structures are often abbreviated to reduce complexity and drawn in many different styles. In this paper, we present a deep learning model to extract molecular structures from images. The proposed model is designed to transform the molecular image directly into the corresponding graph, which makes it capable of handling non-atomic symbols used for abbreviations. Also, through an end-to-end learning approach, it can fully utilize the many open image-molecule pair datasets from various sources, and hence it is more robust to image style variation than other tools. The experimental results show that the proposed model outperforms existing models, with 17.1% and 12.8% relative improvement on well-known benchmark datasets and on large molecular images that we collected from the literature, respectively.
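To make the idea of a graph output with non-atomic symbols concrete, here is a minimal sketch of what such a target graph might look like. It is not the paper's data structure or model: the `MolGraph` class, the `ELEMENTS` set, and the example molecule are all hypothetical, chosen only to show how an abbreviation such as "Ph" or "Me" can sit in the graph as an ordinary node next to real atoms.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical, deliberately small set of element symbols for illustration.
ELEMENTS = {"H", "C", "N", "O", "S", "P", "F", "Cl", "Br", "I"}


@dataclass
class MolGraph:
    """Toy molecular graph: labelled nodes plus undirected bonds with an order."""
    nodes: List[str] = field(default_factory=list)
    edges: Dict[Tuple[int, int], int] = field(default_factory=dict)

    def add_node(self, label: str) -> int:
        # A label may be an element symbol ("C") or an abbreviation ("Ph").
        self.nodes.append(label)
        return len(self.nodes) - 1

    def add_bond(self, i: int, j: int, order: int = 1) -> None:
        if i == j:
            raise ValueError("a bond needs two distinct nodes")
        # Store each undirected bond once, keyed by the sorted index pair.
        self.edges[(min(i, j), max(i, j))] = order

    def neighbors(self, i: int) -> List[int]:
        return sorted(b if a == i else a for (a, b) in self.edges if i in (a, b))


def abbreviation_nodes(graph: MolGraph) -> List[int]:
    """Indices of nodes whose label is not a plain element symbol."""
    return [i for i, label in enumerate(graph.nodes) if label not in ELEMENTS]


def build_example() -> MolGraph:
    # A drawing of Ph-C(=O)-Me, keeping "Ph" and "Me" as written in the image.
    g = MolGraph()
    ph = g.add_node("Ph")
    c = g.add_node("C")
    o = g.add_node("O")
    me = g.add_node("Me")
    g.add_bond(ph, c, 1)
    g.add_bond(c, o, 2)
    g.add_bond(c, me, 1)
    return g
```

In this sketch the abbreviations stay unexpanded in the graph, so a downstream step could expand them into full substructures or leave them as drawn, depending on what the application needs.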

