LineFormer: Rethinking Line Chart Data Extraction as Instance Segmentation
The work addresses the need for robust automated understanding of data visualizations in documents, though the contribution appears incremental because it builds on existing instance segmentation methods.
The paper tackles data extraction from line-chart images, a task made difficult by the visual and structural variation among charts. It proposes LineFormer, an instance segmentation approach that achieves state-of-the-art performance on several benchmark synthetic and real chart datasets.
Data extraction from line-chart images is an essential component of automated document understanding, as line charts are a ubiquitous data visualization format. However, the range of visual and structural variation in multi-line graphs makes them particularly challenging to parse automatically. Existing methods are not robust to all of these variations: they either take a unified all-chart approach or rely on auxiliary information such as legends for line data extraction. In this work, we propose LineFormer, a robust approach to line data extraction using instance segmentation. We achieve state-of-the-art performance on several benchmark synthetic and real chart datasets. Our implementation is available at https://github.com/TheJaeLal/LineFormer.
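To make the instance segmentation framing concrete, the following is a minimal sketch of how a per-line binary mask might be turned into a sequence of pixel-space points. It is an illustration only, not the authors' pipeline: the function names `mask_to_points` and `extract_lines`, the nested-list mask layout, and the column-wise averaging rule are all assumptions introduced here, and the mapping from pixel coordinates to data values (which would require axis information) is not shown.

```python
from typing import List, Tuple

# Hypothetical mask layout: mask[row][col] is 1 where the pixel belongs
# to this line instance and 0 elsewhere.
Mask = List[List[int]]
Point = Tuple[int, float]


def mask_to_points(mask: Mask) -> List[Point]:
    """Collapse one line-instance mask to one (x, y) pixel point per column.

    For each column that contains foreground pixels, y is the mean row
    index of those pixels. Columns with no foreground pixels are skipped.
    """
    if not mask:
        return []
    points: List[Point] = []
    for col in range(len(mask[0])):
        rows = [r for r, row in enumerate(mask) if row[col]]
        if rows:
            points.append((col, sum(rows) / len(rows)))
    return points


def extract_lines(masks: List[Mask]) -> List[List[Point]]:
    """Apply mask_to_points to each predicted instance mask independently."""
    return [mask_to_points(m) for m in masks]


if __name__ == "__main__":
    # Toy 3x4 mask of a single line rising from left to right.
    toy_mask = [
        [0, 0, 0, 1],
        [0, 1, 1, 0],
        [1, 1, 0, 0],
    ]
    for x, y in mask_to_points(toy_mask):
        print(x, y)
```

Because each line is a separate instance, each mask can be processed on its own, without consulting the legend or other chart elements.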