Docling Technical Report
Docling gives developers and researchers an accessible tool for PDF conversion, but the contribution is incremental: it builds on existing work such as DocLayNet and TableFormer.
The authors address PDF document conversion by introducing Docling, an open-source package that uses specialized AI models for layout analysis and table structure recognition and runs efficiently on commodity hardware.
This technical report introduces Docling, an easy-to-use, self-contained, MIT-licensed open-source package for PDF document conversion. It is powered by state-of-the-art specialized AI models for layout analysis (DocLayNet) and table structure recognition (TableFormer), and runs efficiently on commodity hardware within a small resource budget. The code interface makes it easy to extend the package and to add new features and models.