LG · CL · May 13

LIFT: Last-Mile Fine-Tuning for Table Explicitation

arXiv:2605.13424 · 13.6

Predicted impact: top 34% in LG · last 90 days

Originality: Incremental advance

AI Analysis

For practitioners needing accurate table extraction with limited training data, LIFT offers a more data-efficient and robust alternative to end-to-end fine-tuning.

LIFT is a two-stage pipeline: a large language model extracts an initial table from unstructured text, and a fine-tuned small language model repairs errors in that table. On a benchmark of 2,596 tables, it matches or exceeds end-to-end fine-tuning on TEDS while requiring as few as 1,000 training examples. In that low-data setting, it outperforms end-to-end fine-tuning by up to 0.144 TEDS points.
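For context on the reported numbers, TEDS is commonly defined in the table-recognition literature as a normalized tree edit distance between a predicted table and a reference table, each represented as a tree (for example, its HTML structure). The formula below is that common definition; it is an assumption that the paper uses exactly this form, and the symbols $T_a$ and $T_b$ are introduced here only for illustration.

```latex
\mathrm{TEDS}(T_a, T_b) \;=\; 1 - \frac{\mathrm{EditDist}(T_a, T_b)}{\max\bigl(|T_a|,\, |T_b|\bigr)}
```

Here $|T|$ is the number of nodes in tree $T$. Under this definition the score lies in $[0, 1]$, with 1 meaning the two tables are identical, so a gain of 0.144 corresponds to 14.4 points on a 0 to 100 scale.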

We propose last-mile fine-tuning, or LIFT, a pipeline in which a pre-trained large language model extracts an initial table from unstructured clipboard text, and a fine-tuned small language model (SLM, 1B-24B parameters) repairs errors in the extracted table. On a benchmark of 2,596 tables from three datasets, LIFT matches or exceeds end-to-end SLM fine-tuning on the tree-edit-distance-based similarity (TEDS) metric while requiring as few as 1,000 training examples, a regime in which it outperforms end-to-end fine-tuning by up to 0.144 TEDS points. We also show that LIFT is more robust to input format variability. Comparisons with self-debug and end-to-end fine-tuning approaches show that last-mile fine-tuning is an attractive option when training data is limited or when robustness to input variation is sought without compromising accuracy.
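The two-stage structure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names (`last_mile_pipeline`, `toy_extract`, `toy_repair`), the row-of-strings table representation, and the choice to pass the source text to the repair stage are all assumptions, and the toy stand-ins below are plain Python functions rather than real language models.

```python
from typing import Callable, List

Table = List[List[str]]  # assumed representation: rows of cell strings


def last_mile_pipeline(
    text: str,
    extract: Callable[[str], Table],
    repair: Callable[[str, Table], Table],
) -> Table:
    """Run extraction, then repair the draft table (hypothetical interface)."""
    draft = extract(text)        # stage 1: a pre-trained LLM would go here
    return repair(text, draft)   # stage 2: a fine-tuned SLM would go here


def toy_extract(text: str) -> Table:
    # Stand-in for the LLM: split each non-empty line on whitespace.
    # Rows may come out ragged, which mimics an imperfect first pass.
    return [line.split() for line in text.splitlines() if line.strip()]


def toy_repair(text: str, draft: Table) -> Table:
    # Stand-in for the SLM: pad short rows so every row has the same width.
    # The source text is accepted but unused in this toy version.
    width = max((len(row) for row in draft), default=0)
    return [row + [""] * (width - len(row)) for row in draft]


table = last_mile_pipeline("name age\nAda 36\nLinus", toy_extract, toy_repair)
print(table)  # [['name', 'age'], ['Ada', '36'], ['Linus', '']]
```

Keeping the two stages behind plain callables means only the repair stage would need training data, which is the property the abstract highlights for the low-data setting.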
