CL · Jun 19, 2024

Fine-Tuning BERTs for Definition Extraction from Mathematical Text

arXiv:2406.13827v2 · 1 citation
Originality: Synthesis-oriented
AI Analysis

This work addresses definition extraction from mathematical text. Its contribution is incremental: it applies existing fine-tuning methods to new datasets.

The paper tackles the problem of extracting definitions from mathematical text by fine-tuning BERT models for binary sentence classification. Measured by accuracy, recall, and precision, the models achieve results comparable to those of earlier models with less computational effort.

In this paper, we fine-tuned three pre-trained BERT models on the task of "definition extraction" from mathematical English written in LaTeX. This is presented as a binary classification problem, where either a sentence contains a definition of a mathematical term or it does not. We used two original datasets, "Chicago" and "TAC," to fine-tune and test these models. We also tested on WFMALL, a dataset presented by Vanetik and Litvak in 2021, and compared the performance of our models to theirs. We found that a high-performance Sentence-BERT transformer model performed best based on overall accuracy, recall, and precision metrics, achieving comparable results to the earlier models with less computational effort.
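
To make the evaluation criteria concrete, here is a minimal sketch of how accuracy, precision, and recall can be computed for a binary "definition / not a definition" sentence classifier. It is an illustration only, not the authors' evaluation code: the function name `binary_metrics`, the label convention (1 = sentence contains a definition), and the toy labels are all assumptions made for this example, and none of the numbers come from the paper.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, and recall for binary labels.

    Assumed convention (hypothetical, for illustration):
    1 = sentence contains a definition, 0 = it does not.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one example is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # definitions found
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # definitions missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # correct rejections

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted/present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Toy labels invented for this example (not data from the paper).
gold = [1, 0, 1, 1, 0, 0, 1, 0]
pred = [1, 0, 0, 1, 0, 1, 1, 0]
scores = binary_metrics(gold, pred)
print(scores)
```

Precision answers "of the sentences flagged as definitions, how many really were?", while recall answers "of the true definitions, how many were found?". Reporting both alongside accuracy matters when definition sentences are much rarer than non-definition sentences.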

