CL Jan 18, 2021

Automatic punctuation restoration with BERT models

arXiv:2101.07343v12.828 citations Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses punctuation restoration for English and Hungarian with BERT models. The analysis rates it as an incremental improvement, with results reported as macro-averaged $F_1$-scores on one dataset per language.

The paper tackles automatic punctuation restoration for English and Hungarian using BERT models, achieving macro-averaged $F_1$-scores of 79.8 on the Ted Talks benchmark (English) and 82.2 on the Szeged Treebank dataset (Hungarian).

We present an approach for automatic punctuation restoration with BERT models for English and Hungarian. For English, we conduct our experiments on Ted Talks, a commonly used benchmark for punctuation restoration, while for Hungarian we evaluate our models on the Szeged Treebank dataset. Our best models achieve a macro-averaged $F_1$-score of 79.8 in English and 82.2 in Hungarian. Our code is publicly available.
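The abstract reports macro-averaged $F_1$, which is the unweighted mean of per-class $F_1$-scores, so rare punctuation marks count as much as frequent ones. The sketch below illustrates that metric only; it is not the authors' evaluation code. The tag names (`COMMA`, `PERIOD`, `QUESTION`, `O`), the toy data, and the choice to average over punctuation classes while excluding the "no punctuation" tag are all assumptions made for illustration.

```python
from typing import List, Sequence


def macro_f1(gold: Sequence[str], pred: Sequence[str], labels: Sequence[str]) -> float:
    """Return the unweighted mean of per-label F1 over `labels`."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    if not labels:
        raise ValueError("labels must not be empty")

    scores: List[float] = []
    for label in labels:
        pairs = list(zip(gold, pred))
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)

        # Define precision, recall, and F1 as 0.0 when their denominator is 0.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        scores.append(f1)

    return sum(scores) / len(scores)


# Hypothetical toy data: one tag per word, "O" means no punctuation follows.
gold = ["O", "COMMA", "O", "PERIOD", "O", "QUESTION"]
pred = ["O", "COMMA", "COMMA", "PERIOD", "O", "PERIOD"]

# Assumption: average over punctuation classes only, excluding "O".
score = macro_f1(gold, pred, labels=["COMMA", "PERIOD", "QUESTION"])
print(f"macro F1 = {score:.4f}")
```

In this toy example the missed question mark pulls the average down sharply even though it occurs only once, which is the behaviour that distinguishes macro-averaging from micro-averaging.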

