A Context-Aware Feature Fusion Framework for Punctuation Restoration
This work addresses attention dilution, a specific bottleneck in transformer-based punctuation restoration for natural language processing applications, and represents an incremental improvement.
The paper tackles the problem of attention dilution in transformer-based models for punctuation restoration by proposing a novel Feature Fusion framework based on two-type Attentions (FFA), which achieves comparable performance to state-of-the-art models on the IWSLT benchmark without extra data.
To accomplish the punctuation restoration task, most existing approaches focus on leveraging extra information (e.g., part-of-speech tags) or addressing the class imbalance problem. Recent works have widely applied transformer-based language models and significantly improved effectiveness. To the best of our knowledge, an inherent issue has remained neglected: the attention of individual heads in the transformer becomes diluted or powerless when the model is fed long unpunctuated utterances. Since the preceding context, rather than the following context, is comparatively more valuable to the current position, it is hard to achieve a good balance with independent attention. In this paper, we propose a novel Feature Fusion framework based on two-type Attentions (FFA) to alleviate this shortcoming. It introduces a two-stream architecture. One module involves interaction between attention heads to encourage communication among them, and the other, a masked attention module, captures the dependent feature representation. The framework then aggregates the two feature embeddings to fuse information and enhance context-awareness. Experiments on the popular benchmark dataset IWSLT demonstrate that our approach is effective. Without additional data, it obtains performance comparable to the current state-of-the-art models.
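The two-stream idea can be illustrated with a minimal, dependency-free sketch. The abstract does not specify how heads interact, how the mask is built, or how the two embeddings are fused, so everything concrete here is an assumption made for illustration: head interaction is modeled as a linear mix of the per-head attention maps, the masked stream hides all following tokens (a causal mask), and fusion is an element-wise sum followed by concatenation over heads. The function names (`attention_weights`, `mix_heads`, `ffa_sketch`) are hypothetical and are not taken from the paper.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(q, k, causal=False):
    """Scaled dot-product attention weights for one head (T x T matrix).
    With causal=True, position i cannot attend to positions j > i."""
    d = len(q[0])
    weights = []
    for i, qi in enumerate(q):
        scores = []
        for j, kj in enumerate(k):
            if causal and j > i:
                scores.append(float("-inf"))  # assumed mask: hide following tokens
            else:
                scores.append(sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d))
        weights.append(softmax(scores))
    return weights

def apply_weights(weights, v):
    """Weighted sum of value vectors: (T x T) times (T x d)."""
    return [[sum(w * vj[c] for w, vj in zip(row, v)) for c in range(len(v[0]))]
            for row in weights]

def mix_heads(head_weights, mix):
    """Assumed head interaction: each output head is a linear
    combination (rows of `mix`) of all heads' attention maps."""
    n_heads, seq_len = len(head_weights), len(head_weights[0])
    return [[[sum(mix[g][h] * head_weights[h][i][j] for h in range(n_heads))
              for j in range(seq_len)]
             for i in range(seq_len)]
            for g in range(n_heads)]

def ffa_sketch(heads_qkv, mix):
    """heads_qkv: list of (q, k, v) triples, one per head.
    Returns one fused feature vector per position."""
    # Stream 1: unmasked attention whose heads exchange information.
    plain = [attention_weights(q, k) for q, k, _ in heads_qkv]
    mixed = mix_heads(plain, mix)
    interact = [apply_weights(w, v) for w, (_, _, v) in zip(mixed, heads_qkv)]
    # Stream 2: masked attention that only sees the preceding context.
    masked = [apply_weights(attention_weights(q, k, causal=True), v)
              for q, k, v in heads_qkv]
    # Assumed fusion: element-wise sum per head, then concatenate heads.
    fused = []
    for t in range(len(heads_qkv[0][0])):
        row = []
        for h in range(len(heads_qkv)):
            row.extend(a + b for a, b in zip(interact[h][t], masked[h][t]))
        fused.append(row)
    return fused

# Toy input: 2 heads, 3 tokens, head dimension 2.
q = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
k = [[0.5, 0.2], [0.1, 0.9], [0.3, 0.3]]
v = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
fused = ffa_sketch([(q, k, v), (k, q, v)], mix=[[0.7, 0.3], [0.4, 0.6]])
print(len(fused), len(fused[0]))
```

In this sketch the masked stream gives each position a representation built only from itself and earlier tokens, while the mixed stream lets a head that has spread its attention thinly borrow focus from other heads. A trained model would learn the mixing matrix and would likely use a learned fusion layer rather than a plain sum.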