CL | Oct 10, 2021

DCT: Dynamic Compressive Transformer for Modeling Unbounded Sequence

arXiv:2110.04821v10.2

Originality: Incremental advance

AI Analysis

The paper addresses the challenge of handling unboundedly long sequences in natural language processing tasks, though the contribution appears incremental because it builds on existing transformer-based approaches.

The paper tackles the problem of modeling unbounded sequences by proposing the Dynamic Compressive Transformer (DCT), which conditionally selects and compresses sentence representations in memory instead of appending all of them. The reported results show that DCT outperforms the previous state-of-the-art model on the Enwik8 benchmark.

In this paper, we propose the Dynamic Compressive Transformer (DCT), a transformer-based framework for modeling unbounded sequences. Previous baselines append every sentence representation to memory; conditionally selecting and appending them is a more reasonable way to deal with unboundedly long sequences. Our model uses a policy that determines whether the sequence should be kept in memory with a compressed state or discarded during the training process. By retaining semantically meaningful sentence information in the memory system, DCT outperforms the previous state-of-the-art (SOTA) model in our experiments on the Enwik8 benchmark.
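
The keep-or-discard idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how the policy is trained or how states are compressed, so the `keep_policy` callable, the mean-pooling compression, and the `rate` and `capacity` parameters below are all assumptions chosen only to make the control flow concrete.

```python
from typing import Callable, List

Vector = List[float]


def mean_pool(segment: List[Vector], rate: int) -> List[Vector]:
    """Compress a segment by averaging each run of `rate` consecutive vectors.

    Mean pooling is an assumed stand-in; the abstract does not name
    the compression function DCT uses.
    """
    pooled = []
    for start in range(0, len(segment), rate):
        chunk = segment[start:start + rate]
        dim = len(chunk[0])
        pooled.append([sum(v[i] for v in chunk) / len(chunk) for i in range(dim)])
    return pooled


class SelectiveMemory:
    """Bounded memory that stores a segment only if a policy accepts it."""

    def __init__(self, keep_policy: Callable[[List[Vector]], bool],
                 rate: int = 2, capacity: int = 8) -> None:
        self.keep_policy = keep_policy  # hypothetical: any segment -> bool rule
        self.rate = rate                # assumed compression rate
        self.capacity = capacity        # assumed maximum number of stored slots
        self.slots: List[Vector] = []

    def update(self, segment: List[Vector]) -> bool:
        """Return True if the segment was compressed and stored, else False."""
        if not segment or not self.keep_policy(segment):
            return False  # discarded: memory is left unchanged
        self.slots.extend(mean_pool(segment, self.rate))
        overflow = len(self.slots) - self.capacity
        if overflow > 0:
            del self.slots[:overflow]  # evict the oldest slots first
        return True
```

A caller would pass each new segment's representations to `update`; a rejected segment costs no memory, while an accepted one occupies roughly `len(segment) / rate` slots. In the paper the decision is made by a policy inside the model, whereas here any boolean function can be plugged in for experimentation.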
