CopyNext: Explicit Span Copying and Alignment in Sequence to Sequence Models
This work addresses a bottleneck in seq2seq models for tasks that require explicit span copying, such as information extraction, although its contribution is incremental: it extends existing copy mechanisms.
The paper tackles two limitations of seq2seq models: they lack explicit alignments for copied tokens, and they cannot copy entire spans efficiently. The resulting model achieves near state-of-the-art accuracy on Nested Named Entity Recognition with an order of magnitude faster decoding speed.
Copy mechanisms are employed in sequence to sequence models (seq2seq) to generate reproductions of words from the input to the output. These frameworks, operating at the lexical type level, fail to provide an explicit alignment that records where each token was copied from. Further, they require contiguous token sequences from the input (spans) to be copied individually. We present a model with an explicit token-level copy operation and extend it to copying entire spans. Our model provides hard alignments between spans in the input and output, allowing for nontraditional applications of seq2seq, like information extraction. We demonstrate the approach on Nested Named Entity Recognition, achieving near state-of-the-art accuracy with an order of magnitude increase in decoding speed.
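To make the idea of hard span alignments concrete, the following is a minimal sketch of how a decoded action sequence could be replayed against the input to recover both the output tokens and the aligned input spans. The action names (`GEN`, `COPY`, `NEXT`), the bracketed label tokens, and the `replay` function are assumptions introduced for illustration; they are not the paper's notation or implementation, and the sketch covers only the bookkeeping after decoding, not the neural model itself.

```python
from typing import List, Tuple

# Hypothetical action encoding (an assumption, not the paper's format):
#   ("GEN", token) -> emit a vocabulary token, e.g. an entity label
#   ("COPY", i)    -> copy input token i and start a new span at i
#   ("NEXT",)      -> copy the token right after the last copied one,
#                     extending the current span by one token
Action = Tuple


def replay(source: List[str], actions: List[Action]):
    """Replay actions; return (output tokens, aligned input spans).

    Spans are (start, end) input indices, inclusive on both ends.
    """
    output: List[str] = []
    spans: List[List[int]] = []
    last = None  # index of the last copied input token, if the span is open

    for action in actions:
        kind = action[0]
        if kind == "GEN":
            output.append(action[1])
            last = None  # a generated token closes any open span
        elif kind == "COPY":
            i = action[1]
            if not 0 <= i < len(source):
                raise ValueError(f"copy index {i} is out of range")
            output.append(source[i])
            spans.append([i, i])
            last = i
        elif kind == "NEXT":
            if last is None or last + 1 >= len(source):
                raise ValueError("NEXT needs an open span with a following token")
            last += 1
            output.append(source[last])
            spans[-1][1] = last  # extend the current span
        else:
            raise ValueError(f"unknown action: {kind}")

    return output, [tuple(span) for span in spans]


source = "The University of Washington is in Seattle".split()
actions = [
    ("GEN", "[ORG"), ("COPY", 1), ("NEXT",), ("NEXT",), ("GEN", "]"),
    ("GEN", "[LOC"), ("COPY", 6), ("GEN", "]"),
]
output, spans = replay(source, actions)
print(output)
print(spans)
```

Because every copied token is tied to an input index, each output span maps back to exact input positions, which is what an extraction task such as nested NER needs. In this sketch, a three-token span costs one `COPY` plus two `NEXT` actions rather than three independent pointer choices.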