A fast and sound tagging method for discontinuous named-entity recognition
This paper addresses discontinuous named-entity recognition in the biomedical domain. The contribution is incremental: it builds on existing methods, and its improvements are mainly in simplicity and speed.
The authors tackle discontinuous named-entity recognition by introducing a novel tagging scheme based on the inner structure of mentions, using a weighted finite-state automaton for inference. They report results comparable to the state of the art on three English biomedical datasets, with a simpler and faster model.
We introduce a novel tagging scheme for discontinuous named-entity recognition based on an explicit description of the inner structure of discontinuous mentions. We rely on a weighted finite-state automaton for both marginal and maximum a posteriori inference. As such, our method is sound in the sense that (1) well-formedness of predicted tag sequences is ensured via the automaton structure and (2) there is an unambiguous mapping between well-formed sequences of tags and (discontinuous) mentions. We evaluate our approach on three English datasets in the biomedical domain, and report results comparable to the state of the art with a much simpler and faster model.
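To make the role of the automaton concrete, here is a minimal sketch of constrained inference over a tag automaton. It is not the authors' tagging scheme: the three-tag BIO set, the `ALLOWED` transition table, the `START` set, and the function names `viterbi` and `log_partition` are all assumptions chosen for illustration, and a toy scheme like this cannot represent discontinuous mentions. The sketch only shows the general idea that forbidding transitions in the automaton guarantees well-formed output, and that the same structure supports both maximum a posteriori inference (Viterbi) and marginal inference (the forward algorithm).

```python
import math

# Toy tag set (an assumption for illustration, NOT the paper's scheme).
TAGS = ["O", "B", "I"]
# Transitions allowed by the toy automaton: "I" may only follow "B" or "I".
ALLOWED = {
    ("O", "O"), ("O", "B"),
    ("B", "O"), ("B", "B"), ("B", "I"),
    ("I", "O"), ("I", "B"), ("I", "I"),
}
START = {"O", "B"}  # a well-formed sequence may not start with "I"


def logsumexp(xs):
    """Numerically stable log(sum(exp(x))) that tolerates -inf entries."""
    m = max(xs)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(x - m) for x in xs))


def viterbi(scores):
    """MAP inference: highest-scoring tag path accepted by the automaton.

    scores is a list (one entry per token) of dicts mapping tag -> log-weight.
    """
    best = [{t: (scores[0][t] if t in START else -math.inf) for t in TAGS}]
    back = []
    for i in range(1, len(scores)):
        col, ptr = {}, {}
        for t in TAGS:
            # Only predecessors with an allowed transition into t are considered.
            s, p = max((best[-1][p], p) for p in TAGS if (p, t) in ALLOWED)
            col[t], ptr[t] = s + scores[i][t], p
        best.append(col)
        back.append(ptr)
    path = [max(TAGS, key=lambda t: best[-1][t])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]


def log_partition(scores):
    """Forward algorithm: log of the total weight of all well-formed paths."""
    alpha = {t: (scores[0][t] if t in START else -math.inf) for t in TAGS}
    for i in range(1, len(scores)):
        alpha = {
            t: scores[i][t]
            + logsumexp([alpha[p] for p in TAGS if (p, t) in ALLOWED])
            for t in TAGS
        }
    return logsumexp(list(alpha.values()))


# Hypothetical per-token log-weights. The unconstrained argmax would be
# O, I, O, which is ill-formed because "I" follows "O".
scores = [
    {"O": 1.0, "B": 0.0, "I": 0.5},
    {"O": 0.0, "B": 0.5, "I": 2.0},
    {"O": 1.0, "B": 0.0, "I": 0.0},
]
print(viterbi(scores))  # ['B', 'I', 'O']
```

In this sketch the ill-formed path is never scored at all, rather than being filtered out after decoding. Subtracting `log_partition(scores)` from a path's total log-weight gives its log-probability under the distribution restricted to well-formed sequences, which is the quantity a marginal-likelihood training objective would need.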