Structured Multi-Label Biomedical Text Tagging via Attentive Neural Tree Decoding
This work addresses the challenge of efficient and accurate biomedical text annotation for researchers and practitioners, and represents an incremental improvement over existing methods.
The paper tackles the problem of tagging biomedical texts with multiple terms from a tree-structured ontology, proposing a model that outperforms state-of-the-art approaches on automatically assigning MeSH terms to abstracts.
We propose a model for tagging unstructured texts with an arbitrary number of terms drawn from a tree-structured vocabulary (i.e., an ontology). We treat this as a special case of sequence-to-sequence learning in which the decoder begins at the root node of an ontological tree and recursively elects to expand child nodes as a function of the input text, the current node, and the latent decoder state. In our experiments the proposed method outperforms state-of-the-art approaches on the important task of automatically assigning MeSH terms to biomedical abstracts.
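The top-down decoding procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Node` class, the `decode` function, the `threshold` parameter, the toy ontology, and the keyword-based `toy_score` function are all hypothetical stand-ins, and the learned attentive decoder is replaced by simple string matching. The sketch also assumes that every expanded node is emitted as a tag, which the abstract does not specify.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator, List, Tuple

@dataclass
class Node:
    """One term in a tree-structured vocabulary (hypothetical structure)."""
    name: str
    children: List["Node"] = field(default_factory=list)

# Decoder state is modeled here as the path of names expanded so far.
# In a neural model this would be a latent vector; that is an assumption.
State = Tuple[str, ...]
Scorer = Callable[[str, Node, Node, State], Tuple[float, State]]

def decode(text: str, root: Node, score: Scorer, threshold: float = 0.5) -> List[str]:
    """Start at the root and recursively expand children whose score
    (a function of the text, the current node, and the state) passes
    the threshold. Returns the names of all expanded nodes."""
    selected: List[str] = []

    def expand(node: Node, state: State) -> None:
        for child in node.children:
            prob, child_state = score(text, node, child, state)
            if prob >= threshold:
                selected.append(child.name)
                expand(child, child_state)

    expand(root, ())
    return selected

def subtree_names(node: Node) -> Iterator[str]:
    """Yield the name of a node and of every descendant."""
    yield node.name
    for child in node.children:
        yield from subtree_names(child)

def toy_score(text: str, parent: Node, child: Node, state: State) -> Tuple[float, State]:
    """Stand-in for a learned scorer: expand a child if its name, or the
    name of any descendant, occurs in the text (case-insensitive)."""
    lowered = text.lower()
    hit = any(name.lower() in lowered for name in subtree_names(child))
    return (1.0 if hit else 0.0), state + (child.name,)

# Toy ontology for illustration only; not the actual MeSH hierarchy.
ROOT = Node("root", [
    Node("Diseases", [Node("Neoplasms"), Node("Infections")]),
    Node("Chemicals", [Node("Enzymes")]),
])

if __name__ == "__main__":
    print(decode("Enzymes implicated in neoplasms", ROOT, toy_score))
```

Because a branch is explored only when its parent has been expanded, the number of scoring calls depends on how many nodes are selected rather than on the full vocabulary size, which is one plausible motivation for decoding along the tree.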