DOC: Deep Open Classification of Text Documents
It addresses the challenge of identifying novel documents during text classification in dynamic open environments, where the closed-world assumption of traditional methods does not hold.
The paper tackles the problem of open-world text classification, where test documents may belong to classes not seen during training, and proposes a deep learning approach that dramatically outperforms existing state-of-the-art techniques.
Traditional supervised learning makes the closed-world assumption that the classes that appear in the test data must have appeared in training. This also applies to text learning or text classification. As learning is increasingly used in dynamic open environments where some new or test documents may not belong to any of the training classes, identifying these novel documents during classification is an important problem. This problem is called open-world classification or open classification. This paper proposes a novel deep-learning-based approach. It dramatically outperforms existing state-of-the-art techniques.
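The abstract does not say how novel documents are detected, so the following is only a minimal sketch of the open classification setting, not the paper's method. It assumes a classifier that has already produced an independent score in [0, 1] for each training class, and it rejects a document as novel when no score reaches a cutoff. The function name `open_classify`, the `scores` mapping, and the default threshold of 0.5 are all hypothetical choices made for illustration.

```python
from typing import Dict, Optional


def open_classify(scores: Dict[str, float], threshold: float = 0.5) -> Optional[str]:
    """Return the best-scoring seen class, or None if the document looks novel.

    `scores` maps each training class to a confidence in [0, 1].
    Both the score source and the threshold are assumptions of this sketch.
    """
    if not scores:
        # No seen classes to compare against: treat the document as novel.
        return None
    best_class = max(scores, key=lambda name: scores[name])
    # Closed-world classification would always return best_class.
    # Open classification adds a rejection option when confidence is low.
    if scores[best_class] >= threshold:
        return best_class
    return None


# A document that matches a seen class, and one that matches none of them.
seen = open_classify({"sports": 0.91, "politics": 0.12})
novel = open_classify({"sports": 0.20, "politics": 0.12})
```

Here `seen` is assigned to a training class while `novel` is rejected, which is the behavior a closed-world classifier cannot express because it must always pick one of the training classes.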