Semantic classifier approach to document classification
This work addresses semantic discrepancies in document classification for text analysis applications, but it appears incremental because it builds on existing categorization and ensemble techniques.
The paper tackles the semantic gap between training and application sets in document classification by proposing a method that combines document categorization with a classifier or ensemble, demonstrating superiority over classical approaches including traditional classifier ensembles.
In this paper we propose a new document classification method that bridges discrepancies (the so-called semantic gap) between the training set and the application sets of textual data. We demonstrate its superiority over classical text classification approaches, including traditional classifier ensembles. The method combines a document categorization technique with a single classifier or a classifier ensemble (the SEMCOM algorithm: Committee with Semantic Categorizer).
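The abstract does not say how the categorizer and the classifiers are combined, so the following is only a minimal sketch of one possible reading: the semantic categorizer acts as an extra committee member alongside a trained classifier, and the committee decides by majority vote. It is not the SEMCOM algorithm itself. All names here (`CATEGORY_KEYWORDS`, `semantic_categorizer`, `make_token_classifier`, `committee_predict`) and the toy data are hypothetical and chosen for illustration.

```python
from collections import Counter

# Hypothetical keyword lists standing in for a semantic categorizer's
# background knowledge (assumption: not taken from the paper).
CATEGORY_KEYWORDS = {
    "sports": {"match", "team", "goal", "league"},
    "finance": {"stock", "market", "bank", "shares"},
}


def semantic_categorizer(text):
    """Return the category whose keywords overlap the text most, or None."""
    tokens = set(text.lower().split())
    scores = {cat: len(tokens & kws) for cat, kws in CATEGORY_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def make_token_classifier(training_docs):
    """Build a toy classifier that only knows tokens seen in training."""
    vocab = {}
    for text, label in training_docs:
        vocab.setdefault(label, set()).update(text.lower().split())

    def classify(text):
        tokens = set(text.lower().split())
        scores = {label: len(tokens & words) for label, words in vocab.items()}
        best = max(scores, key=scores.get)
        return best if scores[best] > 0 else None

    return classify


def committee_predict(text, members):
    """Majority vote over members; members returning None abstain."""
    votes = [vote for vote in (m(text) for m in members) if vote is not None]
    if not votes:
        return None
    return Counter(votes).most_common(1)[0][0]


# Toy training set; the application document below shares no tokens with it.
train = [
    ("the team won the match", "sports"),
    ("the bank sold shares", "finance"),
]
token_classifier = make_token_classifier(train)
members = [token_classifier, semantic_categorizer]

application_doc = "stock market rallied"
print(token_classifier(application_doc))            # None
print(committee_predict(application_doc, members))  # finance
```

In this toy setup the trained classifier abstains on the application document because none of its tokens occurred in training, while the categorizer still recognizes the topic. That is the kind of gap between training and application sets the abstract refers to. A real committee would use learned classifiers and more members, and would need a tie-breaking rule.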