Enriching BERT with Knowledge Graph Embeddings for Document Classification
This work addresses document classification for books, offering incremental improvements by integrating metadata with text representations.
The paper tackles book classification using cover blurbs and metadata by enhancing BERT with knowledge graph embeddings that encode author information, achieving an F1-score of 87.20 for coarse-grained classification (eight labels) and 64.70 for detailed classification (343 labels).
In this paper, we focus on the classification of books using short descriptive texts (cover blurbs) and additional metadata. Building upon BERT, a deep neural language model, we demonstrate how to combine text representations with metadata and knowledge graph embeddings, which encode author information. Compared to the standard BERT approach, we achieve considerably better results for the classification task. For a more coarse-grained classification using eight labels, we achieve an F1-score of 87.20, while a detailed classification using 343 labels yields an F1-score of 64.70. We make the source code and trained models of our experiments publicly available.
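The abstract does not say how the three inputs are combined, so the following is only a minimal sketch under one common assumption: the text representation, the metadata features, and the author embedding are concatenated into a single vector and passed to a linear output layer with one logit per label. The function names, the toy dimensions, and the random values are all hypothetical; only the label count of eight comes from the coarse-grained task described above. In a real system the text vector would come from BERT and the author vector from a pretrained knowledge graph embedding.

```python
import random

# Hypothetical toy sizes; real BERT and knowledge graph vectors are much larger.
TEXT_DIM, META_DIM, AUTHOR_DIM = 4, 2, 3
NUM_LABELS = 8  # coarse-grained task with eight labels


def concat_features(text_vec, metadata_vec, author_vec):
    """Join the three feature groups into one flat input vector (assumed fusion)."""
    return list(text_vec) + list(metadata_vec) + list(author_vec)


def linear_layer(features, weights, bias):
    """Compute one logit per label: dot(weights[k], features) + bias[k]."""
    return [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, bias)
    ]


def predict(logits):
    """Return the index of the highest-scoring label."""
    return max(range(len(logits)), key=logits.__getitem__)


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Stand-ins for a BERT text representation, metadata features,
# and a knowledge graph embedding of the author.
text_vec = [rng.uniform(-1, 1) for _ in range(TEXT_DIM)]
metadata_vec = [rng.uniform(-1, 1) for _ in range(META_DIM)]
author_vec = [rng.uniform(-1, 1) for _ in range(AUTHOR_DIM)]

features = concat_features(text_vec, metadata_vec, author_vec)

# Untrained, randomly initialised classifier parameters (illustration only).
weights = [[rng.uniform(-0.1, 0.1) for _ in features] for _ in range(NUM_LABELS)]
bias = [0.0] * NUM_LABELS

logits = linear_layer(features, weights, bias)
label = predict(logits)
```

Because the weights here are random, the predicted label carries no meaning; the sketch only shows how the three sources could feed a single classifier.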