CL AI May 12, 2025

Categorical Classification of Book Summaries Using Word Embedding Techniques

arXiv:2507.21058v1 h-index: 5

Originality: Synthesis-oriented

AI Analysis

This paper addresses text classification for Turkish book summaries, but its contribution is incremental: it applies standard NLP methods to a language-specific dataset.

The study tackled book summary classification by comparing word embedding techniques (TF-IDF, Word2Vec, one-hot encoding) with machine learning models, finding that Support Vector Machine, Naive Bayes, and Logistic Regression combined with TF-IDF and one-hot encoding performed best for Turkish texts.

In this study, book summaries and categories taken from book sites were classified using word embedding methods, natural language processing techniques, and machine learning algorithms. Three frequently used word embedding methods, one-hot encoding, Word2Vec, and Term Frequency-Inverse Document Frequency (TF-IDF), were applied and their success was compared. The combinations of pre-processing methods used are also presented in a table. The results showed that the Support Vector Machine, Naive Bayes, and Logistic Regression models, together with the TF-IDF and one-hot encoding techniques, gave more successful results for Turkish texts.
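To make the comparison concrete, the following is a minimal sketch of two of the representations named in the abstract (one-hot encoding and TF-IDF) feeding a Naive Bayes classifier. It is not the authors' pipeline. The toy English summaries, the function names, the smoothed IDF formula, the reading of one-hot encoding as a binary term-presence vector, and the Laplace smoothing value are all assumptions made for illustration. Word2Vec, Support Vector Machine, Logistic Regression, and the Turkish pre-processing steps are omitted.

```python
import math
from collections import Counter

# Toy English stand-ins (assumption); the study used Turkish book summaries.
TRAIN = [
    ("a detective investigates a murder in the city", "crime"),
    ("the detective follows a murder suspect", "crime"),
    ("a wizard casts a spell on the dragon", "fantasy"),
    ("the dragon guards a magic kingdom", "fantasy"),
]

def tokenize(text):
    # Whitespace tokenization only; real pre-processing would do more.
    return text.lower().split()

def build_vocab(docs):
    return sorted({tok for doc in docs for tok in tokenize(doc)})

def one_hot(doc, vocab):
    # Binary presence vector: 1.0 if the term occurs in the document.
    present = set(tokenize(doc))
    return [1.0 if term in present else 0.0 for term in vocab]

def idf_weights(docs, vocab):
    # Smoothed IDF (assumed variant): ln((1 + n) / (1 + df)) + 1.
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(tokenize(doc)))
    return {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in vocab}

def tfidf(doc, vocab, idf):
    # Raw term count times IDF, then L2-normalized.
    counts = Counter(tokenize(doc))
    vec = [counts[t] * idf[t] for t in vocab]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec

def train_nb(vectors, labels, alpha=1.0):
    # Multinomial Naive Bayes over non-negative feature weights,
    # with Laplace smoothing alpha added to every feature total.
    model = {}
    for c in sorted(set(labels)):
        rows = [v for v, y in zip(vectors, labels) if y == c]
        totals = [sum(col) + alpha for col in zip(*rows)]
        denom = sum(totals)
        log_prior = math.log(len(rows) / len(labels))
        model[c] = (log_prior, [math.log(t / denom) for t in totals])
    return model

def predict(model, vec):
    def score(c):
        log_prior, log_like = model[c]
        return log_prior + sum(x * w for x, w in zip(vec, log_like))
    return max(model, key=score)

docs = [d for d, _ in TRAIN]
labels = [y for _, y in TRAIN]
vocab = build_vocab(docs)
idf = idf_weights(docs, vocab)

# Train one classifier per representation so they can be compared.
nb_onehot = train_nb([one_hot(d, vocab) for d in docs], labels)
nb_tfidf = train_nb([tfidf(d, vocab, idf) for d in docs], labels)

query = "the wizard fights a dragon"  # "fights" is out of vocabulary
pred_onehot = predict(nb_onehot, one_hot(query, vocab))
pred_tfidf = predict(nb_tfidf, tfidf(query, vocab, idf))
print(pred_onehot, pred_tfidf)
```

On data this small both representations agree. A meaningful comparison of the kind the study reports would need a held-out test set and per-model accuracy figures.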
