Text Classification Algorithms: A Survey
This survey offers a comprehensive overview for researchers and practitioners who face challenges in selecting and applying text classification techniques, but its contribution is incremental: it summarizes existing work without introducing new methods.
This paper provides a survey of text classification algorithms, discussing various feature extraction, dimensionality reduction, and evaluation methods, along with their limitations and real-world applications.
In recent years, the number of complex documents and texts has grown exponentially, and classifying them accurately in many applications requires a deeper understanding of machine learning methods. Many machine learning approaches have achieved strong results in natural language processing. The success of these learning algorithms relies on their capacity to capture complex patterns and non-linear relationships within data. However, finding suitable structures, architectures, and techniques for text classification remains a challenge for researchers. This paper gives a brief overview of text classification algorithms. The overview covers text feature extraction, dimensionality reduction methods, existing algorithms and techniques, and evaluation methods. Finally, the limitations of each technique and their application to real-world problems are discussed.
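The four stages named above (feature extraction, dimensionality reduction, classification, evaluation) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not taken from the paper: the toy corpus is invented, TF-IDF stands in for feature extraction, keeping the most frequent terms by document frequency is a simple feature-selection stand-in for dimensionality reduction, a nearest-centroid rule with cosine similarity stands in for the classifier, and accuracy is the only metric shown.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Whitespace tokenization on lowercased text (a deliberately crude choice).
    return text.lower().split()


def fit_idf(docs, max_terms):
    # Dimensionality reduction by feature selection: keep only the
    # max_terms terms with the highest document frequency.
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    kept = sorted(df, key=lambda t: (-df[t], t))[:max_terms]
    n = len(docs)
    # Smoothed inverse document frequency for each kept term.
    return {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in kept}


def tfidf_vector(doc, idf):
    # Feature extraction: sparse TF-IDF vector over the kept vocabulary.
    counts = Counter(t for t in tokenize(doc) if t in idf)
    return {t: c * idf[t] for t, c in counts.items()}


def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def fit_centroids(vectors, labels):
    # Classifier training: average the TF-IDF vectors of each class.
    sums = defaultdict(Counter)
    counts = Counter(labels)
    for vec, label in zip(vectors, labels):
        for term, weight in vec.items():
            sums[label][term] += weight
    return {
        label: {t: w / counts[label] for t, w in total.items()}
        for label, total in sums.items()
    }


def predict(vec, centroids):
    # Assign the class whose centroid is most similar to the document.
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


# Invented toy corpus, for illustration only.
train_docs = [
    "the team won the football match",
    "a late goal decided the match",
    "the striker scored a goal for the team",
    "the new phone has a fast processor",
    "the laptop processor and battery are fast",
    "software update improves phone battery",
]
train_labels = ["sports"] * 3 + ["tech"] * 3
test_docs = ["the team scored a late goal", "fast phone with a new battery"]
test_labels = ["sports", "tech"]

idf = fit_idf(train_docs, max_terms=12)
centroids = fit_centroids([tfidf_vector(d, idf) for d in train_docs], train_labels)
predictions = [predict(tfidf_vector(d, idf), centroids) for d in test_docs]

# Evaluation: fraction of held-out documents classified correctly.
accuracy = sum(p == y for p, y in zip(predictions, test_labels)) / len(test_labels)
print(predictions, accuracy)
```

Each stage is a separate function, so any one of them can be swapped for another technique of the same kind without touching the rest; a two-document test set is far too small to say anything about real performance.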