Does a Hybrid Neural Network based Feature Selection Model Improve Text Classification?
This is an incremental improvement for text classification tasks in natural language processing.
The paper tackles the problem of redundant features in text classification by proposing a hybrid feature selection model combining filter-based methods with a fastText classifier, resulting in reduced training time and a slight accuracy increase on some datasets.
Text classification is a fundamental problem in natural language processing. A classifier should give the most weight to the features that are relevant to classifying the text, but text can also contain redundant or highly correlated features, and these increase the complexity of the classification algorithm. Many dimensionality reduction methods have therefore been proposed for use with traditional machine learning classifiers, and the combination has achieved good results. In this paper, we propose a hybrid feature selection method that obtains relevant features by combining various filter-based feature selection methods with the fastText classifier. We then present three ways of implementing a pipeline that joins feature selection with a neural network. We observed a reduction in training time when feature selection methods were used along with neural networks, and a slight increase in accuracy on some datasets.
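To make the idea of a filter-based selection step concrete, here is a minimal sketch in plain Python. It is not the paper's method: the choice of chi-square as the filter, the function names (`chi2_scores`, `select_top_k`, `filter_docs`), the toy corpus, and the value of `k` are all assumptions made for illustration. The sketch scores each term by its chi-square statistic against the class labels, keeps the top-scoring terms, and strips everything else from the documents before they would be handed to a classifier.

```python
from collections import Counter


def chi2_scores(docs, labels):
    """Score each term by its best chi-square statistic over all classes.

    Uses document-level presence/absence of a term (a 2x2 contingency
    table per term and class). This is one common filter method, chosen
    here only as an illustrative assumption.
    """
    n = len(docs)
    class_counts = Counter(labels)
    term_docs = Counter()        # term -> number of documents containing it
    term_class_docs = Counter()  # (term, label) -> documents in that class containing it
    for tokens, label in zip(docs, labels):
        for term in set(tokens):
            term_docs[term] += 1
            term_class_docs[(term, label)] += 1

    scores = {}
    for term, df in term_docs.items():
        best = 0.0
        for label, n_c in class_counts.items():
            a = term_class_docs[(term, label)]  # term present, in class
            b = df - a                          # term present, other classes
            c = n_c - a                         # term absent, in class
            d = n - df - c                      # term absent, other classes
            denom = (a + b) * (c + d) * (a + c) * (b + d)
            if denom:
                best = max(best, n * (a * d - b * c) ** 2 / denom)
        scores[term] = best
    return scores


def select_top_k(scores, k):
    """Keep the k highest-scoring terms; ties are broken alphabetically."""
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return set(ranked[:k])


def filter_docs(docs, keep):
    """Drop every token that is not in the selected feature set."""
    return [[t for t in tokens if t in keep] for tokens in docs]


# Toy corpus (hypothetical data, for illustration only).
docs = [
    "the match was a great win".split(),
    "the team lost the match".split(),
    "the stock market fell".split(),
    "the market saw a great rally".split(),
]
labels = ["sports", "sports", "finance", "finance"]

scores = chi2_scores(docs, labels)
keep = select_top_k(scores, k=2)
reduced = filter_docs(docs, keep)
```

In this toy run, terms that appear equally often in both classes (such as "the") score zero and are dropped, while class-specific terms survive. The reduced documents could then be written out one per line with a `__label__` prefix, the input format fastText expects for supervised training. Shorter inputs are one plausible source of the training-time reduction the abstract reports, though the sketch makes no claim about how the paper measures it.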