DTization: A New Method for Supervised Feature Scaling
The work addresses a data pre-processing bottleneck for machine learning practitioners, but its contribution is incremental because it builds on existing scaling techniques.
The paper tackles a limitation of traditional feature scaling, namely that it is unsupervised, by introducing DTization, a supervised method that combines a decision tree with the robust scaler and is reported to give noteworthy performance improvements on ten classification and regression datasets.
Artificial intelligence is currently a dominant force shaping many aspects of the world, and machine learning is one of its sub-fields. Feature scaling is a data pre-processing technique that can improve the performance of machine learning algorithms. Traditional feature scaling techniques are unsupervised: the dependent variable has no influence on the scaling process. In this paper, we present a novel feature scaling technique named DTization that employs a decision tree and the robust scaler for supervised feature scaling. The proposed method uses a decision tree to measure feature importance and, based on that importance, scales different features differently with the robust scaler algorithm. The proposed method has been extensively evaluated on ten classification and regression datasets using various evaluation metrics, and the results show a noteworthy performance improvement compared to traditional feature scaling methods.
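
The abstract does not say how feature importance changes the scaling, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It rests on two assumptions. First, importance is approximated by the largest Gini decrease a single threshold split on each feature can achieve (a depth-one stand-in for a full decision tree). Second, each robust-scaled feature is multiplied by its normalized importance. The names `stump_importance`, `robust_scale`, and `importance_weighted_scale` are hypothetical and do not come from the paper.

```python
from statistics import median, quantiles


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def stump_importance(column, labels):
    """Largest Gini decrease from one threshold split on this feature.

    Assumption: a depth-one stand-in for decision-tree feature importance.
    """
    n = len(labels)
    parent = gini(labels)
    best = 0.0
    for t in sorted(set(column))[:-1]:
        left = [y for x, y in zip(column, labels) if x <= t]
        right = [y for x, y in zip(column, labels) if x > t]
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - child)
    return best


def robust_scale(column):
    """Robust scaling: subtract the median, divide by the interquartile range."""
    q1, _, q3 = quantiles(column, n=4, method="inclusive")
    iqr = (q3 - q1) or 1.0  # guard against constant columns
    med = median(column)
    return [(x - med) / iqr for x in column]


def importance_weighted_scale(columns, labels):
    """Robust-scale each feature, then weight it by its normalized importance.

    Assumption: the multiplicative weighting is illustrative only.
    """
    raw = [stump_importance(col, labels) for col in columns]
    total = sum(raw) or 1.0
    weights = [r / total for r in raw]
    scaled = [[w * v for v in robust_scale(col)]
              for col, w in zip(columns, weights)]
    return scaled, weights


# Toy data: the first feature separates the classes, the second does not.
informative = [1.0, 2.0, 3.0, 4.0, 10.0, 11.0, 12.0, 13.0]
noise = [5.0, 1.0, 4.0, 2.0, 5.0, 1.0, 4.0, 2.0]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

scaled, weights = importance_weighted_scale([informative, noise], labels)
print(weights)
```

On this toy data the informative feature receives the larger weight, so it keeps a wider range after scaling than the uninformative one. In practice, importances from a fitted tree (for example scikit-learn's `feature_importances_` attribute) could replace the stump-based estimate.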