Building Diversified Multiple Trees for Classification in High Dimensional Noisy Biomedical Data
This work addresses the robustness of biomedical classifiers to noisy data, but its contribution is incremental because it builds on existing ensemble methods.
The paper tackles the problem of classification in high-dimensional noisy biomedical data by proposing the Diversified Multiple Tree (DMT) ensemble classifier, which is shown to be significantly more accurate than benchmark methods on noisy test data across three real-world datasets.
A trained classification model is commonly applied to operating data that deviates from the training data because of noise. This paper demonstrates that an ensemble classifier, Diversified Multiple Tree (DMT), is more robust in classifying noisy data than other widely used ensemble methods. DMT is tested on three real-world biomedical data sets from different laboratories and compared with four benchmark ensemble classifiers. Experimental results show that DMT is significantly more accurate than the benchmark ensemble classifiers on noisy test data. We also discuss a limitation of DMT and its possible variations.
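The evaluation setting described above, training on clean data and testing on data perturbed by noise, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the synthetic two-class data, the additive Gaussian noise model, the noise levels, and the nearest-centroid classifier used as a stand-in. It does not implement DMT or any of the benchmark ensembles, and the abstract does not specify the noise model used in the paper.

```python
import random

def fit_centroids(X, y):
    """Stand-in classifier (not DMT): compute one mean vector per class."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

def predict(centroids, row):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def sqdist(center):
        return sum((a - b) ** 2 for a, b in zip(row, center))
    return min(centroids, key=lambda lab: sqdist(centroids[lab]))

def add_gaussian_noise(X, sigma, rng):
    """Hypothetical noise model: independent N(0, sigma^2) added to each feature."""
    return [[value + rng.gauss(0.0, sigma) for value in row] for row in X]

def accuracy(centroids, X, y):
    """Fraction of rows whose predicted label matches the true label."""
    hits = sum(predict(centroids, row) == label for row, label in zip(X, y))
    return hits / len(y)

def make_data(n, n_features, rng):
    """Synthetic two-class data: class 0 centered at -1, class 1 at +1."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        center = 1.0 if label else -1.0
        X.append([center + rng.gauss(0.0, 0.5) for _ in range(n_features)])
        y.append(label)
    return X, y

rng = random.Random(0)
X_train, y_train = make_data(100, 20, rng)
X_test, y_test = make_data(100, 20, rng)

# Train once on clean data, then score the same model on increasingly noisy test sets.
model = fit_centroids(X_train, y_train)
results = {}
for sigma in (0.0, 1.0, 3.0):
    noisy_test = add_gaussian_noise(X_test, sigma, rng)
    results[sigma] = accuracy(model, noisy_test, y_test)
    print(f"sigma={sigma}: accuracy={results[sigma]:.3f}")
```

To compare classifiers in this setting, one would swap the stand-in for each candidate model while keeping the training data and the noisy test sets fixed, so that differences in accuracy reflect the models rather than the noise draw.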