LG Jan 29, 2015

Efficient Divide-And-Conquer Classification Based on Feature-Space Decomposition

Qi Guo, Bo-Wei Chen, Feng Jiang, Xiangyang Ji, Sun-Yuan Kung

arXiv:1501.07584v12.110 citations

Originality: Incremental advance

AI Analysis

This work addresses overfitting in large-scale classification for machine learning practitioners, presenting an incremental improvement over existing methods.

The paper tackles overfitting in large-scale classification by proposing a divide-and-conquer approach based on feature-space decomposition, reducing error rates by 10.53% and 7.53% on the RCV1 and covtype datasets, respectively, compared with state-of-the-art fast SVM solvers.

This study presents a divide-and-conquer (DC) approach to classification based on feature-space decomposition. For large-scale datasets, typical approaches employ either truncated kernel methods on the feature space or DC approaches on the sample space. However, these approaches do not guarantee separability between classes, owing to overfitting. To overcome this problem, this work proposes a novel DC approach on feature spaces consisting of three steps. First, we divide the feature space into several subspaces using the decomposition method proposed in this paper. Next, these feature subspaces are passed to individual local classifiers for training. Finally, the outputs of the local classifiers are fused to generate the final classification results. Experiments on large-scale datasets are carried out for performance evaluation. The results show that the proposed DC method achieves lower error rates than state-of-the-art fast SVM solvers, e.g., reducing error rates by 10.53% and 7.53% on the RCV1 and covtype datasets, respectively.
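The three steps described in the abstract (split the features, train one local classifier per subspace, fuse the outputs) can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not specify the decomposition rule, the local classifier, or the fusion rule, so this sketch assumes a random disjoint partition of feature indices, a toy nearest-centroid classifier, and fusion by summing local scores. All function and class names here are hypothetical, and labels are assumed to be -1 and +1 with both classes present.

```python
import random


def split_features(n_features, n_subspaces, seed=0):
    """Step 1 (assumed rule): randomly partition feature indices into disjoint groups."""
    idx = list(range(n_features))
    random.Random(seed).shuffle(idx)
    return [sorted(idx[i::n_subspaces]) for i in range(n_subspaces)]


class CentroidClassifier:
    """Toy binary nearest-centroid model standing in for a local classifier."""

    def fit(self, X, y):
        self.centroids = {}
        for label in (-1, 1):
            rows = [x for x, t in zip(X, y) if t == label]
            # Per-feature mean of all rows with this label.
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def score(self, x):
        # Squared distance to each centroid; positive score means closer to +1.
        dist = {
            label: sum((a - b) ** 2 for a, b in zip(x, c))
            for label, c in self.centroids.items()
        }
        return dist[-1] - dist[1]


def fit_dc(X, y, n_subspaces, seed=0):
    """Step 2: train one local classifier per feature subspace."""
    models = []
    for group in split_features(len(X[0]), n_subspaces, seed):
        X_sub = [[row[j] for j in group] for row in X]
        models.append((group, CentroidClassifier().fit(X_sub, y)))
    return models


def predict_dc(models, x):
    """Step 3 (assumed rule): fuse local outputs by summing their scores."""
    total = sum(model.score([x[j] for j in group]) for group, model in models)
    return 1 if total >= 0 else -1


# Tiny synthetic example with four features split into two subspaces.
X = [[0, 0, 0, 0], [0, 1, 0, 1], [1, 0, 1, 0],
     [5, 5, 5, 5], [5, 6, 5, 6], [6, 5, 6, 5]]
y = [-1, -1, -1, 1, 1, 1]
models = fit_dc(X, y, n_subspaces=2)
print(predict_dc(models, [0.5, 0.5, 0.5, 0.5]))
print(predict_dc(models, [5.5, 5.5, 5.5, 5.5]))
```

Because each local model sees only its own subset of features, the local training steps are independent and could run in parallel. In practice the toy centroid model would be replaced by whatever local classifier and decomposition the paper actually uses.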
