LG | AI | Sep 16, 2023

Improve Deep Forest with Learnable Layerwise Augmentation Policy Schedule

arXiv:2309.09030v1 | 3 citations | h-index: 9
Originality: Incremental advance
AI Analysis

This work addresses overfitting in Deep Forest models for tabular data, offering an incremental improvement with practical gains.

The paper tackles overfitting in Deep Forest models by introducing learnable layerwise data augmentation policies, achieving new state-of-the-art benchmarks in tabular classification tasks and outperforming various competitors.

As a modern ensemble technique, Deep Forest (DF) employs a cascading structure to construct deep models, providing stronger representational power compared to traditional decision forests. However, its greedy multi-layer learning procedure is prone to overfitting, limiting model effectiveness and generalizability. This paper presents an optimized Deep Forest, featuring learnable, layerwise data augmentation policy schedules. Specifically, we introduce the Cut Mix for Tabular data (CMT) augmentation technique to mitigate overfitting and develop a population-based search algorithm to tailor augmentation intensity for each layer. Additionally, we propose to incorporate outputs from intermediate layers into a checkpoint ensemble for more stable performance. Experimental results show that our method sets new state-of-the-art (SOTA) benchmarks in various tabular classification tasks, outperforming shallow tree ensembles, deep forests, deep neural networks, and AutoML competitors. The learned policies also transfer effectively to Deep Forest variants, underscoring the method's potential for enhancing non-differentiable deep learning modules in tabular signal processing.
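The abstract does not spell out how CMT or the checkpoint ensemble are computed, so the following is only a rough sketch under stated assumptions, not the authors' implementation. It assumes CMT works by analogy with image CutMix: each row has a random subset of its feature columns replaced by values from a randomly paired row, and the labels are mixed in proportion to the features kept. It also assumes the checkpoint ensemble is a plain average of the class probabilities produced by each cascade layer. The names `cut_mix_tabular`, `checkpoint_ensemble`, and `cut_ratio` are hypothetical.

```python
import numpy as np


def cut_mix_tabular(X, y_onehot, cut_ratio, rng):
    """Hypothetical CutMix-style augmentation for tabular data.

    Assumption: a fraction `cut_ratio` of each row's feature columns is
    replaced by the same columns of a randomly paired row, and the one-hot
    labels are mixed in proportion to the features kept.
    """
    n_samples, n_features = X.shape
    partner = rng.permutation(n_samples)          # random pairing of rows
    n_cut = int(round(cut_ratio * n_features))    # columns to swap per row

    X_mixed = X.copy()
    keep_frac = 1.0
    if n_cut > 0:
        # Pick n_cut distinct columns independently for every row.
        cols = np.argsort(rng.random((n_samples, n_features)), axis=1)[:, :n_cut]
        rows = np.arange(n_samples)[:, None]
        X_mixed[rows, cols] = X[partner][rows, cols]
        keep_frac = 1.0 - n_cut / n_features

    # Soft labels: weight of the original row equals the share of kept features.
    y_mixed = keep_frac * y_onehot + (1.0 - keep_frac) * y_onehot[partner]
    return X_mixed, y_mixed


def checkpoint_ensemble(layer_probas):
    """Assumed checkpoint ensemble: average class probabilities over layers."""
    return np.mean(np.stack(layer_probas, axis=0), axis=0)


# Illustrative usage: a per-layer schedule of augmentation intensities,
# which a search procedure (not shown) would tune layer by layer.
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 5))
y = np.eye(2)[rng.integers(0, 2, size=8)]
schedule = [0.4, 0.2, 0.0]  # hypothetical cut_ratio for layers 1..3
augmented = [cut_mix_tabular(X, y, ratio, rng) for ratio in schedule]
```

In this sketch the per-layer `cut_ratio` plays the role of the augmentation intensity that the paper's population-based search would tune; the search itself is omitted because the abstract gives no details about it.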

Code Implementations: 1 repo
