LG AI ML | May 25, 2022

Rethinking Fano's Inequality in Ensemble Learning

Terufumi Morishita, Gaku Morio, Shota Horiguchi, Hiroaki Ozaki, Nobuo Nukaga

arXiv:2205.12683v23.34 citations | h-index: 23 | Has Code

Originality: Incremental advance

AI Analysis

This work provides a refined theoretical framework for ensemble learning and offers insights for system design. It is an incremental advance, since it builds on prior studies that applied a variant of Fano's inequality.

The authors ask what makes an ensemble learning system effective. They revisit the original Fano's inequality to account for the information lost when model predictions are combined into a final prediction, and they validate the resulting theory empirically, showing where individual systems are strong or weak.

We propose a fundamental theory on ensemble learning that answers the central question: what factors make an ensemble system good or bad? Previous studies used a variant of Fano's inequality of information theory and derived a lower bound of the classification error rate on the basis of the $\textit{accuracy}$ and $\textit{diversity}$ of models. We revisit the original Fano's inequality and argue that the studies did not take into account the information lost when multiple model predictions are combined into a final prediction. To address this issue, we generalize the previous theory to incorporate the information loss, which we name $\textit{combination loss}$. Further, we empirically validate and demonstrate the proposed theory through extensive experiments on actual systems. The theory reveals the strengths and weaknesses of systems on each metric, which will push the theoretical understanding of ensemble learning and give us insights into designing systems.
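To make the quantities in the abstract concrete, here is a minimal Python sketch under stated assumptions. It uses the standard weak form of Fano's inequality, $P_e \geq (H(Y \mid \hat{Y}) - 1) / \log_2 |\mathcal{Y}|$, and illustrates the information lost in combination as the gap $I(Y; O) - I(Y; \hat{Y})$ between the joint model outputs $O$ and the final prediction $\hat{Y}$. This gap is an illustration based on the data processing inequality and may differ from the paper's exact definition of combination loss. The function names, the synthetic data, and the majority-vote combiner are all hypothetical choices, not taken from the paper.

```python
import math
import random
from collections import Counter

def entropy(samples):
    """Empirical (plug-in) Shannon entropy in bits of a sequence of hashable symbols."""
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in Counter(samples).values())

def mutual_information(xs, ys):
    """Empirical I(X;Y) = H(X) + H(Y) - H(X,Y) in bits."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))

def fano_lower_bound(labels, preds, num_classes):
    """Weak Fano bound on the error rate: (H(Y|Yhat) - 1) / log2(|Y|), clipped at 0."""
    h_cond = entropy(labels) - mutual_information(labels, preds)
    return max(0.0, (h_cond - 1.0) / math.log2(num_classes))

def majority_vote(outputs):
    """Hypothetical combiner: most frequent label among the model outputs."""
    return Counter(outputs).most_common(1)[0][0]

# Synthetic data (an assumption for illustration): 4 classes, 3 noisy models.
rng = random.Random(0)
num_classes = 4
labels = [rng.randrange(num_classes) for _ in range(2000)]

def noisy(y, acc):
    """Return the true label with probability acc, otherwise a uniformly random label."""
    return y if rng.random() < acc else rng.randrange(num_classes)

outputs = [tuple(noisy(y, 0.6) for _ in range(3)) for y in labels]
preds = [majority_vote(o) for o in outputs]

error = sum(p != y for p, y in zip(preds, labels)) / len(labels)
bound = fano_lower_bound(labels, preds, num_classes)

# Information about Y carried by all model outputs jointly vs. by the final prediction.
info_models = mutual_information(labels, outputs)
info_final = mutual_information(labels, preds)
combination_loss = info_models - info_final  # non-negative by data processing

print(f"error rate        = {error:.3f}")
print(f"Fano lower bound  = {bound:.3f}")
print(f"I(Y;O)            = {info_models:.3f} bits")
print(f"I(Y;Yhat)         = {info_final:.3f} bits")
print(f"information lost  = {combination_loss:.3f} bits")
```

Because the final prediction is a deterministic function of the model outputs, the gap is never negative. A bound computed from the model outputs alone can therefore be looser than one computed from the final prediction, which is the kind of discrepancy the abstract attributes to ignoring the combination step. The plug-in estimates here are for intuition only and can be biased when there are few samples per joint outcome.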
