AI LG Jun 15, 2017

Target Curricula via Selection of Minimum Feature Sets: a Case Study in Boolean Networks

arXiv:1706.04721v23.11 citations Has Code

Originality Incremental advance

AI Analysis

This work addresses the challenge of improving generalization when training Boolean networks for multi-label classification, particularly in domains such as circuit inference and gene regulatory networks. The contribution is incremental, since it builds on existing curriculum learning and hierarchical loss concepts.

The paper tackles the problem of training Boolean models on multi-label classification tasks by introducing a curriculum of targets based on hierarchical dependencies, resulting in up to 10% lower out-of-sample error in some cases. It also proposes a method to identify target order using intrinsic dimension and the Minimum-Feature-Set problem, achieving similar gains without prior knowledge of target difficulty.
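To make the idea of a hierarchical loss concrete, here is a minimal sketch in Python. It is not the loss defined in the paper: the gating rule (a target only counts as correct on an example if every earlier target in the curriculum is also correct on it) and the name `hierarchical_loss` are assumptions chosen for illustration.

```python
import numpy as np

def plain_loss(y_true, y_pred):
    """Mean bitwise error over all examples and targets (no curriculum)."""
    return float(np.mean(y_true != y_pred))

def hierarchical_loss(y_true, y_pred, order):
    """Illustrative gated loss (an assumption, not the paper's definition).

    `order` lists target columns from first (easiest) to last (hardest).
    Once an earlier target is wrong on an example, every later target
    is also counted as wrong on that example.
    """
    wrong = (y_true != y_pred)[:, order]            # columns in curriculum order
    gated = np.logical_or.accumulate(wrong, axis=1)  # propagate errors to later targets
    return float(gated.mean())

# Four examples, three targets.
y = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0], [0, 0, 0]])

early_error = y.copy()
early_error[0, 0] ^= 1   # one mistake on the first target in the order
late_error = y.copy()
late_error[0, 2] ^= 1    # one mistake on the last target in the order

print(plain_loss(y, early_error), plain_loss(y, late_error))
print(hierarchical_loss(y, early_error, [0, 1, 2]),
      hierarchical_loss(y, late_error, [0, 1, 2]))
```

Both predictions contain a single wrong bit, so the plain loss cannot tell them apart, while the gated loss penalises the mistake on the earlier target more heavily. A meta-heuristic optimiser that minimises such a loss is therefore pushed to fix earlier targets first.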

We consider the effect of introducing a curriculum of targets when training Boolean models on supervised Multi-Label Classification (MLC) problems. In particular, we consider how to order targets in the absence of prior knowledge, and how such a curriculum may be enforced when using meta-heuristics to train discrete non-linear models. We show that hierarchical dependencies between targets can be exploited by enforcing an appropriate curriculum using hierarchical loss functions. On several multi-output circuit-inference problems with known target difficulties, Feedforward Boolean Networks (FBNs) trained with such a loss function achieve significantly lower out-of-sample error, up to $10\%$ in some cases. This improvement increases as the loss places more emphasis on target order and is strongly correlated with an easy-to-hard curriculum. We also demonstrate the same improvements on three real-world models and two Gene Regulatory Network (GRN) inference problems. We posit a simple a-priori method for identifying an appropriate target order and estimating the strength of target relationships in Boolean MLCs. These methods use intrinsic dimension as a proxy for target difficulty, which is estimated using optimal solutions to a combinatorial optimisation problem known as the Minimum-Feature-Set (minFS) problem. We also demonstrate that the same generalisation gains can be achieved without providing any knowledge of target difficulty.
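The use of minFS as a difficulty proxy can be illustrated with a small Python sketch. A feature set for a target is a subset of input features such that no two examples with different labels agree on every feature in the subset; the minFS is a smallest such subset. The exhaustive search below is an assumption made for readability and is only practical for a handful of features; the abstract does not say which solver the authors use. The helper names (`is_feature_set`, `min_feature_set`, `curriculum_order`) are hypothetical.

```python
from itertools import combinations, product
import numpy as np

def is_feature_set(X, y, subset):
    """True if no two examples with different labels agree on all features in `subset`."""
    seen = {}
    for row, label in zip(X[:, list(subset)], y):
        key = tuple(int(v) for v in row)
        if seen.setdefault(key, int(label)) != int(label):
            return False
    return True

def min_feature_set(X, y):
    """Smallest feature subset that separates the labels (brute force, exponential time)."""
    n_features = X.shape[1]
    for k in range(n_features + 1):
        for subset in combinations(range(n_features), k):
            if is_feature_set(X, y, subset):
                return subset
    return None  # identical inputs carry different labels

def curriculum_order(X, Y):
    """Order targets easy-to-hard, using minFS size as a proxy for difficulty.

    Assumes every target is consistent with X, so a feature set exists.
    """
    sizes = [len(min_feature_set(X, Y[:, j])) for j in range(Y.shape[1])]
    order = sorted(range(Y.shape[1]), key=lambda j: sizes[j])
    return order, sizes

# Full truth table over three input bits, with three targets of differing difficulty.
X = np.array(list(product([0, 1], repeat=3)))
Y = np.column_stack([
    X.sum(axis=1) % 2,     # parity of all three bits
    X[:, 0],               # copy of the first bit
    X[:, 0] & X[:, 1],     # AND of the first two bits
])

order, sizes = curriculum_order(X, Y)
print(sizes)  # minFS size per target column
print(order)  # target columns from smallest to largest minFS
```

In this toy case the copy target needs one feature, the AND target two, and the parity target all three, so the resulting order places the copy first and the parity last. Such an order could then be passed to a hierarchical loss to enforce the curriculum without any prior knowledge of target difficulty.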
