Not All Lotteries Are Made Equal
This work addresses how efficiently practitioners can discover sparse networks, but its contribution is incremental, since it builds on the established Lottery Ticket Hypothesis.
The paper investigates the relationship between model size and the ease of finding sparse sub-networks under the Lottery Ticket Hypothesis, showing experimentally that smaller models benefit more from Ticket Search under a finite budget.
The Lottery Ticket Hypothesis (LTH) states that a reasonably sized neural network contains a sparse sub-network that, when trained from the same initialization, performs at least as well as its dense counterpart. This work investigates the relation between model size and the ease of finding these sparse sub-networks. We show through experiments that, surprisingly, under a finite budget, smaller models benefit more from Ticket Search (TS).
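To make the notion of searching for a ticket concrete, here is a minimal sketch of one common procedure: iterative magnitude pruning with a rewind to the initial weights. The text above does not specify how Ticket Search is implemented, so this is an assumption rather than the authors' method. The function names (`magnitude_mask`, `train`, `ticket_search`), the toy linear-regression "training" step, and all hyperparameters are hypothetical and chosen only for illustration.

```python
import numpy as np

def magnitude_mask(weights, mask, prune_fraction):
    """Return a new mask that drops the smallest-magnitude surviving weights."""
    surviving = np.flatnonzero(mask)
    n_prune = int(len(surviving) * prune_fraction)
    if n_prune == 0:
        return mask.copy()
    # Surviving indices ordered from smallest to largest absolute weight.
    order = surviving[np.argsort(np.abs(weights.ravel()[surviving]))]
    new_mask = mask.copy().ravel()
    new_mask[order[:n_prune]] = False
    return new_mask.reshape(mask.shape)

def train(w_init, mask, X, y, steps=200, lr=0.1):
    """Toy stand-in for training: masked linear regression by gradient descent."""
    w = w_init * mask  # pruned weights start at zero and stay there
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y)
        w -= lr * grad * mask  # only surviving weights are updated
    return w

def ticket_search(w_init, X, y, rounds=3, prune_fraction=0.2):
    """Assumed search loop: train, prune by magnitude, rewind to w_init, repeat."""
    mask = np.ones_like(w_init, dtype=bool)
    for _ in range(rounds):
        # Each round restarts from the original initialization (the rewind).
        w_trained = train(w_init, mask, X, y)
        mask = magnitude_mask(w_trained, mask, prune_fraction)
    return mask

# Usage on synthetic data (hypothetical sizes).
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 10))
y = X @ rng.normal(size=10)
ticket = ticket_search(rng.normal(size=10), X, y)
print("surviving weights:", int(ticket.sum()), "of", ticket.size)
```

In this sketch the search budget corresponds to `rounds` times the cost of one `train` call, which is one way to read the finite budget mentioned above; the paper may define the budget differently.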