Sublinear Partition Estimation
The work addresses an efficiency bottleneck in large-scale classification tasks such as visual object recognition and discriminative language modeling, but the contribution is incremental because it builds on existing approximation techniques.
The paper tackles the cost of computing the partition function in neural network classifiers, a cost that scales linearly with the number of categories. It proposes three sublinear estimation methods based on approximate nearest neighbor search and kernel feature maps, and compares their performance empirically.
The output scores of a neural network classifier are converted to probabilities by normalizing over the scores of all competing categories. Computing this partition function, $Z$, is therefore linear in the number of categories, which is problematic as the number of categories in real-world problems continues to grow, for example in visual object recognition or discriminative language modeling. We propose three approaches for sublinear estimation of the partition function, based on approximate nearest neighbor search and kernel feature maps, and compare the performance of the proposed approaches empirically.
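To make the linear cost concrete, the following Python sketch computes $Z$ exactly and then approximates it by summing the largest scores exactly and estimating the remainder from a uniform sample. It is an illustration under stated assumptions, not one of the paper's three methods: the function names, the parameters `k` and `m`, and the head-plus-sampled-tail estimator are hypothetical, and the brute-force top-k search is only a stand-in for an approximate nearest neighbor index.

```python
import numpy as np


def exact_partition(W, h):
    """Exact Z = sum_i exp(w_i . h); touches every one of the N category vectors."""
    return float(np.exp(W @ h).sum())


def top_k_indices(W, h, k):
    """Brute-force stand-in (assumption) for an approximate nearest neighbor index.

    This placeholder is itself linear in N; a real index would return the
    k highest-scoring categories without scoring all of them.
    """
    scores = W @ h
    n = scores.shape[0]
    return np.argpartition(scores, n - k)[n - k:]


def estimate_partition(W, h, k, m, rng):
    """Hypothetical estimator: exact sum over the top-k head plus a sampled tail.

    Requires 0 < k < N. The tail sum is estimated as (N - k) times the mean of
    exp(score) over m tail categories drawn uniformly with replacement.
    """
    n = W.shape[0]
    head = top_k_indices(W, h, k)
    head_sum = np.exp(W[head] @ h).sum()

    # Rejection sampling: draw uniformly from all categories, skip head members.
    head_set = set(head.tolist())
    tail = []
    while len(tail) < m:
        j = int(rng.integers(n))
        if j not in head_set:
            tail.append(j)
    tail_mean = np.exp(W[tail] @ h).mean()

    return float(head_sum + (n - k) * tail_mean)


# Small usage example with synthetic category vectors and a synthetic hidden state.
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(2000, 16))  # N = 2000 categories, 16 dimensions
h = rng.normal(size=16)
z_exact = exact_partition(W, h)
z_estimate = estimate_partition(W, h, k=50, m=200, rng=rng)
```

With a real index in place of `top_k_indices`, the estimator would evaluate on the order of `k + m` scores instead of all N, at the price of sampling variance in the tail term.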