NE LG

Dec 23, 2014

Deep Networks With Large Output Spaces

Sudheendra Vijayanarasimhan, Jonathon Shlens, Rajat Monga, Jay Yagnik

arXiv:1412.7479v4

57 citations

Originality: Incremental advance

AI Analysis

The paper addresses a scalability bottleneck for researchers and practitioners working on large-scale classification problems, though the advance appears incremental because it builds on existing hashing methods.

Training deep neural networks with millions of output classes is prohibitively expensive. The paper tackles this problem by proposing a fast locality-sensitive hashing technique that approximates dot products, which speeds up training and inference on three large-scale recognition tasks.

Deep neural networks have been extremely successful at various image, speech, and video recognition tasks because of their ability to model deep structures within the data. However, they are still prohibitively expensive to train and apply for problems containing millions of classes in the output layer. Based on the observation that the key computation common to most neural network layers is a vector/matrix product, we propose a fast locality-sensitive hashing technique to approximate the actual dot product, enabling us to scale up the training and inference to millions of output classes. We evaluate our technique on three diverse large-scale recognition tasks and show that our approach can train large-scale models at a faster rate (in terms of steps/total time) compared to baseline methods.
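The idea in the abstract, replacing a full vector/matrix product over all output classes with a hashed lookup followed by exact dot products on a small candidate set, can be illustrated with a minimal sketch. The sketch assumes signed random projections as the hash function, a generic locality-sensitive scheme chosen here for brevity and not necessarily the one used in the paper. The class name `LSHOutputLayer`, its parameters, and the helper `make_unit_vectors` are hypothetical names introduced only for this example.

```python
import random
from collections import defaultdict


def dot(a, b):
    """Exact dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


class LSHOutputLayer:
    """Hypothetical sketch: index output-layer weight vectors in hash tables
    so that only classes colliding with the activation are scored exactly."""

    def __init__(self, weights, num_tables=8, bits_per_table=6, seed=0):
        rng = random.Random(seed)
        dim = len(weights[0])
        self.weights = weights
        # One set of random hyperplanes per table (assumed hash family:
        # signed random projections).
        self.planes = [
            [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(bits_per_table)]
            for _ in range(num_tables)
        ]
        self.tables = [defaultdict(list) for _ in range(num_tables)]
        for class_id, w in enumerate(weights):
            for table, planes in zip(self.tables, self.planes):
                table[self._key(planes, w)].append(class_id)

    @staticmethod
    def _key(planes, vec):
        # Each bit records on which side of a hyperplane the vector falls.
        return tuple(dot(p, vec) >= 0 for p in planes)

    def candidates(self, activation):
        """Union of the buckets the activation falls into across all tables."""
        found = set()
        for table, planes in zip(self.tables, self.planes):
            found.update(table.get(self._key(planes, activation), ()))
        return found

    def top_k(self, activation, k=5):
        """Score only the candidates exactly; return (score, class_id) pairs."""
        scored = [(dot(self.weights[c], activation), c)
                  for c in self.candidates(activation)]
        scored.sort(reverse=True)
        return scored[:k]


def make_unit_vectors(n, dim, seed=1):
    """Random unit-norm vectors standing in for trained class weights."""
    rng = random.Random(seed)
    vecs = []
    for _ in range(n):
        v = [rng.gauss(0, 1) for _ in range(dim)]
        norm = sum(x * x for x in v) ** 0.5
        vecs.append([x / norm for x in v])
    return vecs


weights = make_unit_vectors(2000, 16)
layer = LSHOutputLayer(weights)
query = weights[42]  # an activation identical to one class's weight vector
best = layer.top_k(query, k=3)
num_scored = len(layer.candidates(query))
```

Here `best` holds the highest-scoring classes among the retrieved candidates, and `num_scored` is the number of exact dot products computed for this query, which can be compared against the 2000 a dense output layer would need. Increasing `bits_per_table` shrinks the candidate set at the cost of more missed classes, while increasing `num_tables` recovers recall at the cost of more lookups.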
