A Strong Inductive Bias: Gzip for binary image classification
This work offers a computationally cheaper and simpler alternative for few-shot image classification, though it is incremental as it adapts existing NLP methods to vision.
The authors tackled binary image classification by proposing a nearest neighbor classifier with Gzip compression, achieving better accuracy and over two orders of magnitude less space usage compared to deep learning networks like ResNet in few-shot settings.
Deep learning networks have become the de facto standard in Computer Vision for industry and research. However, recent developments in their cousin, Natural Language Processing (NLP), have shown that there are areas where parameter-less models with strong inductive biases can serve as computationally cheaper and simpler alternatives. We propose such a model for binary image classification: a nearest neighbor classifier combined with a general-purpose compressor such as Gzip. We test and compare it against popular deep learning networks such as ResNet, EfficientNet and MobileNet and show that, in a few-shot setting, it achieves better accuracy and uses significantly less space, by more than two orders of magnitude. We believe this underlines the untapped potential of models with stronger inductive biases in few-shot scenarios.
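
To make the idea of a compressor-based nearest neighbor classifier concrete, the following is a minimal sketch and not the authors' implementation. It assumes that images are given as raw byte strings (for example, flattened pixel values) and that similarity is measured with the normalized compression distance (NCD), as in the related NLP work; the abstract does not state which distance is used. The function names `compressed_len`, `ncd` and `knn_predict` are hypothetical.

```python
import gzip
from collections import Counter


def compressed_len(data: bytes) -> int:
    """Length in bytes of the gzip-compressed input."""
    return len(gzip.compress(data))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance (an assumed choice of metric).

    Small when compressing x and y together costs little more than
    compressing the larger of the two alone, i.e. when they share structure.
    """
    cx, cy = compressed_len(x), compressed_len(y)
    cxy = compressed_len(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def knn_predict(query: bytes, train: list[tuple[bytes, int]], k: int = 1) -> int:
    """Majority label among the k training samples closest to the query."""
    neighbours = sorted(train, key=lambda item: ncd(query, item[0]))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Toy "images": a constant image (class 0) and a repeating gradient (class 1).
flat = bytes(1024)
gradient = bytes(range(256)) * 4
train_set = [(flat, 0), (gradient, 1)]

print(knn_predict(flat, train_set))
print(knn_predict(gradient, train_set))
```

There are no trainable parameters in this sketch: the only state is the stored training samples, which is consistent with the small space footprint claimed above. A real pipeline would also have to decide how images are serialized to bytes (resolution, color channels, bit depth), which this toy example leaves open.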