Self-supervised Representation Learning with Relative Predictive Coding
This work addresses a key bottleneck in self-supervised learning, the training instability and minibatch size sensitivity of contrastive objectives, although its contribution is an incremental improvement over existing contrastive methods.
The paper tackles the problem of training instability and minibatch size sensitivity in contrastive representation learning by introducing Relative Predictive Coding (RPC), which achieves competitive performance on vision and speech benchmarks and can estimate mutual information with low variance.
This paper introduces Relative Predictive Coding (RPC), a new contrastive representation learning objective that maintains a good balance among training stability, minibatch size sensitivity, and downstream task performance. The key to the success of RPC is two-fold. First, RPC introduces relative parameters that regularize the objective for boundedness and low variance. Second, RPC contains no logarithm or exponential score functions, which are the main cause of training instability in prior contrastive objectives. We empirically verify the effectiveness of RPC on benchmark vision and speech self-supervised learning tasks. Lastly, we relate RPC to mutual information (MI) estimation, showing that RPC can be used to estimate MI with low variance.
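
To make the two design points concrete, here is a minimal NumPy sketch of an objective with this shape. The abstract does not state the formula, so the form used here is an assumption that should be checked against the paper's own definition: J = E_pos[f] - alpha * E_neg[f] - (beta / 2) * E_pos[f^2] - (gamma / 2) * E_neg[f^2], where f is the score function, "pos" denotes paired samples, and "neg" denotes mismatched samples within the minibatch. The function names and the default values of alpha, beta, and gamma are hypothetical.

```python
import numpy as np

def rpc_objective(scores, alpha=1.0, beta=1.0, gamma=1.0):
    """Assumed RPC-style objective on an (n, n) score matrix, n >= 2.

    scores[i, j] = f(x_i, y_j); diagonal entries are positive pairs and
    off-diagonal entries are treated as negatives. No log or exp is applied.
    """
    scores = np.asarray(scores, dtype=float)
    n = scores.shape[0]
    positive = np.diag(scores)
    negative = scores[~np.eye(n, dtype=bool)]
    return (positive.mean()
            - alpha * negative.mean()
            - 0.5 * beta * np.mean(positive ** 2)
            - 0.5 * gamma * np.mean(negative ** 2))

def rpc_upper_bound(alpha=1.0, beta=1.0, gamma=1.0):
    """Largest value the assumed objective can take for beta, gamma > 0.

    s - (beta/2) s^2 peaks at s = 1/beta; -alpha s - (gamma/2) s^2 peaks
    at s = -alpha/gamma. Summing the two maxima gives the bound.
    """
    return 1.0 / (2.0 * beta) + alpha ** 2 / (2.0 * gamma)

# Even with very large raw scores, the value stays below the bound.
rng = np.random.default_rng(0)
scores = rng.normal(scale=100.0, size=(8, 8))
assert rpc_objective(scores) <= rpc_upper_bound()
```

Under this assumed form, the quadratic terms weighted by beta and gamma are what keep the objective bounded above for any score function. In training, one would maximize this quantity, for example by minimizing its negative with respect to the encoder parameters.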