A picture of the space of typical learnable tasks
This work provides insights into task relationships and generalization in machine learning, though it appears incremental as it builds on existing methods to analyze known phenomena.
The paper uses information geometry to investigate the structure of the space of learnable tasks. It finds that the probabilistic models produced by different representation learning methods lie on an effectively low-dimensional manifold, and that supervised learning on one task makes progress even on seemingly dissimilar tasks. The experiments use classification tasks constructed from the CIFAR-10 and ImageNet datasets.
We develop information geometric techniques to understand the representations learned by deep networks when they are trained on different tasks using supervised, meta-, semi-supervised and contrastive learning. We shed light on the following phenomena that relate to the structure of the space of tasks: (1) the manifold of probabilistic models trained on different tasks using different representation learning methods is effectively low-dimensional; (2) supervised learning on one task results in a surprising amount of progress even on seemingly dissimilar tasks; progress on other tasks is larger if the training task has diverse classes; (3) the structure of the space of tasks indicated by our analysis is consistent with parts of the WordNet phylogenetic tree; (4) episodic meta-learning algorithms and supervised learning traverse different trajectories during training but they fit similar models eventually; (5) contrastive and semi-supervised learning methods traverse trajectories similar to those of supervised learning. We use classification tasks constructed from the CIFAR-10 and ImageNet datasets to study these phenomena.
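To make claim (1) concrete, here is a minimal sketch of one way to test whether a set of trained models is effectively low-dimensional. The abstract does not say which distance or embedding is used, so everything here is an assumption for illustration: each model is represented by its predicted class probabilities on a shared set of samples, models are compared with the Bhattacharyya distance, and the distance matrix is double-centered in the style of classical multidimensional scaling (applied to the distances directly, without squaring them). The function names are hypothetical.

```python
import numpy as np


def bhattacharyya_distance(p, q, eps=1e-12):
    """Mean per-sample Bhattacharyya distance between two models.

    p, q: arrays of shape (num_samples, num_classes); each row is a
    predictive distribution over classes for the same input sample.
    """
    # Bhattacharyya coefficient per sample, in [0, 1] for valid distributions.
    coeff = np.sum(np.sqrt(p * q), axis=1)
    # Clip to avoid log(0) and tiny floating-point overshoot above 1.
    return float(np.mean(-np.log(np.clip(coeff, eps, 1.0))))


def pairwise_distances(models):
    """Symmetric matrix of distances between every pair of models."""
    n = len(models)
    d = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d[i, j] = d[j, i] = bhattacharyya_distance(models[i], models[j])
    return d


def explained_spectrum(d):
    """Fraction of total eigenvalue magnitude carried by each direction.

    Assumption: the distance matrix is double-centered as is. The resulting
    matrix may have negative eigenvalues, so magnitudes are used.
    """
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ d @ centering
    eigvals = np.linalg.eigvalsh(gram)  # gram is symmetric
    mags = np.sort(np.abs(eigvals))[::-1]
    return mags / mags.sum()
```

Under these assumptions, a spectrum in which the first few entries account for most of the total would be one indication of an effectively low-dimensional set of models. Checkpoints saved during training can be passed in as separate entries of `models` to compare training trajectories in the same way.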