Dynamics and Reachability of Learning Tasks
This work gives a theoretical account of when transfer learning is feasible, a question that matters to practitioners who reuse models across tasks.
The paper decomposes the transition probability between two tasks into a geometric factor and a dynamic factor, shows that nearby tasks can be unreachable via fine-tuning, and derives strict lower bounds on the complexity of learning one task starting from the solution to another.
We compute the transition probability between two learning tasks, and show that it decomposes into two factors. The first depends on the geometry of the loss landscape of a model trained on each task, independent of any particular model used. This factor is related to an information-theoretic distance function, but it is insufficient to predict success in transfer learning, as nearby tasks can be unreachable via fine-tuning. The second factor depends on the ease of traversing the path between the two tasks. With this dynamic component, we derive strict lower bounds on the complexity necessary to learn a task starting from the solution to another, which is one of the most common forms of transfer learning.
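As an illustration of how such a two-factor decomposition can arise, consider one standard setting that is assumed here and is not stated in the text above: training is modeled as an overdamped Langevin diffusion $dw = -\nabla U(w)\,dt + \sqrt{2D}\,dB_t$, where $w$ denotes the weights, $U$ the loss, and $D$ a constant isotropic noise level. All symbols are illustrative choices, and the sketch is not claimed to be the exact expression used in the paper. Under these assumptions, the path-integral (Onsager-Machlup) form of the transition probability splits into a static factor that depends only on the loss at the two endpoints and a dynamic factor that depends on the paths connecting them:

```latex
% Assumption: Langevin model of training, dw = -\nabla U(w) dt + \sqrt{2D} dB_t
\begin{align}
p(w_f, t_f \mid w_0, t_0)
  &= \underbrace{e^{-\frac{U(w_f) - U(w_0)}{2D}}}_{\text{static (geometric) factor}}
     \;
     \underbrace{\int_{w(t_0) = w_0}^{w(t_f) = w_f}
       \exp\!\left(-\frac{1}{2D}\int_{t_0}^{t_f}
         \Big[\tfrac{1}{2}\,\lVert \dot{w}(t) \rVert^2 + V\big(w(t)\big)\Big]\,dt\right)
       \mathcal{D}w}_{\text{dynamic (path) factor}}, \\
V(w) &= \tfrac{1}{2}\,\lVert \nabla U(w) \rVert^2 - D\,\nabla^2 U(w).
\end{align}
```

In this sketch the static factor can be close to one for two solutions with similar loss, while the path factor is exponentially small whenever every connecting path must cross a region where $V$ is large. This is one way to read the statement that nearby tasks can be unreachable via fine-tuning.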