The Relativity of Induction
This work addresses foundational issues in machine learning theory for researchers, but it appears incremental as it builds on existing discussions without introducing a new paradigm or method.
The paper asks why deep learning algorithms generalize better than theory predicts. It challenges traditional principles such as Occam's razor and proposes relativistic principles under which simplicity is contingent and generalization is relative to an initial guess. It does not provide concrete numerical results.
There has recently been much discussion about why deep learning algorithms perform better than theory would lead us to expect. To gain insight into this question, it helps to improve our understanding of how learning works. We explore the core problem of generalization and show that the long-accepted principles of Occam's razor and parsimony are insufficient to ground learning. Instead, we derive and demonstrate a set of relativistic principles that yield clearer insight into the nature and dynamics of learning. We show that concepts of simplicity are fundamentally contingent, that all learning operates relative to an initial guess, and that generalization cannot be measured or strongly inferred but can be expected given enough observation. Using these principles, we reconstruct our understanding in terms of distributed learning systems whose components inherit beliefs and update them. We then apply this perspective to elucidate the nature of some real-world inductive processes, including deep learning.
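As a loose illustration of the claim that learning operates relative to an initial guess, consider the toy sketch below. It is not the paper's derivation. It assumes a standard Beta-Bernoulli model, and the function names, the two priors, and the 70% success rate are all hypothetical choices made for the example. Two learners start from opposite initial guesses, see identical observations, and their estimates are compared.

```python
from fractions import Fraction


def posterior_mean(prior_heads, prior_tails, heads, tails):
    """Mean of the Beta posterior after adding observed counts to prior pseudo-counts."""
    a = prior_heads + heads
    b = prior_tails + tails
    return Fraction(a, a + b)


def belief_gap(n_obs, true_rate=Fraction(7, 10)):
    """Difference between two learners' estimates after the same n_obs observations.

    Assumption: the data are idealized, so exactly floor(n_obs * true_rate)
    of the observations are successes.
    """
    heads = int(n_obs * true_rate)
    tails = n_obs - heads
    optimist = posterior_mean(9, 1, heads, tails)  # initial guess: success is likely
    skeptic = posterior_mean(1, 9, heads, tails)   # initial guess: success is unlikely
    return optimist - skeptic


if __name__ == "__main__":
    for n in (0, 10, 100, 990):
        print(n, belief_gap(n))
```

In this toy model the gap between the two learners shrinks as observations accumulate but never reaches zero for any finite sample: each estimate remains relative to its starting point, while agreement can be expected given enough observation.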