AIApr 5

A Model of Understanding in Deep Learning Systems

arXiv:2604.041711.3

AI Analysis

This addresses the philosophical and practical problem of evaluating understanding in AI for researchers and developers, but it is incremental as it builds on existing concepts without introducing new methods or data.

The paper tackles the problem of defining systematic understanding in deep learning systems, proposing a model where agents achieve understanding through internal models that track real regularities, and argues that current systems often meet this but fall short of scientific ideals, termed the Fractured Understanding Hypothesis.

I propose a model of systematic understanding, suitable for machine learning systems. On this account, an agent understands a property of a target system when it contains an adequate internal model that tracks real regularities, is coupled to the target by stable bridge principles, and supports reliable prediction. I argue that contemporary deep learning systems often can and do achieve such understanding. However they generally fall short of the ideal of scientific understanding: the understanding is symbolically misaligned with the target system, not explicitly reductive, and only weakly unifying. I label this the Fractured Understanding Hypothesis.

View on arXiv PDF

Similar