Interpretable by AI Mother Tongue: Native Symbolic Reasoning in Neural Models
This work addresses the challenge of making neural models interpretable for AI developers and users, though it appears incremental because it builds on existing symbolic reasoning concepts.
The researchers tackle interpretability in neural models with a framework in which models develop a native symbolic language, called AI Mother Tongue, that supports both reasoning and interpretability. Experiments showed competitive accuracy with verifiable reasoning traces.
We present a framework where neural models develop an AI Mother Tongue, a native symbolic language that simultaneously supports intuitive reasoning, compositional symbol chains, and inherent interpretability. Unlike post-hoc explanation methods, our approach embeds reasoning directly into the model's representations: symbols capture meaningful semantic patterns, chains trace decision paths, and gated induction mechanisms guide selective focus, yielding transparent yet flexible reasoning. We introduce complementary training objectives to enhance symbol purity and decision sparsity, and employ a sequential specialization strategy to first build broad symbolic competence and then refine intuitive judgments. Experiments on AI tasks demonstrate competitive accuracy alongside verifiable reasoning traces, showing that AI Mother Tongue can serve as a unified mechanism for interpretability, intuition, and symbolic reasoning in neural models.
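The abstract does not specify how symbols, chains, or the purity and sparsity objectives are implemented, so the following is only a minimal sketch under assumptions of our own: symbols are taken to be nearest entries in a small codebook, a chain is the sequence of symbols assigned to successive hidden states, purity is measured as per-symbol label entropy, and sparsity as the mean absolute gate value. All function names (`nearest_symbol`, `symbol_chain`, `purity_penalty`, `sparsity_penalty`) are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict


def nearest_symbol(vec, codebook):
    """Return the index of the codebook entry closest to vec (squared Euclidean)."""
    dists = [sum((v - c) ** 2 for v, c in zip(vec, entry)) for entry in codebook]
    return dists.index(min(dists))


def symbol_chain(states, codebook):
    """Map a sequence of hidden states to a chain of discrete symbol indices."""
    return [nearest_symbol(s, codebook) for s in states]


def purity_penalty(symbols, labels):
    """Usage-weighted label entropy per symbol (assumed stand-in for 'symbol purity').

    The value is 0 when every symbol co-occurs with exactly one label.
    """
    by_symbol = defaultdict(Counter)
    for s, y in zip(symbols, labels):
        by_symbol[s][y] += 1
    total = len(symbols)
    penalty = 0.0
    for counts in by_symbol.values():
        n = sum(counts.values())
        entropy = -sum((c / n) * math.log(c / n) for c in counts.values())
        penalty += (n / total) * entropy
    return penalty


def sparsity_penalty(gates):
    """Mean absolute gate value (assumed stand-in for 'decision sparsity')."""
    return sum(abs(g) for g in gates) / len(gates)


# Toy data, invented for illustration only.
codebook = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
states = [(0.1, -0.1), (0.9, 1.2), (0.2, 0.8)]
chain = symbol_chain(states, codebook)
print(chain)  # [0, 1, 2]
```

In this toy setting the chain itself is the readable trace: each decision step can be inspected as a discrete symbol index rather than a dense vector. A real system would learn the codebook and gates jointly with the task loss, which this sketch does not attempt.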