CL · Jun 24, 2025

Learning to Disentangle Latent Reasoning Rules with Language VAEs: A Systematic Study

Yingji Zhang, Marco Valentino, Danilo S. Carvalho, André Freitas

arXiv:2506.19418v2 · 2 citations · h-index: 7

Originality: Incremental advance

AI Analysis

This work addresses the challenge of making language models more interpretable and controllable for NLP researchers, though it appears incremental because it builds on existing language VAE frameworks.

The paper tackles the problem of getting language models to perform rule-based reasoning rather than memorization. It does so by embedding explicit reasoning rules in the latent space of Transformer-based language VAEs. It finds that reasoning rules can be disentangled into distinct clusters and that prior knowledge injection improves retrieval from memory.

Incorporating explicit reasoning rules within the latent space of language models (LMs) offers a promising pathway to enhance generalisation, interpretability, and controllability. While current Transformer-based language models have shown strong performance on Natural Language Inference (NLI) tasks, they often rely on memorisation rather than rule-based inference. This work investigates how reasoning rules can be explicitly embedded and memorised within LMs through language Variational Autoencoders (VAEs). We propose a complete pipeline for learning reasoning rules within Transformer-based language VAEs. This pipeline encompasses three rule-based reasoning tasks, a supporting theoretical framework, and a practical end-to-end architecture. The experiments illustrate the following findings. (1) Disentangled reasoning: under explicit signal supervision, reasoning rules, viewed as functional mappings, can be disentangled within the encoder's parametric space. This separation results in distinct clustering of rules in the output feature space. (2) Prior knowledge injection: injecting reasoning information into the Query enables the model to more effectively retrieve the stored Value from memory based on the Key. This approach offers a simple method for integrating prior knowledge into decoder-only language models. (3) Performance bottleneck: in mathematical reasoning tasks using Qwen2.5 (0.5B), increasing the sample count does not improve performance beyond a certain point. Moreover, feed-forward (FFN) layers are better than attention layers at preserving the separation of reasoning rules in the model's parameters.
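
The prior knowledge injection finding can be illustrated with a toy key-value memory. The sketch below is not the authors' architecture: it assumes a single attention query, a hand-built memory with one Key per rule, and a hypothetical `inject_rule` helper that simply adds a rule latent to the Query. It only shows why a Query that carries rule information concentrates attention on the matching Key.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    shifted = x - x.max()
    exp = np.exp(shifted)
    return exp / exp.sum()


def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scores = keys @ query / np.sqrt(query.shape[-1])
    weights = softmax(scores)
    return weights @ values, weights


def inject_rule(query, rule_latent, scale=1.0):
    """Hypothetical injection step (an assumption for illustration):
    add a latent rule vector to the query before retrieval."""
    return query + scale * rule_latent


# Toy memory: three mutually orthogonal Keys, one per reasoning rule.
keys = 4.0 * np.eye(3, 4)
values = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])

# An uninformative Query attends to every Key equally.
query = np.zeros(4)
_, w_plain = attend(query, keys, values)

# A rule latent aligned with the second Key sharpens retrieval.
rule_latent = keys[1]
retrieved, w_injected = attend(inject_rule(query, rule_latent), keys, values)

print("weights without injection:", w_plain)
print("weights with injection:   ", w_injected)
print("retrieved value:          ", retrieved)
```

Additive injection is only one possible design. Concatenating the latent to the Query or passing it through a learned projection would serve the same illustrative purpose.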
