LG | AI | Jun 6, 2024

Weight-based Decomposition: A Case for Bilinear MLPs

arXiv:2406.03947v1 | 2 citations
Originality: Incremental advance
AI Analysis

This work addresses interpretability for researchers and practitioners using bilinear layers in ML models, but it is incremental as it builds on existing bilinear architectures.

The paper tackles the problem of interpretability in neural networks by developing a method to decompose bilinear layers into sparsely interacting eigenvectors, showing promising interpretability properties in preliminary experiments on shallow image classifiers (MNIST) and small language models (Tiny Stories).

Gated Linear Units (GLUs) have become a common building block in modern foundation models. Bilinear layers drop the non-linearity in the "gate" but still have comparable performance to other GLUs. An attractive quality of bilinear layers is that they can be fully expressed in terms of a third-order tensor and linear operations. Leveraging this, we develop a method to decompose the bilinear tensor into a set of sparsely interacting eigenvectors that show promising interpretability properties in preliminary experiments for shallow image classifiers (MNIST) and small language models (Tiny Stories). Since the decomposition is fully equivalent to the model's original computations, bilinear layers may be an interpretability-friendly architecture that helps connect features to the model weights. Application of our method may not be limited to pretrained bilinear models since we find that language models such as TinyLlama-1.1B can be finetuned into bilinear variants.
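The claim that a bilinear layer can be written as a third-order tensor and then decomposed into eigenvectors can be illustrated with a small NumPy sketch. This is not the authors' code: it assumes the common bilinear form g(x) = (W x) ⊙ (V x), and the names `W`, `V`, `u`, `interaction_matrix`, and `eig_decompose` are hypothetical, chosen for illustration. For a chosen output direction `u`, the scalar `u · g(x)` is a quadratic form in `x`, so its symmetric interaction matrix has a real eigendecomposition that reproduces the layer's output exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_hidden = 6, 4

# Hypothetical random weights of a bilinear layer g(x) = (W x) * (V x).
W = rng.normal(size=(d_hidden, d_in))
V = rng.normal(size=(d_hidden, d_in))


def bilinear(x):
    """Bilinear layer: elementwise product of two linear maps, no gate non-linearity."""
    return (W @ x) * (V @ x)


# Third-order tensor B[k, i, j] = W[k, i] * V[k, j], so that g(x)[k] = x^T B[k] x.
B = np.einsum("ki,kj->kij", W, V)


def interaction_matrix(u):
    """Symmetric matrix Q such that u . g(x) == x^T Q x for every input x."""
    Q = np.einsum("k,kij->ij", u, B)
    # Only the symmetric part of Q contributes to a quadratic form.
    return 0.5 * (Q + Q.T)


def eig_decompose(u):
    """Eigenvalues and orthonormal eigenvectors (columns) of the interaction matrix."""
    return np.linalg.eigh(interaction_matrix(u))


# Check that the decomposition is equivalent to the original computation.
x = rng.normal(size=d_in)
u = rng.normal(size=d_hidden)  # assumed output direction, e.g. a class logit
eigvals, eigvecs = eig_decompose(u)

direct = u @ bilinear(x)
via_eig = np.sum(eigvals * (eigvecs.T @ x) ** 2)
print(np.isclose(direct, via_eig))
```

Each eigenvector contributes independently to the output through its squared projection onto the input, weighted by its eigenvalue, which is what makes individual eigenvectors candidates for inspection. In this sketch the output direction `u` is random; in a classifier one would more plausibly pick a direction tied to a specific output, which is again an assumption rather than a detail stated in the abstract.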

Code Implementations: 1 repo