LG · AI · PF · Jun 2, 2025

Exchangeability in Neural Network and its Application to Dynamic Pruning

arXiv:2506.02210v2 · h-index: 3
Originality: Incremental advance
AI Analysis

This paper addresses the computational cost of deploying large neural networks. It offers a generalizable inference-time optimization that is incremental but theory-grounded.

The paper tackles the high inference cost of neural networks by introducing ExPrune, a dynamic pruning method based on exchangeability theory. ExPrune reduces FLOPs by 10.98–27.16% with at most a 1% accuracy drop across vision, graph, and language models.

Modern neural networks (NNs) contain an ever-growing number of parameters, substantially increasing the memory and computational cost of inference. Researchers have explored various ways to reduce the inference cost of NNs by reducing the model size before deployment and by dynamically pruning the inference computation at runtime. In this work, we present ExPrune, a general, dynamic pruning optimization that enables multi-granularity partial computation on a per-input basis. ExPrune requires no change to the model architecture or the training algorithm. ExPrune is based on our theoretical results that the relationship between certain model parameters and intermediate values can be described by a statistical property called exchangeability. By identifying exchangeable parameters and values in the model, we are able to first partially evaluate the network, analyze the statistics of the partial results, and make pruning decisions on the fly. Because ExPrune is theory-grounded, it generalizes across model architectures in different problem domains. We evaluate ExPrune on one computer vision model, one graph model, and one language model. ExPrune provides a 10.98–17.33% reduction in FLOPs with negligible accuracy drop and a 21.61–27.16% reduction in FLOPs with at most 1% accuracy drop. We also demonstrate that ExPrune composes with static magnitude pruning. On models that have been aggressively statically pruned, ExPrune still provides an additional 10.24–11.11% reduction in FLOPs with negligible accuracy drop and a 13.91–14.39% reduction in FLOPs with at most 1% accuracy drop.
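The "partially evaluate, analyze the statistics, decide on the fly" loop described in the abstract can be illustrated with a toy example. This is a minimal sketch and not the paper's algorithm. It assumes that the product terms of a single ReLU neuron's dot product can be treated as exchangeable, so the terms computed so far act as a sample of the terms not yet computed. The function name `relu_dot_with_early_exit`, the `probe` size, and the `z_threshold` test are all hypothetical choices made for illustration.

```python
import math


def relu_dot_with_early_exit(weights, inputs, probe=16, z_threshold=3.0):
    """Return (relu(w . x), number of multiply-adds actually performed).

    Hypothetical illustration only: evaluate the first `probe` terms, use
    their sample mean and variance to predict the full sum, and skip the
    remaining terms when the sum is confidently negative (ReLU gives 0).
    """
    n = len(weights)
    if n != len(inputs):
        raise ValueError("weights and inputs must have the same length")

    # Step 1: partial evaluation of the first k terms.
    k = min(probe, n)
    terms = [w * x for w, x in zip(weights[:k], inputs[:k])]
    partial = sum(terms)

    # Step 2: statistics of the partial results (needs >= 2 terms and
    # at least one term left to skip).
    if 2 <= k < n:
        mean = partial / k
        var = sum((t - mean) ** 2 for t in terms) / (k - 1)
        remaining = n - k
        predicted = partial + remaining * mean
        # Rough standard error of the predicted sum, treating the unseen
        # terms as draws from the same distribution as the seen ones.
        stderr = math.sqrt(var * remaining * n / k)

        # Step 3: pruning decision made on the fly, per input.
        if predicted + z_threshold * stderr < 0:
            return 0.0, k

    # No early exit: finish the remaining terms exactly.
    total = partial + sum(w * x for w, x in zip(weights[k:], inputs[k:]))
    return max(total, 0.0), n


# A clearly negative neuron is pruned after the probe; a positive one is not.
print(relu_dot_with_early_exit([-1.0] * 64, [1.0] * 64))
print(relu_dot_with_early_exit([1.0] * 64, [1.0] * 64))
```

In this sketch, a lower `z_threshold` prunes more aggressively and risks zeroing a neuron whose true pre-activation is positive. That mirrors the trade-off the abstract reports between FLOPs reduction and accuracy drop, although the paper's actual decision rule may differ.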

