CV, LG · Apr 5, 2025

The Effects of Grouped Structural Global Pruning of Vision Transformers on Domain Generalisation

arXiv:2504.04196v13.6 · h-index: 66

Originality: Incremental advance

AI Analysis

This work addresses the efficient deployment of vision transformers for domain generalisation. The advance is incremental: it applies a grouped structural pruning technique to existing pre-trained models rather than proposing a new architecture.

The paper tackles the challenge of deploying large vision transformers on resource-limited devices for domain generalisation tasks by introducing a grouped structural pruning method. On the PACS benchmark, pruning ViT, BeiT, and DeiT models by 50% using the Hessian metric yielded speed-ups of 2.5x, 1.81x, and 2.15x, with accuracy drops of only 2.94%, 1.42%, and 1.72%, respectively.

With the growing size of AI models such as large language models (LLMs) and vision transformers, deploying them on devices with limited computational resources is a significant challenge, particularly when addressing domain generalisation (DG) tasks. This paper introduces a novel grouped structural pruning method for pre-trained vision transformers (ViT, BeiT, and DeiT), evaluated on the PACS and Office-Home DG benchmarks. Our method uses dependency graph analysis to identify and remove redundant groups of neurons, weights, filters, or attention heads within transformers, using a range of selection metrics. Grouped structural pruning is applied at pruning ratios of 50%, 75%, and 95%, and the models are then fine-tuned on selected distributions from the DG benchmarks to evaluate their overall performance on DG tasks. Results show significant improvements in inference speed and fine-tuning time with minimal trade-offs in accuracy and DG task performance. For instance, on the PACS benchmark, pruning ViT, BeiT, and DeiT models by 50% using the Hessian metric resulted in accuracy drops of only 2.94%, 1.42%, and 1.72%, respectively, while achieving speed-ups of 2.5x, 1.81x, and 2.15x. These findings demonstrate the effectiveness of our approach in balancing model efficiency with domain generalisation performance.
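The idea of removing whole dependency groups, rather than individual weights, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a toy two-layer MLP stored as nested lists, uses squared weight magnitude as a stand-in for the paper's selection metrics (such as the Hessian metric), and the function names `group_scores` and `prune_groups` are hypothetical. Each hidden unit forms one group: row j of the first layer, its bias, and column j of the second layer must be removed together for the shapes to stay consistent.

```python
def group_scores(w1, b1, w2):
    """Score each hidden unit by the squared magnitude of all parameters tied to it.

    w1: first-layer weights, shape [hidden][inputs]
    b1: first-layer biases, shape [hidden]
    w2: second-layer weights, shape [outputs][hidden]
    Magnitude is an assumed stand-in for the paper's selection metrics.
    """
    scores = []
    for j in range(len(w1)):
        score = sum(v * v for v in w1[j])            # row j of layer 1
        score += b1[j] * b1[j]                       # bias of unit j
        score += sum(row[j] * row[j] for row in w2)  # column j of layer 2
        scores.append(score)
    return scores


def prune_groups(w1, b1, w2, ratio):
    """Remove the lowest-scoring fraction `ratio` of hidden units as whole groups."""
    n = len(w1)
    n_remove = min(n - 1, int(n * ratio))  # always keep at least one unit
    scores = group_scores(w1, b1, w2)
    ranked = sorted(range(n), key=lambda j: scores[j])  # least important first
    keep = sorted(ranked[n_remove:])                    # preserve original order
    new_w1 = [w1[j] for j in keep]
    new_b1 = [b1[j] for j in keep]
    new_w2 = [[row[j] for j in keep] for row in w2]
    return new_w1, new_b1, new_w2, keep


# Toy network: 2 inputs, 4 hidden units, 1 output.
w1 = [[1.0, 0.0], [0.1, 0.1], [0.0, 2.0], [0.01, 0.0]]
b1 = [0.0, 0.0, 0.0, 0.0]
w2 = [[1.0, 0.1, 1.0, 0.01]]

new_w1, new_b1, new_w2, kept = prune_groups(w1, b1, w2, ratio=0.5)
print(kept)  # [0, 2]
```

In this sketch only one coupled pair of layers is ranked. A global variant, as the title suggests, would rank groups from all layers jointly before choosing which to remove, and a fine-tuning step would follow to recover accuracy.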
