Vision Transformer Pruning Via Matrix Decomposition
This work is incremental, as it builds on existing Vision Transformer pruning techniques by comparing and selecting matrix decomposition methods to further optimize model efficiency.
The paper tackles the problem of reducing Vision Transformer model size and computational demands by applying matrix decomposition methods to prune linear projections, achieving accuracy comparable to that of the original model while reducing the dimensions of those projections.
This paper is a further development of Vision Transformer Pruning via matrix decomposition. The purpose of Vision Transformer Pruning is to prune the dimensions of the linear projection of the dataset by learning their associated importance scores, in order to reduce storage, run-time memory, and computational demands. In this paper, we further reduce the dimension and complexity of the linear projection by implementing and comparing several matrix decomposition methods while preserving the generated important features. We ultimately select Singular Value Decomposition as the method to achieve this goal, based on a comparison of the accuracy scores reported in the original GitHub repository with the accuracy scores obtained using these matrix decomposition methods: Singular Value Decomposition, four versions of QR Decomposition, and LU factorization.
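To make the idea of reducing a linear projection with Singular Value Decomposition concrete, the following is a minimal sketch in NumPy and not the implementation used in this work. The function name `truncated_svd_factors`, the matrix sizes, the chosen rank, and the synthetic weight matrix are all assumptions made for illustration. The sketch factors one weight matrix into two thinner matrices and compares parameter counts and reconstruction error.

```python
import numpy as np


def truncated_svd_factors(weight, rank):
    """Return factors (a, b) such that a @ b is the best rank-`rank`
    approximation of `weight` in the Frobenius norm."""
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    a = u[:, :rank] * s[:rank]  # shape (out_dim, rank), singular values folded in
    b = vt[:rank, :]            # shape (rank, in_dim)
    return a, b


# Hypothetical sizes, chosen only for this illustration.
out_dim, in_dim, rank = 64, 48, 8

# Synthetic stand-in for a linear projection: approximately low rank plus noise.
rng = np.random.default_rng(0)
weight = rng.standard_normal((out_dim, rank)) @ rng.standard_normal((rank, in_dim))
weight += 0.01 * rng.standard_normal((out_dim, in_dim))

a, b = truncated_svd_factors(weight, rank)

# Relative reconstruction error and parameter counts before and after.
rel_error = np.linalg.norm(weight - a @ b) / np.linalg.norm(weight)
params_before = weight.size
params_after = a.size + b.size

print(f"relative error: {rel_error:.4f}")
print(f"parameters: {params_before} -> {params_after}")
```

In this sketch a single projection `x @ weight.T` would be replaced by two smaller ones, `(x @ b.T) @ a.T`. Whether that saves storage and computation depends on the rank that can be kept without hurting accuracy, which for a real model has to be checked empirically.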