S$^3$: Structured Sparsity Specification
For researchers and practitioners in model compression, S$^3$ provides a unified framework for specifying and implementing structured sparsity; its novelty is incremental, since it mainly formalizes existing concepts.
The paper introduces S$^3$, an algebraic framework for specifying structured sparse patterns, enabling precise composition of diverse sparsity structures. Experimental validation shows that structured OBS and OBD implementations built on S$^3$ surpass well-established second-order heuristics on output reconstruction across common configurations.
We introduce the Structured Sparsity Specification (S$^3$), an algebraic framework for defining, composing, and implementing structured sparse patterns. S$^3$ specifies sparsity through three components: a View that reshapes the tensor via layout composition, a Block specification that defines the atomic pruning unit, and a Scope that defines the group of blocks over which each sparsity decision is made. Both Block and Scope support Coupling across tensors for coordinated sparsification. S$^3$ enables precise specification of diverse sparsity structures, from fine-grained N:M patterns to coarse channel pruning, and integrates seamlessly with Optimal Brain Damage (OBD) and Optimal Brain Surgeon (OBS). We formalize the framework mathematically, demonstrate its expressiveness on canonical patterns, and validate it experimentally via structured OBS and OBD implementations built entirely on S$^3$, which surpass well-established second-order heuristics on output reconstruction across common configurations.
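To make the View, Block, and Scope roles concrete, the following is a minimal illustrative sketch in plain Python, not the S$^3$ API. The function names `nm_mask` and `channel_mask` are hypothetical, and weight magnitude stands in as the saliency score where the paper uses OBD and OBS. For an N:M pattern, each row is viewed as groups of M consecutive weights, the Block is a single weight, and the Scope is one group. For channel pruning, the Block is a whole row and the Scope is the entire matrix.

```python
from typing import List

Matrix = List[List[float]]


def nm_mask(weights: Matrix, n: int, m: int) -> Matrix:
    """N:M pattern (hypothetical helper, magnitude saliency assumed).

    View: each row is split into groups of m consecutive weights.
    Block: a single weight. Scope: one group of m weights.
    """
    mask = []
    for row in weights:
        if len(row) % m != 0:
            raise ValueError("row length must be divisible by m")
        row_mask = [0.0] * len(row)
        for start in range(0, len(row), m):  # one decision per scope
            group = range(start, start + m)
            # Keep the n blocks with the largest magnitude in this scope.
            keep = sorted(group, key=lambda j: abs(row[j]), reverse=True)[:n]
            for j in keep:
                row_mask[j] = 1.0
        mask.append(row_mask)
    return mask


def channel_mask(weights: Matrix, keep_rows: int) -> Matrix:
    """Channel pattern (hypothetical helper, squared L2 norm as saliency).

    Block: a whole row (output channel). Scope: the entire matrix.
    """
    norms = [sum(w * w for w in row) for row in weights]
    order = sorted(range(len(weights)), key=lambda i: norms[i], reverse=True)
    keep = set(order[:keep_rows])
    return [[1.0 if i in keep else 0.0] * len(row)
            for i, row in enumerate(weights)]


W = [[0.9, -0.1, 0.2, -0.8],
     [0.05, 0.3, -0.02, 0.01]]

mask_24 = nm_mask(W, n=2, m=4)          # 2:4 sparsity within each row
mask_ch = channel_mask(W, keep_rows=1)  # keep the single strongest row
```

Both patterns use the same "rank blocks within a scope" logic and differ only in how blocks and scopes are laid out, which is the kind of uniformity the three-component specification is meant to capture.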