On Dimension-Free Transformer: An Application of STP to AI
This paper offers an incremental improvement aimed at AI researchers and practitioners working with transformer architectures, with a focus on flexible handling of dimensions.
The paper tackles the problem of transformer models requiring fixed input/output dimensions by proposing a dimension-free transformer (DFT) framework using projection-based transformation of hypervectors, which allows arbitrary dimensions and claims improved efficiency in signal processing.
The matrix expressions for each part of a transformer are first described. Based on the semi-tensor product (STP) of matrices, hypervectors are reconsidered, and a linear transformation over hypervectors is constructed using projection. Its properties and computational formulas are obtained. Using the projection-based transformation of hypervectors (PBTH), the framework of a dimension-free transformer (DFT) is proposed by examining each linear transformation in a transformer and replacing it with a suitable PBTH, which allows the inputs and outputs to be of arbitrary dimensions. Because it uses balanced information about all entries, DFT is expected to be more efficient in dealing with signals.
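To make the ingredients above concrete, the following is a minimal NumPy sketch, not the paper's implementation. The function `stp` follows the standard definition of the left semi-tensor product, A ⋉ B = (A ⊗ I_{t/n})(B ⊗ I_{t/p}) with t = lcm(n, p). The functions `project` and `pbth_sketch` are hypothetical names, and their exact form is an assumption: the abstract does not state the projection formula, so the sketch uses a block-averaging projection between R^m and R^n as one plausible reading of "balanced information about all entries".

```python
import math
import numpy as np


def stp(a, b):
    """Left semi-tensor product of two 2-D arrays.

    For A of shape (m, n) and B of shape (p, q), with t = lcm(n, p):
    A |x B = (A kron I_{t/n}) @ (B kron I_{t/p}).
    It reduces to the ordinary product when n == p.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, p = a.shape[1], b.shape[0]
    t = math.lcm(n, p)
    return np.kron(a, np.eye(t // n)) @ np.kron(b, np.eye(t // p))


def project(x, n):
    """Map a vector x in R^m to R^n (ASSUMED form, not taken from the paper).

    x is lifted to R^t, t = lcm(m, n), by repeating each entry t/m times,
    then averaged over n consecutive blocks of length t/n.
    """
    x = np.asarray(x, dtype=float)
    m = x.shape[0]
    t = math.lcm(m, n)
    lifted = np.kron(x, np.ones(t // m))        # x kron 1_{t/m}, length t
    return lifted.reshape(n, t // n).mean(axis=1)


def pbth_sketch(weight, x, out_dim):
    """Hypothetical projection-based linear map for inputs of any length.

    The input is projected to the column dimension of `weight`, multiplied,
    and the result is projected to `out_dim`.
    """
    weight = np.asarray(weight, dtype=float)
    y = weight @ project(x, weight.shape[1])
    return project(y, out_dim)


if __name__ == "__main__":
    a = np.array([[1.0, 2.0]])                   # shape (1, 2)
    b = np.array([[1.0], [2.0], [3.0], [4.0]])   # shape (4, 1)
    print(stp(a, b))                 # shape (2, 1): [[7.], [10.]]
    print(project([1.0, 2.0], 3))    # [1.  1.5 2. ]
    print(pbth_sketch(np.eye(3), [1.0, 2.0], 2))
```

In this sketch a single 3-column weight matrix accepts a length-2 input and returns a length-2 output, which is the kind of dimension flexibility the DFT framework aims for; the actual PBTH formulas and their properties are given in the paper itself.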