PGKET: A Photonic Gaussian Kernel Enhanced Transformer
This work addresses efficiency issues in transformers for long sequences, potentially benefiting photonic computing and machine learning applications, but appears incremental as it builds on existing transformer and photonic methods.
The paper tackles the inefficiency of self-attention mechanisms in transformers on long sequences by proposing PGKET, whose photonic Gaussian kernel self-attention mechanism is based on photon interferometry and superposition. Experimental results show that it outperforms some state-of-the-art transformers on multi-class classification tasks on MedMNIST v2 and CIFAR-10.
Self-Attention Mechanisms (SAMs) enhance model performance by extracting key information, but they are inefficient on long sequences. To address this, a Photonic Gaussian Kernel Enhanced Transformer (PGKET) is proposed, built on the Photonic Gaussian Kernel Self-Attention Mechanism (PGKSAM). The PGKSAM computes the Photonic Gaussian Kernel Self-Attention Score (PGKSAS) using photon interferometry and superposition, which allows multiple inputs to be processed in parallel. Experimental results show that PGKET outperforms some state-of-the-art transformers on multi-class classification tasks on MedMNIST v2 and CIFAR-10. PGKET is expected to improve performance on complex tasks and to accelerate the convergence of Photonic Computing (PC) and machine learning.
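To make the idea of a Gaussian kernel attention score concrete, the following is a minimal classical sketch in Python. It is not the photonic implementation: the text above does not give the PGKSAS formula, so the standard Gaussian (RBF) kernel exp(-||q - k||^2 / (2 sigma^2)), the bandwidth parameter `sigma`, the row normalization, and the function names `gaussian_kernel_scores` and `gaussian_kernel_attention` are all assumptions made for illustration. Learned query, key, and value projections are omitted for brevity.

```python
import math


def gaussian_kernel_scores(queries, keys, sigma=1.0):
    """Row-normalized Gaussian (RBF) kernel scores between queries and keys.

    Assumed form (not taken from the paper): exp(-||q - k||^2 / (2 * sigma^2)).
    """
    scores = []
    for q in queries:
        row = [
            math.exp(-sum((qi - ki) ** 2 for qi, ki in zip(q, k)) / (2.0 * sigma ** 2))
            for k in keys
        ]
        # Normalize so each query's weights sum to 1.
        # Assumes at least one score in the row has not underflowed to zero.
        total = sum(row)
        scores.append([s / total for s in row])
    return scores


def gaussian_kernel_attention(queries, keys, values, sigma=1.0):
    """Weighted sum of value vectors using the Gaussian kernel scores."""
    weights = gaussian_kernel_scores(queries, keys, sigma)
    dim = len(values[0])
    return [
        [sum(w * v[d] for w, v in zip(w_row, values)) for d in range(dim)]
        for w_row in weights
    ]


# Toy self-attention: three 2-dimensional tokens act as queries, keys, and values.
tokens = [[0.0, 0.0], [1.0, 0.0], [0.0, 3.0]]
weights = gaussian_kernel_scores(tokens, tokens, sigma=1.0)
output = gaussian_kernel_attention(tokens, tokens, tokens, sigma=1.0)
```

In this sketch each token attends most strongly to itself and to nearby tokens, since the score decays with squared Euclidean distance. Computing all pairwise scores classically costs time quadratic in the sequence length, which is the inefficiency on long sequences that the photonic mechanism is proposed to address.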