Addressing Data Scarcity in Optical Matrix Multiplier Modeling Using Transfer Learning
This work addresses experimental data scarcity for researchers modeling optical computing hardware. Its contribution is incremental, since it applies established transfer learning techniques to a specific domain.
The paper tackles data scarcity in modeling optical matrix multipliers with transfer learning: a neural network is pre-trained on synthetic data and then fine-tuned on experimental data, achieving a root-mean-square error below 1 dB on a 3x3 photonic chip with only 25% of the available data.
We present and experimentally evaluate transfer learning as a way to address experimental data scarcity when training neural network (NN) models for Mach-Zehnder interferometer mesh-based optical matrix multipliers. Our approach involves pre-training the model on synthetic data generated from a less accurate analytical model and then fine-tuning it with experimental data. Our investigation demonstrates that this method yields significant reductions in modeling error compared with using an analytical model or a standalone NN model when training data is limited. Using regularization techniques and ensemble averaging, we achieve < 1 dB root-mean-square error on the matrix weights implemented by a 3x3 photonic chip while using only 25% of the available data.
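The workflow described in the abstract (pre-train on synthetic data, fine-tune on scarce measurements, regularize, average an ensemble) can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration and is not taken from the paper: the toy two-phase device, the `analytical_model` and `measured_chip` functions (the latter is a made-up stand-in for experimental data), the one-hidden-layer network, all hyperparameters, and the convention used in `rmse_db` for expressing the error in dB.

```python
import numpy as np

def analytical_model(phases):
    """Hypothetical idealized model: power transmission cos^2(phi/2) per phase shifter."""
    return np.cos(phases / 2.0) ** 2

def measured_chip(phases):
    """Made-up stand-in for measurements: ideal response plus invented crosstalk, loss, and offset."""
    return 0.9 * np.cos((phases + 0.15 * phases[:, ::-1]) / 2.0) ** 2 + 0.02

def to_features(phases):
    """Scale phases to roughly [0, 1] before feeding the network."""
    return phases / np.pi

def init_params(rng, n_in=2, n_hidden=16, n_out=2):
    return {
        "W1": rng.normal(0.0, 0.5, (n_in, n_hidden)), "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.5, (n_hidden, n_out)), "b2": np.zeros(n_out),
    }

def predict(p, x):
    return np.tanh(x @ p["W1"] + p["b1"]) @ p["W2"] + p["b2"]

def train(p, x, y, steps, lr, weight_decay=0.0):
    """Full-batch gradient descent on squared error, with optional L2 weight decay."""
    p = {k: v.copy() for k, v in p.items()}  # leave the caller's parameters untouched
    for _ in range(steps):
        h = np.tanh(x @ p["W1"] + p["b1"])
        err = (h @ p["W2"] + p["b2"] - y) / len(x)  # gradient of the loss w.r.t. the output
        dh = (err @ p["W2"].T) * (1.0 - h ** 2)     # backpropagate through tanh
        grads = {"W1": x.T @ dh, "b1": dh.sum(0), "W2": h.T @ err, "b2": err.sum(0)}
        for k in p:
            decay = weight_decay * p[k] if k.startswith("W") else 0.0
            p[k] -= lr * (grads[k] + decay)
    return p

def rmse_db(pred, target, floor=1e-3):
    """RMSE between weights in dB (assumed convention); predictions are floored so the log is defined."""
    diff = 10 * np.log10(np.clip(pred, floor, None)) - 10 * np.log10(target)
    return float(np.sqrt(np.mean(diff ** 2)))

rng = np.random.default_rng(0)
syn_phases = rng.uniform(0, 0.8 * np.pi, (2000, 2))   # plentiful synthetic data
exp_phases = rng.uniform(0, 0.8 * np.pi, (40, 2))     # scarce "experimental" data
test_phases = rng.uniform(0, 0.8 * np.pi, (200, 2))   # held-out "experimental" data

members = []
for seed in range(3):  # small ensemble: different initializations, same data
    p0 = init_params(np.random.default_rng(seed))
    # Step 1: pre-train on synthetic data from the less accurate analytical model.
    pre = train(p0, to_features(syn_phases), analytical_model(syn_phases), steps=2000, lr=0.03)
    # Step 2: fine-tune on the small measured set, with weight decay as regularization.
    fine = train(pre, to_features(exp_phases), measured_chip(exp_phases),
                 steps=500, lr=0.03, weight_decay=1e-4)
    members.append(predict(fine, to_features(test_phases)))

ensemble_pred = np.mean(members, axis=0)  # ensemble averaging of member predictions
print("ensemble RMSE [dB]:", rmse_db(ensemble_pred, measured_chip(test_phases)))
print("analytical-only RMSE [dB]:", rmse_db(analytical_model(test_phases), measured_chip(test_phases)))
```

Fine-tuning starts from the pre-trained parameters and does not reinitialize them, which is what distinguishes this from training a standalone network on the 40 measured points. The printed numbers describe only this toy setup and say nothing about the accuracy reported for the actual chip.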