Extraction of linearized models from pre-trained networks via knowledge distillation
This work addresses the need for linearized models on hardware such as photonic integrated circuits, but the contribution is incremental because it builds on existing Koopman and distillation methods.
The authors address the problem of building machine learning models that use only linear operations after a simple nonlinear preprocessing step. They propose a framework that extracts a linearized model from a pre-trained neural network by combining Koopman operator theory with knowledge distillation. On the MNIST and Fashion-MNIST datasets, the proposed model consistently outperformed the conventional least-squares-based Koopman approximation in both classification accuracy and numerical stability.
Recent developments in hardware, such as photonic integrated circuits and optical devices, are driving demand for research on constructing machine learning architectures tailored for linear operations. Hence, it is valuable to explore methods for constructing learning machines with only linear operations after simple nonlinear preprocessing. In this study, we propose a framework to extract a linearized model from a pre-trained neural network for classification tasks by integrating Koopman operator theory with knowledge distillation. Numerical demonstrations on the MNIST and the Fashion-MNIST datasets reveal that the proposed model consistently outperforms the conventional least-squares-based Koopman approximation in both classification accuracy and numerical stability.
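To make the idea of "linear operations after simple nonlinear preprocessing" concrete, the following is a minimal sketch of distilling a teacher's outputs into a single linear map applied to fixed nonlinear features. It is not the authors' method: the abstract does not specify the loss, the preprocessing, or how the Koopman operator enters, so the temperature-scaled KL loss, the `lift` features, and the toy `teacher` function are all assumptions chosen for illustration.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def lift(x):
    """Hypothetical fixed nonlinear preprocessing: raw inputs, tanh features, bias."""
    return list(x) + [math.tanh(v) for v in x] + [1.0]


def teacher(x):
    """Hypothetical stand-in for the logits of a pre-trained network."""
    return [math.tanh(2.0 * x[0]), x[0] * x[1], -x[0]]


def linear(W, phi):
    """The student: one matrix-vector product on the lifted features."""
    return [sum(w * p for w, p in zip(row, phi)) for row in W]


def distill_loss(W, inputs, teacher_logits, temperature):
    """Mean KL(teacher || student) between temperature-softened distributions."""
    total = 0.0
    for x, t in zip(inputs, teacher_logits):
        p = softmax(t, temperature)
        q = softmax(linear(W, lift(x)), temperature)
        total += sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return total / len(inputs)


def distill_step(W, inputs, teacher_logits, temperature, lr):
    """One full-batch gradient-descent step on the distillation loss."""
    grad = [[0.0] * len(W[0]) for _ in W]
    n = len(inputs)
    for x, t in zip(inputs, teacher_logits):
        phi = lift(x)
        p = softmax(t, temperature)
        q = softmax(linear(W, phi), temperature)
        for k in range(len(W)):
            # d KL / d student_logit_k = (q_k - p_k) / temperature
            coeff = (q[k] - p[k]) / temperature
            for j, phi_j in enumerate(phi):
                grad[k][j] += coeff * phi_j / n
    return [[w - lr * g for w, g in zip(row_w, row_g)]
            for row_w, row_g in zip(W, grad)]


random.seed(0)
inputs = [[random.uniform(-1, 1), random.uniform(-1, 1)] for _ in range(64)]
teacher_logits = [teacher(x) for x in inputs]
T = 2.0  # assumed distillation temperature

# Student weights: 3 classes by 5 lifted features, initialized to zero.
W = [[0.0] * len(lift(inputs[0])) for _ in range(3)]
initial_loss = distill_loss(W, inputs, teacher_logits, T)
for _ in range(200):
    W = distill_step(W, inputs, teacher_logits, T, lr=0.5)
final_loss = distill_loss(W, inputs, teacher_logits, T)
print(f"distillation loss: {initial_loss:.4f} -> {final_loss:.4f}")
```

After training, inference needs only `lift` followed by one matrix-vector product, which is the property that makes such a model attractive for linear hardware. A least-squares fit of the same linear map would be the analogue of the baseline the abstract compares against.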