Opto-Electronic Convolutional Neural Network Design Via Direct Kernel Optimization
This work addresses efficient and stable optimization of opto-electronic neural networks. The contribution is incremental: it builds on existing methods and adds a novel two-stage approach.
The paper tackles the design of opto-electronic convolutional neural networks with a two-stage strategy: it first trains an electronic CNN, then optimizes the optical front-end directly. Compared with end-to-end training, this reduces computational demands by hundreds of times and achieves twice the accuracy on monocular depth estimation.
Opto-electronic neural networks integrate optical front-ends with electronic back-ends to enable fast and energy-efficient vision. However, conventional end-to-end optimization of both the optical and electronic modules is limited by costly simulations and large parameter spaces. We introduce a two-stage strategy for designing opto-electronic convolutional neural networks (CNNs): first, train a standard electronic CNN; then realize the optical front-end, implemented as a metasurface array, through direct kernel optimization of the CNN's first convolutional layer. Compared to end-to-end optimization, this approach reduces computational and memory demands by hundreds of times and improves training stability. On monocular depth estimation, the proposed two-stage design achieves twice the accuracy of end-to-end training under the same training time and resource constraints.
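The second stage can be pictured as a kernel-matching problem: instead of backpropagating a task loss through an optical simulation, the optical parameters are fitted so that the realized kernel matches a first-layer kernel taken from the already trained CNN. The sketch below is only a toy illustration of that idea, not the paper's metasurface model. It assumes that the optics can produce only non-negative responses, so a signed kernel is represented as the difference of two non-negative parts. The names `fit_kernel` and `kernel_mse`, the squared-error objective, and the example target kernel are all hypothetical.

```python
import random


def fit_kernel(target, steps=500, lr=0.1, seed=0):
    """Fit a signed target kernel as pos - neg with pos, neg >= 0.

    Assumption: pos and neg stand in for two non-negative optical
    responses; a real design would replace them with a metasurface
    simulator and its physical parameters.
    """
    rng = random.Random(seed)
    n = len(target)
    pos = [rng.random() * 0.1 for _ in range(n)]
    neg = [rng.random() * 0.1 for _ in range(n)]
    for _ in range(steps):
        for i in range(n):
            # Residual of the kernel-matching loss 0.5 * err**2.
            err = (pos[i] - neg[i]) - target[i]
            # Projected gradient step: d/dpos = err, d/dneg = -err,
            # followed by clipping to keep both parts non-negative.
            pos[i] = max(0.0, pos[i] - lr * err)
            neg[i] = max(0.0, neg[i] + lr * err)
    return pos, neg


def kernel_mse(pos, neg, target):
    """Mean squared error between the realized kernel and the target."""
    return sum(((p - q) - t) ** 2 for p, q, t in zip(pos, neg, target)) / len(target)


# Hypothetical flattened 3x3 kernel, standing in for one first-layer
# filter of a trained electronic CNN.
target = [-1.0, 0.0, 1.0, -2.0, 0.0, 2.0, -1.0, 0.0, 1.0]
pos, neg = fit_kernel(target)
loss = kernel_mse(pos, neg, target)
```

Because the objective depends only on the kernel itself, each optimization step in this sketch needs neither the training dataset nor the electronic back-end, which is one plausible reading of why the direct approach is cheaper than end-to-end training.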