CV · OPTICS · Nov 5, 2024

Transferable polychromatic optical encoder for neural networks

arXiv:2411.02697v1 · 13 citations · h-index: 7 · Nat Commun
Originality: Highly original
AI Analysis

This paper addresses the problem of real-time operation and energy efficiency in computer vision systems with a hybrid optical/digital solution that has potential for broad application, though the combination of optical and digital methods is itself incremental.

The paper tackles the high computational demands of artificial neural networks for image processing by introducing an optical encoder that performs convolution in three color channels during image capture, achieving a ~24,000-fold reduction in operations with ~73.2% classification accuracy on CIFAR-10 and transferring without modification to the ImageNet subset High-10.

Artificial neural networks (ANNs) have fundamentally transformed the field of computer vision, providing unprecedented performance. However, ANNs for image processing demand substantial computational resources, often hindering real-time operation. In this paper, we demonstrate an optical encoder that performs convolution simultaneously in three color channels during image capture, effectively implementing several initial convolutional layers of an ANN. This optical encoding yields a ~24,000-fold reduction in computational operations, with state-of-the-art classification accuracy (~73.2%) for a free-space optical system. In addition, our analog optical encoder, trained on CIFAR-10 data, can be transferred to the ImageNet subset High-10 without any modifications and still exhibits moderate accuracy. Our results demonstrate the potential of hybrid optical/digital computer vision systems in which the optical frontend pre-processes an ambient scene to reduce the energy and latency of the whole system.
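
To make the idea of per-channel optical convolution concrete, the following is a minimal numerical sketch, not the authors' implementation. It assumes incoherent illumination, so that each color channel of the captured image is the scene intensity convolved with a non-negative point-spread function (PSF). The function name `optical_encode`, the 7×7 kernel size, and the random kernel values are hypothetical placeholders chosen only for illustration; the 32×32×3 input matches the CIFAR-10 image size.

```python
import numpy as np


def optical_encode(image, psfs):
    """Convolve each color channel with its own PSF ("valid" region only).

    image: (H, W, 3) array of non-negative intensities.
    psfs:  (k, k, 3) array, one hypothetical non-negative kernel per channel.
    Returns an (H-k+1, W-k+1, 3) array.
    """
    h, w, c = image.shape
    k = psfs.shape[0]
    out_h, out_w = h - k + 1, w - k + 1
    out = np.zeros((out_h, out_w, c))
    for ch in range(c):
        # Flip the kernel so the sliding sum below is a true convolution.
        kernel = psfs[::-1, ::-1, ch]
        for i in range(k):
            for j in range(k):
                out[:, :, ch] += kernel[i, j] * image[i:i + out_h, j:j + out_w, ch]
    return out


# Usage with placeholder data: a random CIFAR-10-sized image and random PSFs.
rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))
psfs = rng.random((7, 7, 3))  # assumed kernel size, not taken from the paper
encoded = optical_encode(image, psfs)
print(encoded.shape)  # (26, 26, 3)
```

In a hybrid system of the kind the abstract describes, an array like `encoded` would come directly from the sensor, and only the remaining digital layers would cost computation. The loop above stands in for work that the optics perform during image capture.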
