OPTICS NE COMP-PH Oct 24, 2020

Scale-, shift- and rotation-invariant diffractive optical networks

arXiv:2010.12747v1, 75 citations
Originality: Incremental advance
AI Analysis

This work addresses a key limitation of optical neural networks in applications such as autonomous cars and biomedical imaging. It is an incremental advance, since it builds on existing diffractive network designs.

The paper tackles the sensitivity of diffractive optical networks to spatial transformations of input objects. It introduces a training strategy that applies random scaling, translation, and rotation to the inputs during training. The resulting networks are invariant to these transformations, which improves their performance in dynamic machine vision applications.

Recent research efforts in optical computing have gravitated towards developing optical neural networks that aim to benefit from the processing speed and parallelism of optics/photonics in machine learning applications. Among these endeavors, Diffractive Deep Neural Networks (D2NNs) harness light-matter interaction over a series of trainable surfaces, designed using deep learning, to compute a desired statistical inference task as the light waves propagate from the input plane to the output field-of-view. Although earlier studies have demonstrated the generalization capability of diffractive optical networks to unseen data, achieving, e.g., >98% image classification accuracy for handwritten digits, these previous designs are in general sensitive to the spatial scaling, translation and rotation of the input objects. Here, we demonstrate a new training strategy for diffractive networks that introduces input object translation, rotation and/or scaling during the training phase as uniformly distributed random variables to build resilience in their blind inference performance against such object transformations. This training strategy successfully guides the evolution of the diffractive optical network design towards a solution that is scale-, shift- and rotation-invariant, which is especially important and useful for dynamic machine vision applications in, e.g., autonomous cars and in-vivo imaging of biomedical specimens, among others.
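
The training strategy described in the abstract can be illustrated with a minimal sketch of the input perturbation step: each training image is shifted, rotated and scaled by amounts drawn from uniform distributions before it is fed to the network. This is not the authors' implementation. The function name `random_transform`, the parameter ranges and the nearest-neighbor resampling are all assumptions chosen for illustration, and the code uses only the Python standard library so that it stays self-contained.

```python
import math
import random


def random_transform(img, rng, max_shift=4.0, max_angle_deg=20.0,
                     scale_range=(0.8, 1.2)):
    """Return a randomly shifted, rotated and scaled copy of a 2D image.

    `img` is a list of rows. All ranges are illustrative assumptions,
    not values taken from the paper.
    """
    h, w = len(img), len(img[0])

    # Draw every transformation parameter from a uniform distribution.
    dx = rng.uniform(-max_shift, max_shift)        # horizontal shift (pixels)
    dy = rng.uniform(-max_shift, max_shift)        # vertical shift (pixels)
    theta = math.radians(rng.uniform(-max_angle_deg, max_angle_deg))
    s = rng.uniform(*scale_range)                  # isotropic scale factor

    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0          # image center
    cos_t, sin_t = math.cos(theta), math.sin(theta)

    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            # Inverse mapping: undo the shift, then the rotation, then the
            # scaling, to find which input pixel lands on output (r, c).
            x, y = c - cx - dx, r - cy - dy
            src_c = round((cos_t * x + sin_t * y) / s + cx)
            src_r = round((-sin_t * x + cos_t * y) / s + cy)
            # Pixels that map from outside the input stay zero.
            if 0 <= src_r < h and 0 <= src_c < w:
                out[r][c] = img[src_r][src_c]
    return out


# Usage: perturb a toy 28x28 input with a fresh random draw.
rng = random.Random(0)
image = [[1.0 if 10 <= r < 18 and 12 <= c < 16 else 0.0 for c in range(28)]
         for r in range(28)]
augmented = random_transform(image, rng)
```

In a training loop, a new draw would be made for every sample in every batch, so the network never sees the same object pose twice. Setting a range to zero (or the scale range to `(1.0, 1.0)`) disables that transformation, which matches the "and/or" in the abstract.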

