Jérémie Laydevant

NE
h-index59
8papers
166citations
Novelty56%
AI Score49

8 Papers

OPTICSJul 28, 2023
Quantum-limited stochastic optical neural networks operating at a few quanta per activation

Shi-Yuan Ma, Tianyu Wang, Jérémie Laydevant et al.

Energy efficiency in computation is ultimately limited by noise, with quantum limits setting the fundamental noise floor. Analog physical neural networks hold promise for improved energy efficiency compared to digital electronic neural networks. However, they are typically operated in a relatively high-power regime so that the signal-to-noise ratio (SNR) is large, and the noise can be treated as a perturbation. We study optical neural networks where all layers except the last are operated in the limit that each neuron can be activated by just a single photon, and as a result the noise on neuron activations is no longer merely perturbative. We show that by using a physics-based probabilistic model of the neuron activations in training, it is possible to perform accurate machine-learning inference in spite of the extremely high shot noise (SNR ~ 1). We experimentally demonstrated MNIST handwritten-digit classification with a test accuracy of 98% using an optical neural network with a hidden layer operating in the single-photon regime; the optical energy used to perform the classification corresponds to just 0.038 photons per multiply-accumulate (MAC) operation. Our physics-aware stochastic training approach might also prove useful with non-optical ultra-low-power hardware.

25.2CVApr 13
Ultra-low-light computer vision using trained photon correlations

Mandar M. Sohoni, Jérémie Laydevant, Mathieu Ouellet et al.

Illumination using correlated photon sources has been established as an approach to allowing high-fidelity images to be reconstructed from noisy camera frames by taking advantage of the knowledge that signal photons are spatially correlated whereas detector clicks due to noise are uncorrelated. However, in computer-vision tasks, the goal is often not ultimately to reconstruct an image, but to make inferences about a scene -- such as what object is present. Here we show how correlated-photon illumination can be used to gain an advantage in a hybrid optical-electronic computer-vision pipeline for object recognition. We demonstrate correlation-aware training (CAT): end-to-end optimization of a trainable correlated-photon illumination source and a Transformer backend in a way that the Transformer can learn to benefit from the correlations, using a small number (<= 100) of shots. We show a classification accuracy enhancement of up to 15 percentage points over conventional, uncorrelated-illumination-based computer vision in ultra-low-light and noisy imaging conditions, as well as an improvement over using untrained correlated-photon illumination. Our work illustrates how specializing to a computer-vision task -- object recognition -- and training the pattern of photon correlations in conjunction with a digital backend allows us to push the limits of accuracy in highly photon-budget-constrained scenarios beyond existing methods focused on image reconstruction.

90.0ETApr 18
A fully parallel densely connected probabilistic Ising machine with inertia for real-time applications

Ruomin Zhu, Abhishek Kumar Singh, Jérémie Laydevant et al.

Ising machines -- special-purpose hardware for heuristically solving Ising optimization problems -- based on probabilistic bits (p-bits) have been established as a promising alternative to heuristic optimization algorithms run on conventional computers. However, it has -- until now -- been thought that Ising spins that are connected in probabilistic Ising machines cannot be updated in parallel without ruining the machine's solving ability. This has been a major challenge for using probabilistic Ising machines as fast solvers for densely connected problems. Here, we circumvent this by introducing a modified Ising spin dynamics with an added inertia term, and verify in algorithm simulations, FPGA hardware emulation, and FPGA experiments that it enables fully parallel, synchronous updates while improving rather than degrading success probability. We evaluated on various types of abstract (Max-Cut and Sherrington-Kirkpatrick-model) and application-derived (MIMO, wireless detection) dense Ising benchmark instances. Performing fully parallel updates results in a speed advantage that grows faster than linearly with the number of spins, giving rise to large time-to-solution increases for practical problem sizes. For both Max-Cut and the SK-1 model at a problem size of 200, our approach achieved an average speedup of $\approx 35\times$, with the best single-instance speedup reaching $150\times$. As an example of the practical utility of our approach in an application where speed is critical, we further show by co-designing the algorithm dynamics with the hardware implementation -- co-optimizing for solver ability and silicon resource usage -- that probabilistic Ising machines based on our approach satisfy the stringent solution quality and latency/throughput requirements for real-time MIMO detection in modern 5G cellular wireless networks while using a practically reasonable silicon area.

62.8OPTICSMay 11
Measurement-Adapted Eigentask Representations for Photon-Limited Optical Readout

Tianyang Chen, Mandar M. Sohoni, Saeed A. Khan et al.

Optical readout in low-light imaging is fundamentally limited by measurement noise, including photon shot noise, detector noise, and quantization error. In this regime, downstream inference depends not only on the optical front end, but also on how noisy high-dimensional sensor measurements are represented before classification or decision-making. Here we show that eigentasks provide a measurement-adapted representation for optical sensor outputs by ordering readout features according to their resolvability under noise. Using experimental data from a lens-based optical imaging system and a reanalysis of published data from a single-photon-detection neural network, we find that eigentask representations frequently outperform standard baselines including principal component analysis and filtering-based compression. The advantage is most pronounced in photon-limited, few-shot, and higher-difficulty classification regimes. In few-shot MPEG-7 classification, for example, the advantage over other methods reaches about 10 percentage points as the number of classes increases. In these settings, eigentasks yield more informative low-dimensional features and improve sample-efficient downstream learning. These results identify measurement-adapted representation as a promising strategy for optical inference when photon budget, acquisition time, and task complexity are constrained.

NEMar 18, 2024
Unsupervised End-to-End Training with a Self-Defined Target

Dongshu Liu, Jérémie Laydevant, Adrien Pontlevy et al.

Designing algorithms for versatile AI hardware that can learn on the edge using both labeled and unlabeled data is challenging. Deep end-to-end training methods incorporating phases of self-supervised and supervised learning are accurate and adaptable to input data but self-supervised learning requires even more computational and memory resources than supervised learning, too high for current embedded hardware. Conversely, unsupervised layer-by-layer training, such as Hebbian learning, is more compatible with existing hardware but does not integrate well with supervised learning. To address this, we propose a method enabling networks or hardware designed for end-to-end supervised learning to also perform high-performance unsupervised learning by adding two simple elements to the output layer: Winner-Take-All (WTA) selectivity and homeostasis regularization. These mechanisms introduce a "self-defined target" for unlabeled data, allowing purely unsupervised training for both fully-connected and convolutional layers using backpropagation or equilibrium propagation on datasets like MNIST (up to 99.2%), Fashion-MNIST (up to 90.3%), and SVHN (up to 81.5%). We extend this method to semi-supervised learning, adjusting targets based on data type, achieving 96.6% accuracy with only 600 labeled MNIST samples in a multi-layer perceptron. Our results show that this approach can effectively enable networks and hardware initially dedicated to supervised learning to also perform unsupervised learning, adapting to varying availability of labeled data.

NEMay 22, 2023
Training an Ising Machine with Equilibrium Propagation

Jérémie Laydevant, Danijela Markovic, Julie Grollier

Ising machines, which are hardware implementations of the Ising model of coupled spins, have been influential in the development of unsupervised learning algorithms at the origins of Artificial Intelligence (AI). However, their application to AI has been limited due to the complexities in matching supervised training methods with Ising machine physics, even though these methods are essential for achieving high accuracy. In this study, we demonstrate a novel approach to train Ising machines in a supervised way through the Equilibrium Propagation algorithm, achieving comparable results to software-based implementations. We employ the quantum annealing procedure of the D-Wave Ising machine to train a fully-connected neural network on the MNIST dataset. Furthermore, we demonstrate that the machine's connectivity supports convolution operations, enabling the training of a compact convolutional network with minimal spins per neuron. Our findings establish Ising machines as a promising trainable hardware platform for AI, with the potential to enhance machine learning applications.

NEMar 16, 2021
Training Dynamical Binary Neural Networks with Equilibrium Propagation

Jérémie Laydevant, Maxence Ernoult, Damien Querlioz et al.

Equilibrium Propagation (EP) is an algorithm intrinsically adapted to the training of physical networks, thanks to the local updates of weights given by the internal dynamics of the system. However, the construction of such a hardware requires to make the algorithm compatible with existing neuromorphic CMOS technologies, which generally exploit digital communication between neurons and offer a limited amount of local memory. In this work, we demonstrate that EP can train dynamical networks with binary activations and weights. We first train systems with binary weights and full-precision activations, achieving an accuracy equivalent to that of full-precision models trained by standard EP on MNIST, and losing only 1.9% accuracy on CIFAR-10 with equal architecture. We then extend our method to the training of models with binary activations and weights on MNIST, achieving an accuracy within 1% of the full-precision reference for fully connected architectures and reaching the full-precision accuracy for convolutional architectures. Our extension of EP to binary networks opens new solutions for on-chip learning and provides a compact framework for training BNNs end-to-end with the same circuitry as for inference.

NEOct 15, 2020
EqSpike: Spike-driven Equilibrium Propagation for Neuromorphic Implementations

Erwann Martin, Maxence Ernoult, Jérémie Laydevant et al.

Finding spike-based learning algorithms that can be implemented within the local constraints of neuromorphic systems, while achieving high accuracy, remains a formidable challenge. Equilibrium Propagation is a promising alternative to backpropagation as it only involves local computations, but hardware-oriented studies have so far focused on rate-based networks. In this work, we develop a spiking neural network algorithm called EqSpike, compatible with neuromorphic systems, which learns by Equilibrium Propagation. Through simulations, we obtain a test recognition accuracy of 97.6% on MNIST, similar to rate-based Equilibrium Propagation, and comparing favourably to alternative learning techniques for spiking neural networks. We show that EqSpike implemented in silicon neuromorphic technology could reduce the energy consumption of inference and training respectively by three orders and two orders of magnitude compared to GPUs. Finally, we also show that during learning, EqSpike weight updates exhibit a form of Spike Timing Dependent Plasticity, highlighting a possible connection with biology.