ET · May 18, 2022
Single-Shot Optical Neural Network
Liane Bernstein, Alexander Sludds, Christopher Panuski et al.
As deep neural networks (DNNs) grow to solve increasingly complex problems, they are becoming limited by the latency and power consumption of existing digital processors. For improved speed and energy efficiency, specialized analog optical and electronic hardware has been proposed, but with limited scalability (input vector length $K$ of hundreds of elements). Here, we present a scalable, single-shot-per-layer analog optical processor that uses free-space optics to reconfigurably distribute an input vector and integrated optoelectronics for static, updatable weighting and the nonlinearity -- with $K \approx 1,000$ and beyond. We experimentally test classification accuracy on the MNIST handwritten digit dataset, achieving 94.7% (ground truth 96.3%) without data preprocessing or retraining on the hardware. We also determine the fundamental upper bound on throughput ($\sim$0.9 exaMAC/s), set by the maximum optical bandwidth before significant increase in error. Our combination of wide spectral and spatial bandwidths in a CMOS-compatible system enables highly efficient computing for next-generation DNNs.
ET · Jul 8, 2022
RF-Photonic Deep Learning Processor with Shannon-Limited Data Movement
Ronald Davis, Zaijun Chen, Ryan Hamerly et al.
Edholm's Law predicts exponential growth in data rate and spectrum bandwidth for communications and is forecast to remain true for the upcoming deployment of 6G. Compounding this issue is the exponentially increasing demand for deep neural network (DNN) compute, including DNNs for signal processing. However, the slowing of Moore's Law due to the limitations of transistor-based electronics means that completely new paradigms for computing will be required to meet these increasing demands for advanced communications. Optical neural networks (ONNs) are promising DNN accelerators with ultra-low latency and energy consumption. Yet state-of-the-art ONNs struggle with scalability and with combining linear operations with in-line nonlinear operations. Here we introduce our multiplicative analog frequency transform ONN (MAFT-ONN) that encodes the data in the frequency domain, achieves matrix-vector products in a single shot using photoelectric multiplication, and uses a single electro-optic modulator for the nonlinear activation of all neurons in each layer. We experimentally demonstrate the first hardware accelerator that computes fully-analog deep learning on raw RF signals, performing single-shot modulation classification with 85% accuracy, where a 'majority vote' multi-measurement scheme can boost the accuracy to 95% within 5 consecutive measurements. In addition, we demonstrate frequency-domain finite impulse response (FIR) linear-time-invariant (LTI) operations, enabling a powerful combination of traditional and AI signal processing. We also demonstrate the scalability of our architecture by computing nearly 4 million fully-analog multiplies-and-accumulates for MNIST digit classification. Our latency estimation model shows that due to the Shannon capacity-limited analog data movement, MAFT-ONN is hundreds of times faster than traditional RF receivers operating at their theoretical peak performance.
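As a rough illustration of why a majority vote over repeated measurements can raise accuracy, the sketch below computes the probability that a strict majority of trials is correct. It assumes every measurement is independent and correct with the same probability, and it treats any non-majority outcome as an error. Both are simplifying assumptions that the abstract does not state, so the idealized number need not match the measured 95%, especially since real errors may be correlated and the task is multi-class. The function name `majority_vote_accuracy` is hypothetical.

```python
from math import comb


def majority_vote_accuracy(p_single: float, n_votes: int) -> float:
    """Probability that a strict majority of n_votes independent trials is correct.

    Assumes each trial is correct with probability p_single, independently
    of the others (an idealization, not a claim about the hardware).
    """
    needed = n_votes // 2 + 1  # smallest count that forms a strict majority
    return sum(
        comb(n_votes, k) * p_single**k * (1.0 - p_single) ** (n_votes - k)
        for k in range(needed, n_votes + 1)
    )


if __name__ == "__main__":
    # Single-shot accuracy of 0.85 is taken from the abstract.
    for n in (1, 3, 5):
        print(n, round(majority_vote_accuracy(0.85, n), 4))
```

An odd number of votes avoids ties in this binary correct/incorrect model.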
QUANT-PH · Aug 10, 2024
Quantum-secure multiparty deep learning
Kfir Sulimany, Sri Krishna Vadlamani, Ryan Hamerly et al.
Secure multiparty computation enables the joint evaluation of multivariate functions across distributed users while ensuring the privacy of their local inputs. This field has become increasingly urgent due to the exploding demand for computationally intensive deep learning inference. These computations are typically offloaded to cloud computing servers, leading to vulnerabilities that can compromise the security of the clients' data. To solve this problem, we introduce a linear algebra engine that leverages the quantum nature of light for information-theoretically secure multiparty computation using only conventional telecommunication components. We apply this linear algebra engine to deep learning and derive rigorous upper bounds on the information leakage of both the deep neural network weights and the client's data via the Holevo and the Cramér-Rao bounds, respectively. Applied to the MNIST classification task, we obtain test accuracies exceeding $96\%$ while leaking less than $0.1$ bits per weight symbol and $0.01$ bits per data symbol. This weight leakage is an order of magnitude below the minimum bit precision required for accurate deep learning using state-of-the-art quantization techniques. Our work lays the foundation for practical quantum-secure computation and unlocks secure cloud deep learning as a field.
SY · Feb 23 · Code
Agentic AI for Scalable and Robust Optical Systems Control
Zehao Wang, Mingzhe Han, Wei Cheng et al.
We present AgentOptics, an agentic AI framework for high-fidelity, autonomous optical system control built on the Model Context Protocol (MCP). AgentOptics interprets natural language tasks and executes protocol-compliant actions on heterogeneous optical devices through a structured tool abstraction layer. We implement 64 standardized MCP tools across 8 representative optical devices and construct a 410-task benchmark to evaluate request understanding, role-aware responses, multi-step coordination, robustness to linguistic variation, and error handling. We assess two deployment configurations--commercial online LLMs and locally hosted open-source LLMs--and compare them with LLM-based code generation baselines. AgentOptics achieves 87.7%--99.0% average task success rates, significantly outperforming code-generation approaches, which reach up to 50% success. We further demonstrate broader applicability through five case studies extending beyond device-level control to system orchestration, monitoring, and closed-loop optimization. These include DWDM link provisioning and coordinated monitoring of coherent 400 GbE and analog radio-over-fiber (ARoF) channels; autonomous characterization and bias optimization of a wideband ARoF link carrying 5G fronthaul traffic; multi-span channel provisioning with launch power optimization; closed-loop fiber polarization stabilization; and distributed acoustic sensing (DAS)-based fiber monitoring with LLM-assisted event detection. These results establish AgentOptics as a scalable, robust paradigm for autonomous control and orchestration of heterogeneous optical systems.
QUANT-PH · Mar 26
T Count as a Numerically Solvable Minimization Problem
Marc Grau Davis, Ed Younis, Mathias Weiden et al.
We present a formulation of the problem of finding the smallest T-Count circuit that implements a given unitary as a binary search over a sequence of continuous minimization problems, and demonstrate that these problems are numerically solvable in practice. We reproduce best-known results for synthesis of circuits with a small number of qubits, and push the bounds of the largest circuits that can be solved for in this way. Additionally, we show that circuit partitioning can adapt this technique to optimize the T-Count of circuits with large numbers of qubits by breaking the circuit into a series of smaller sub-circuits that can be optimized independently.
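The outer loop described here, a binary search over candidate T-Counts, can be sketched as follows. This is only a structural sketch: `is_feasible` is a hypothetical stand-in for "a continuous minimization with at most `t` T gates reaches a cost below tolerance", and the code assumes feasibility is monotone in `t` and that `upper_bound` is itself feasible. None of these names or assumptions come from the paper.

```python
from typing import Callable


def minimal_t_count(is_feasible: Callable[[int], bool], upper_bound: int) -> int:
    """Return the smallest t in [0, upper_bound] for which is_feasible(t) holds.

    Assumes is_feasible is monotone (False ... False True ... True) and that
    is_feasible(upper_bound) is True. Each call would wrap one continuous
    minimization problem in a real synthesis flow.
    """
    lo, hi = 0, upper_bound
    while lo < hi:
        mid = (lo + hi) // 2
        if is_feasible(mid):
            hi = mid  # a circuit with mid T gates exists; try fewer
        else:
            lo = mid + 1  # mid T gates are not enough; need more
    return lo


if __name__ == "__main__":
    # Toy oracle: pretend the target unitary needs at least 7 T gates.
    print(minimal_t_count(lambda t: t >= 7, upper_bound=20))
```

The search needs only a logarithmic number of oracle calls in `upper_bound`, which matters when each call is an expensive numerical optimization.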
ET · Apr 24, 2025
Disaggregated Deep Learning via In-Physics Computing at Radio Frequency
Zhihui Gao, Sri Krishna Vadlamani, Kfir Sulimany et al.
Modern edge devices, such as cameras, drones, and Internet-of-Things nodes, rely on deep learning to enable a wide range of intelligent applications, including object recognition, environment perception, and autonomous navigation. However, deploying deep learning models directly on the often resource-constrained edge devices demands significant memory footprints and computational power for real-time inference using traditional digital computing architectures. In this paper, we present WISE, a novel computing architecture for wireless edge networks designed to overcome energy constraints in deep learning inference. WISE achieves this goal through two key innovations: disaggregated model access via wireless broadcasting and in-physics computation of general complex-valued matrix-vector multiplications directly at radio frequency. Using a software-defined radio platform with model weights broadcast wirelessly, we demonstrate that WISE achieves 95.7% image classification accuracy with ultra-low operating energy of 6.0 fJ/MAC per client, corresponding to a computation efficiency of 165.8 TOPS/W. This approach enables energy-efficient deep learning inference on wirelessly connected edge devices, achieving more than two orders of magnitude improvement in efficiency compared to traditional digital computing.
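The relation between the two efficiency figures quoted above can be checked with a short unit conversion. The sketch assumes one MAC is counted as one operation, which is a convention the abstract does not state; under that assumption 6.0 fJ per operation corresponds to roughly 167 TOPS/W, and 165.8 TOPS/W corresponds to about 6.03 fJ, so the two figures agree to rounding. The function names are hypothetical.

```python
def tops_per_watt(energy_fj_per_op: float) -> float:
    """Convert energy per operation (femtojoules) to tera-operations per second per watt.

    One watt is one joule per second, so operations per joule equals
    operations per second per watt.
    """
    joules_per_op = energy_fj_per_op * 1e-15
    ops_per_joule = 1.0 / joules_per_op
    return ops_per_joule / 1e12  # scale to tera-operations


def fj_per_op(efficiency_tops_per_watt: float) -> float:
    """Inverse conversion: TOPS/W back to femtojoules per operation."""
    return 1e3 / efficiency_tops_per_watt


if __name__ == "__main__":
    print(round(tops_per_watt(6.0), 1))  # efficiency implied by 6.0 fJ per operation
    print(round(fj_per_op(165.8), 2))    # energy implied by 165.8 TOPS/W
```

If a MAC were instead counted as two operations (one multiply and one add), the TOPS/W figure would double for the same energy per MAC.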
AI · Oct 14, 2025
Ax-Prover: A Deep Reasoning Agentic Framework for Theorem Proving in Mathematics and Quantum Physics
Benjamin Breen, Marco Del Tredici, Jacob McCarran et al.
We present Ax-Prover, a multi-agent system for automated theorem proving in Lean that can solve problems across diverse scientific domains and operate either autonomously or collaboratively with human experts. To achieve this, Ax-Prover approaches scientific problem solving through formal proof generation, a process that demands both creative reasoning and strict syntactic rigor. Ax-Prover meets this challenge by equipping Large Language Models (LLMs), which provide knowledge and reasoning, with Lean tools via the Model Context Protocol (MCP), which ensure formal correctness. To evaluate its performance as an autonomous prover, we benchmark our approach against frontier LLMs and specialized prover models on two public math benchmarks and on two Lean benchmarks we introduce in the fields of abstract algebra and quantum theory. On public datasets, Ax-Prover is competitive with state-of-the-art provers, while it largely outperforms them on the new benchmarks. This shows that, unlike specialized systems that struggle to generalize, our tool-based agentic theorem prover approach offers a generalizable methodology for formal verification across diverse scientific domains. Furthermore, we demonstrate Ax-Prover's assistant capabilities in a practical use case, showing how it enabled an expert mathematician to formalize the proof of a complex cryptography theorem.
APP-PH · Sep 19, 2025
LightCode: Compiling LLM Inference for Photonic-Electronic Systems
Ryan Tomich, Zhizhen Zhong, Dirk Englund
The growing demand for low-latency, energy-efficient inference in large language models (LLMs) has catalyzed interest in heterogeneous architectures. While GPUs remain dominant, they are poorly suited for integration with emerging domain-specific accelerators such as Photonic Tensor Units (PTUs), which offer low-power, high-throughput linear computation. This motivates hybrid compilation strategies that combine photonic and electronic resources. We present LightCode, a compiler framework and simulator for mapping LLM inference workloads across hybrid photonic-electronic systems. LightCode introduces the Stacked Graph, an intermediate representation that encodes multiple hardware-specific realizations of each tensor operation. Hardware assignment is formulated as a constrained subgraph selection problem optimized for latency or energy under parametric cost models. We evaluate LightCode on the prefill stage of GPT-2 and Llama-7B, showing that under our workload and hardware assumptions, (i) photonic hardware reduced energy by up to 50% in our simulated workloads at maximum sequence length; (ii) multiplexing and assignment strategy yielded latency improvements exceeding 10x; and (iii) optimizing for latency or energy resulted in distinct hardware mappings in our simulations. LightCode offers a modular, foundational framework and simulator for compiling LLMs to emerging photonic accelerators.
ET · Jun 13, 2025
Machine Intelligence on Wireless Edge Networks
Sri Krishna Vadlamani, Kfir Sulimany, Zhihui Gao et al.
Machine intelligence on edge devices enables low-latency processing and improved privacy, but is often limited by the energy and delay of moving and converting data. Current systems frequently avoid local model storage by sending queries to a server, incurring uplink cost, network latency, and privacy risk. We present the opposite approach: broadcasting model weights to clients that perform inference locally using in-physics computation inside the radio receive chain. A base station transmits weights as radio frequency (RF) waveforms; the client encodes activations onto the waveform and computes the result using existing mixer and filter stages, RF components already present in billions of edge devices such as cellphones, eliminating repeated signal conversions and extra hardware. Analysis shows that thermal noise and nonlinearity create an optimal energy window for accurate analog inner products. Hardware-tailored training through a differentiable RF chain preserves accuracy within this regime. Circuit-informed simulations, consistent with a companion experiment, demonstrate reduced memory and conversion overhead while maintaining high accuracy in realistic wireless edge scenarios.
QUANT-PH · Sep 23, 2021
Quantum algorithms for group convolution, cross-correlation, and equivariant transformations
Grecia Castelazo, Quynh T. Nguyen, Giacomo De Palma et al.
Group convolutions and cross-correlations, which are equivariant to the actions of group elements, are commonly used in mathematics to analyze or take advantage of symmetries inherent in a given problem setting. Here, we provide efficient quantum algorithms for performing linear group convolutions and cross-correlations on data stored as quantum states. Runtimes for our algorithms are logarithmic in the dimension of the group, thus offering an exponential speedup compared to classical algorithms when input data is provided as a quantum state and linear operations are well-conditioned. Motivated by the rich literature on quantum algorithms for solving algebraic problems, our theoretical framework opens a path for quantizing many algorithms in machine learning and numerical methods that employ group operations.
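To make the equivariance property concrete, here is a minimal classical sketch for the simplest case, the cyclic group Z_n, where group convolution reduces to circular convolution. It is a direct O(n^2) illustration of the definition (f * g)(u) = sum over v of f(v) g(u - v), not the quantum algorithm from the paper; the function names are hypothetical.

```python
from typing import List, Sequence


def cyclic_group_convolution(f: Sequence[int], g: Sequence[int]) -> List[int]:
    """Group convolution over Z_n: (f * g)(u) = sum_v f(v) * g((u - v) mod n)."""
    n = len(f)
    if len(g) != n:
        raise ValueError("f and g must be functions on the same group Z_n")
    return [sum(f[v] * g[(u - v) % n] for v in range(n)) for u in range(n)]


def translate(f: Sequence[int], s: int) -> List[int]:
    """Action of group element s on f: (T_s f)(u) = f((u - s) mod n)."""
    n = len(f)
    return [f[(u - s) % n] for u in range(n)]


if __name__ == "__main__":
    f = [1, 2, 0, -1, 3]
    g = [2, 0, 1, 0, 0]
    # Equivariance: translating the input translates the output the same way.
    for s in range(len(f)):
        lhs = cyclic_group_convolution(translate(f, s), g)
        rhs = translate(cyclic_group_convolution(f, g), s)
        print(s, lhs == rhs)
```

Integer inputs are used so that the equality check is exact rather than subject to floating-point rounding.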