OPTICSJan 16, 2023
Efficient data transport over multimode light-pipes with Megapixel images using differentiable ray tracing and Machine-learningJoowon Lim, Jannes Gladrow, Douglas Kelly et al.
Retrieving images transmitted through multi-mode fibers is of growing interest, thanks to their ability to confine and transport light efficiently in a compact system. Here, we demonstrate machine-learning-based decoding of large-scale digital images (pages), maximizing page capacity for optical storage applications. Using a millimeter-sized square cross-section waveguide, we image an 8-bit spatial light modulator, presenting data as a matrix of symbols. Normally, decoders will incur a prohibitive O(n^2) computational scaling to decode n symbols in spatially scrambled data. However, by combining a digital twin of the setup with a U-Net, we can retrieve up to 66 kB using efficient convolutional operations only. We compare trainable ray-tracing-based with eigenmode-based twins and show the former to be superior thanks to its ability to overcome the simulation-to-experiment gap by adjusting to optical imperfections. We train the pipeline end-to-end using a differentiable mutual-information estimator based on the von-Mises distribution, generally applicable to phase-coding channels.
LGApr 3
Amortized Inference of Causal Models via Conditional Fixed-Point IterationsDivyat Mahajan, Jannes Gladrow, Agrin Hilmkil et al.
Structural Causal Models (SCMs) offer a principled framework to reason about interventions and support out-of-distribution generalization, which are key goals in scientific discovery. However, the task of learning SCMs from observed data poses formidable challenges, and often requires training a separate model for each dataset. In this work, we propose an amortized inference framework that trains a single model to predict the causal mechanisms of SCMs conditioned on their observational data and causal graph. We first use a transformer-based architecture for amortized learning of dataset embeddings, and then extend the Fixed-Point Approach (FiP) to infer the causal mechanisms conditionally on their dataset embeddings. As a byproduct, our method can generate observational and interventional data from novel SCMs at inference time, without updating parameters. Empirical results show that our amortized procedure performs on par with baselines trained specifically for each dataset on both in and out-of-distribution problems, and also outperforms them in scarce data regimes.
LGFeb 10, 2025Code
Implicit Language Models are RNNs: Balancing Parallelization and ExpressivityMark Schöne, Babak Rahmani, Heiner Kremer et al.
State-space models (SSMs) and transformers dominate the language modeling landscape. However, they are constrained to a lower computational complexity than classical recurrent neural networks (RNNs), limiting their expressivity. In contrast, RNNs lack parallelization during training, raising fundamental questions about the trade off between parallelization and expressivity. We propose implicit SSMs, which iterate a transformation until convergence to a fixed point. Theoretically, we show that implicit SSMs implement the non-linear state-transitions of RNNs. Empirically, we find that only approximate fixed-point convergence suffices, enabling the design of a scalable training curriculum that largely retains parallelization, with full convergence required only for a small subset of tokens. Our approach demonstrates superior state-tracking capabilities on regular languages, surpassing transformers and SSMs. We further scale implicit SSMs to natural language reasoning tasks and pretraining of large-scale language models up to 1.3B parameters on 207B tokens representing, to our knowledge, the largest implicit model trained to date. Notably, our implicit models outperform their explicit counterparts on standard benchmarks. Our code is publicly available at http://github.com/microsoft/implicit_languagemodels .
IVNov 3, 2019
Digital phase-only holography using deep conditional generative modelsJannes Gladrow
Holographic wave-shaping has found numerous applications across the physical sciences, especially since the development of digital spatial-light modulators (SLMs). A key challenge in digital holography consists in finding optimal hologram patterns which transform the incoming laser beam into desired shapes in a conjugate optical plane. The existing repertoire of approaches to solve this inverse problem is built on iterative phase-retrieval algorithms, which do not take optical aberrations and deviations from theoretical models into account. Here, we adopt a physics-free, data-driven, and probabilistic approach to the problem. Using deep conditional generative models such as Generative-Adversarial Networks (cGAN) or Variational Autoencoder (cVAE), we approximate conditional distributions of holograms for a given target laser intensity pattern. In order to reduce the cardinality of the problem, we train our models on a proxy mapping relating an 8x8-matrix of complex-valued spatial-frequency coefficients to the ensuing 100x100-shaped intensity distribution recorded on a camera. We discuss the degree of 'ill-posedness' that remains in this reduced problem and compare different generative model architectures in terms of their ability to find holograms that reconstruct given intensity patterns. Finally, we challenge our models to generalise to synthetic target intensities, where the existence of matching holograms cannot be guaranteed. We devise a forward-interpolating training scheme aimed at providing models the ability to interpolate in laser intensity space, rather than hologram space and show that this indeed enhances model performance on synthetic data sets.