Thomas Fischbacher

LG
6papers
261citations
Novelty34%
AI Score34

6 Papers

OCJan 19, 2023
On backpropagating Hessians through ODEs

Axel Ciceri, Thomas Fischbacher

We discuss the problem of numerically backpropagating Hessians through ordinary differential equations (ODEs) in various contexts and elucidate how different approaches may be favourable in specific situations. We discuss both theoretical and pragmatic aspects such as, respectively, bounds on computational effort and typical impact of framework overhead. Focusing on the approach of hand-implemented ODE-backpropagation, we develop the computation for the Hessian of orbit-nonclosure for a mechanical system. We also clarify the mathematical framework for extending the backward-ODE-evolution of the costate-equation to Hessians, in its most generic form. Some calculations, such as that of the Hessian for orbit non-closure, are performed in a language, defined in terms of a formal grammar, that we introduce to facilitate the tracking of intermediate quantities. As pedagogical examples, we discuss the Hessian of orbit-nonclosure for the higher dimensional harmonic oscillator and conceptually related problems in Newtonian gravitational theory. In particular, applying our approach to the figure-8 three-body orbit, we readily rediscover a distorted-figure-8 solution originally described by Simó. Possible applications may include: improvements to training of `neural ODE'- type deep learning with second-order methods, numerical analysis of quantum corrections around classical paths, and, more broadly, studying options for adjusting an ODE's initial configuration such that the impact on some given objective function is small.

DCNov 22, 2025
Towards a future space-based, highly scalable AI infrastructure system design

Blaise Agüera y Arcas, Travis Beals, Maria Biggs et al.

If AI is a foundational general-purpose technology, we should anticipate that demand for AI compute -- and energy -- will continue to grow. The Sun is by far the largest energy source in our solar system, and thus it warrants consideration how future AI infrastructure could most efficiently tap into that power. This work explores a scalable compute system for machine learning in space, using fleets of satellites equipped with solar arrays, inter-satellite links using free-space optics, and Google tensor processing unit (TPU) accelerator chips. To facilitate high-bandwidth, low-latency inter-satellite communication, the satellites would be flown in close proximity. We illustrate the basic approach to formation flight via a 81-satellite cluster of 1 km radius, and describe an approach for using high-precision ML-based models to control large-scale constellations. Trillium TPUs are radiation tested. They survive a total ionizing dose equivalent to a 5 year mission life without permanent failures, and are characterized for bit-flip errors. Launch costs are a critical part of overall system cost; a learning curve analysis suggests launch to low-Earth orbit (LEO) may reach $\lesssim$\$200/kg by the mid-2030s.

LGAug 13, 2020
Single-Photon Image Classification

Thomas Fischbacher, Luciano Sbaiz

Quantum computing-based machine learning mainly focuses on quantum computing hardware that is experimentally challenging to realize due to requiring quantum gates that operate at very low temperature. Instead, we demonstrate the existence of a lower performance and much lower effort island on the accuracy-vs-qubits graph that may well be experimentally accessible with room temperature optics. This high temperature "quantum computing toy model" is nevertheless interesting to study as it allows rather accessible explanations of key concepts in quantum computing, in particular interference, entanglement, and the measurement process. We specifically study the problem of classifying an example from the MNIST and Fashion-MNIST datasets, subject to the constraint that we have to make a prediction after the detection of the very first photon that passed a coherently illuminated filter showing the example. Whereas a classical set-up in which a photon is detected after falling on one of the $28\times 28$ image pixels is limited to a (maximum likelihood estimation) accuracy of $21.27\%$ for MNIST, respectively $18.27\%$ for Fashion-MNIST, we show that the theoretically achievable accuracy when exploiting inference by optically transforming the quantum state of the photon is at least $41.27\%$ for MNIST, respectively $36.14\%$ for Fashion-MNIST. We show in detail how to train the corresponding transformation with TensorFlow and also explain how this example can serve as a teaching tool for the measurement process in quantum mechanics.

LGAug 10, 2020
Intelligent Matrix Exponentiation

Thomas Fischbacher, Iulia M. Comsa, Krzysztof Potempa et al.

We present a novel machine learning architecture that uses the exponential of a single input-dependent matrix as its only nonlinearity. The mathematical simplicity of this architecture allows a detailed analysis of its behaviour, providing robustness guarantees via Lipschitz bounds. Despite its simplicity, a single matrix exponential layer already provides universal approximation properties and can learn fundamental functions of the input, such as periodic functions or multivariate polynomials. This architecture outperforms other general-purpose architectures on benchmark problems, including CIFAR-10, using substantially fewer parameters.

NEJul 30, 2019
Temporal Coding in Spiking Neural Networks with Alpha Synaptic Function: Learning with Backpropagation

Iulia M. Comsa, Krzysztof Potempa, Luca Versari et al.

The timing of individual neuronal spikes is essential for biological brains to make fast responses to sensory stimuli. However, conventional artificial neural networks lack the intrinsic temporal coding ability present in biological networks. We propose a spiking neural network model that encodes information in the relative timing of individual neuron spikes. In classification tasks, the output of the network is indicated by the first neuron to spike in the output layer. This temporal coding scheme allows the supervised training of the network with backpropagation, using locally exact derivatives of the postsynaptic spike times with respect to presynaptic spike times. The network operates using a biologically-plausible alpha synaptic transfer function. Additionally, we use trainable synchronisation pulses that provide bias, add flexibility during training and exploit the decay part of the alpha function. We show that such networks can be trained successfully on noisy Boolean logic tasks and on the MNIST dataset encoded in time. The results show that the spiking neural network outperforms comparable spiking models on MNIST and achieves similar quality to fully connected conventional networks with the same architecture. We also find that the spiking network spontaneously discovers two operating regimes, mirroring the accuracy-speed trade-off observed in human decision-making: a slow regime, where a decision is taken after all hidden neurons have spiked and the accuracy is very high, and a fast regime, where a decision is taken very fast but the accuracy is lower. These results demonstrate the computational power of spiking networks with biological characteristics that encode information in the timing of individual neurons. By studying temporal coding in spiking networks, we aim to create building blocks towards energy-efficient and more complex biologically-inspired neural architectures.

HEP-THJun 1, 2019
SO(8) Supergravity and the Magic of Machine Learning

Iulia M. Comsa, Moritz Firsching, Thomas Fischbacher

Using de Wit-Nicolai $D=4\;\mathcal{N}=8\;SO(8)$ supergravity as an example, we show how modern Machine Learning software libraries such as Google's TensorFlow can be employed to greatly simplify the analysis of high-dimensional scalar sectors of some M-Theory compactifications. We provide detailed information on the location, symmetries, and particle spectra and charges of 192 critical points on the scalar manifold of SO(8) supergravity, including one newly discovered $\mathcal{N}=1$ vacuum with $SO(3)$ residual symmetry, one new potentially stabilizable non-supersymmetric solution, and examples for "Galois conjugate pairs" of solutions, i.e. solution-pairs that share the same gauge group embedding into~$SO(8)$ and minimal polynomials for the cosmological constant. Where feasible, we give analytic expressions for solution coordinates and cosmological constants. As the authors' aspiration is to present the discussion in a form that is accessible to both the Machine Learning and String Theory communities and allows adopting our methods towards the study of other models, we provide an introductory overview over the relevant Physics as well as Machine Learning concepts. This includes short pedagogical code examples. In particular, we show how to formulate a requirement for residual Supersymmetry as a Machine Learning loss function and effectively guide the numerical search towards supersymmetric critical points. Numerical investigations suggest that there are no further supersymmetric vacua beyond this newly discovered fifth solution.