Ivan Dokmanić

LG
h-index15
48papers
1,389citations
Novelty58%
AI Score56

48 Papers

LGJun 1
Flowers: A Warp Drive for Neural PDE Solvers

Till Muser, Alexandra Spitzer, Matti Lassas et al.

We introduce Flowers, a neural architecture for learning PDE solution operators built entirely from multihead warps. Aside from pointwise channel mixing and a multiscale scaffold, Flowers use no Fourier multipliers, no dot-product attention, and no convolutional mixing. Each head predicts a displacement field and warps the mixed input features. Motivated by physics and computational efficiency, displacements are predicted pointwise, without any spatial aggregation, and nonlocality enters only through sparse sampling at source coordinates, one per head. Stacking warps in multiscale residual blocks yields Flowers, which implement adaptive, global interactions at linear cost. We theoretically motivate this design through three complementary lenses: flow maps for conservation laws, waves in inhomogeneous media, and a kinetic-theoretic continuum limit. Flowers achieve excellent performance on a broad suite of 2D and 3D time-dependent PDE benchmarks, particularly flows and waves. A compact 17M-parameter model consistently outperforms Fourier, convolution, and attention-based baselines of similar size, while a 150M-parameter variant improves over recent transformer-based foundation models with much more parameters, data, and training compute.

LGJun 9, 2023Code
A Graph Dynamics Prior for Relational Inference

Liming Pan, Cheng Shi, Ivan Dokmanić

Relational inference aims to identify interactions between parts of a dynamical system from the observed dynamics. Current state-of-the-art methods fit the dynamics with a graph neural network (GNN) on a learnable graph. They use one-step message-passing GNNs -- intuitively the right choice since non-locality of multi-step or spectral GNNs may confuse direct and indirect interactions. But the \textit{effective} interaction graph depends on the sampling rate and it is rarely localized to direct neighbors, leading to poor local optima for the one-step model. In this work, we propose a \textit{graph dynamics prior} (GDP) for relational inference. GDP constructively uses error amplification in non-local polynomial filters to steer the solution to the ground-truth graph. To deal with non-uniqueness, GDP simultaneously fits a ``shallow'' one-step model and a polynomial multi-step model with shared graph topology. Experiments show that GDP reconstructs graphs far more accurately than earlier methods, with remarkable robustness to under-sampling. Since appropriate sampling rates for unknown dynamical systems are not known a priori, this robustness makes GDP suitable for real applications in scientific machine learning. Reproducible code is available at https://github.com/DaDaCheng/GDP.

DIS-NNFeb 27, 2023
Injectivity of ReLU networks: perspectives from statistical physics

Antoine Maillard, Afonso S. Bandeira, David Belius et al.

When can the input of a ReLU neural network be inferred from its output? In other words, when is the network injective? We consider a single layer, $x \mapsto \mathrm{ReLU}(Wx)$, with a random Gaussian $m \times n$ matrix $W$, in a high-dimensional setting where $n, m \to \infty$. Recent work connects this problem to spherical integral geometry giving rise to a conjectured sharp injectivity threshold for $α= \frac{m}{n}$ by studying the expected Euler characteristic of a certain random set. We adopt a different perspective and show that injectivity is equivalent to a property of the ground state of the spherical perceptron, an important spin glass model in statistical physics. By leveraging the (non-rigorous) replica symmetry-breaking theory, we derive analytical equations for the threshold whose solution is at odds with that from the Euler characteristic. Furthermore, we use Gordon's min--max theorem to prove that a replica-symmetric upper bound refutes the Euler characteristic prediction. Along the way we aim to give a tutorial-style introduction to key ideas from statistical physics in an effort to make the exposition accessible to a broad audience. Our analysis establishes a connection between spin glasses and integral geometry but leaves open the problem of explaining the discrepancies.

LGOct 2, 2023
A Theoretical Analysis of the Test Error of Finite-Rank Kernel Ridge Regression

Tin Sum Cheng, Aurelien Lucchi, Ivan Dokmanić et al.

Existing statistical learning guarantees for general kernel regressors often yield loose bounds when used with finite-rank kernels. Yet, finite-rank kernels naturally appear in several machine learning problems, e.g.\ when fine-tuning a pre-trained deep neural network's last layer to adapt it to a novel task when performing transfer learning. We address this gap for finite-rank kernel ridge regression (KRR) by deriving sharp non-asymptotic upper and lower bounds for the KRR test error of any finite-rank KRR. Our bounds are tighter than previously derived bounds on finite-rank KRR, and unlike comparable results, they also remain valid for any regularization parameters.

LGApr 24, 2023
An Approximation Theory for Metric Space-Valued Functions With A View Towards Deep Learning

Anastasis Kratsios, Chong Liu, Matti Lassas et al.

Motivated by the developing mathematics of deep learning, we build universal functions approximators of continuous maps between arbitrary Polish metric spaces $\mathcal{X}$ and $\mathcal{Y}$ using elementary functions between Euclidean spaces as building blocks. Earlier results assume that the target space $\mathcal{Y}$ is a topological vector space. We overcome this limitation by ``randomization'': our approximators output discrete probability measures over $\mathcal{Y}$. When $\mathcal{X}$ and $\mathcal{Y}$ are Polish without additional structure, we prove very general qualitative guarantees; when they have suitable combinatorial structure, we prove quantitative guarantees for Hölder-like maps, including maps between finite graphs, solution operators to rough differential equations between certain Carnot groups, and continuous non-linear operators between Banach spaces arising in inverse problems. In particular, we show that the required number of Dirac measures is determined by the combinatorial structure of $\mathcal{X}$ and $\mathcal{Y}$. For barycentric $\mathcal{Y}$, including Banach spaces, $\mathbb{R}$-trees, Hadamard manifolds, or Wasserstein spaces on Polish metric spaces, our approximators reduce to $\mathcal{Y}$-valued functions. When the Euclidean approximators are neural networks, our constructions generalize transformer networks, providing a new probabilistic viewpoint of geometric deep learning.

OCJul 6, 2022
Orthogonal Matrix Retrieval with Spatial Consensus for 3D Unknown-View Tomography

Shuai Huang, Mona Zehni, Ivan Dokmanić et al.

Unknown-view tomography (UVT) reconstructs a 3D density map from its 2D projections at unknown, random orientations. A line of work starting with Kam (1980) employs the method of moments (MoM) with rotation-invariant Fourier features to solve UVT in the frequency domain, assuming that the orientations are uniformly distributed. This line of work includes the recent orthogonal matrix retrieval (OMR) approaches based on matrix factorization, which, while elegant, either require side information about the density that is not available, or fail to be sufficiently robust. For OMR to break free from those restrictions, we propose to jointly recover the density map and the orthogonal matrices by requiring that they be mutually consistent. We regularize the resulting non-convex optimization problem by a denoised reference projection and a nonnegativity constraint. This is enabled by the new closed-form expressions for spatial autocorrelation features. Further, we design an easy-to-compute initial density map which effectively mitigates the non-convexity of the reconstruction problem. Experimental results show that the proposed OMR with spatial consensus is more robust and performs significantly better than the previous state-of-the-art OMR approach in the typical low-SNR scenario of 3D UVT.

LGDec 26, 2022
Homophily modulates double descent generalization in graph convolution networks

Cheng Shi, Liming Pan, Hong Hu et al.

Graph neural networks (GNNs) excel in modeling relational data such as biological, social, and transportation networks, but the underpinnings of their success are not well understood. Traditional complexity measures from statistical learning theory fail to account for observed phenomena like the double descent or the impact of relational semantics on generalization error. Motivated by experimental observations of ``transductive'' double descent in key networks and datasets, we use analytical tools from statistical physics and random matrix theory to precisely characterize generalization in simple graph convolution networks on the contextual stochastic block model. Our results illuminate the nuances of learning on homophilic versus heterophilic data and predict double descent whose existence in GNNs has been questioned by recent work. We show how risk is shaped by the interplay between the graph noise, feature noise, and the number of training labels. Our findings apply beyond stylized models, capturing qualitative trends in real-world GNNs and datasets. As a case in point, we use our analytic insights to improve performance of state-of-the-art graph convolution networks on heterophilic datasets.

LGApr 15, 2022
Conditional Injective Flows for Bayesian Imaging

AmirEhsan Khorashadizadeh, Konik Kothari, Leonardo Salsi et al.

Most deep learning models for computational imaging regress a single reconstructed image. In practice, however, ill-posedness, nonlinearity, model mismatch, and noise often conspire to make such point estimates misleading or insufficient. The Bayesian approach models images and (noisy) measurements as jointly distributed random vectors and aims to approximate the posterior distribution of unknowns. Recent variational inference methods based on conditional normalizing flows are a promising alternative to traditional MCMC methods, but they come with drawbacks: excessive memory and compute demands for moderate to high resolution images and underwhelming performance on hard nonlinear problems. In this work, we propose C-Trumpets -- conditional injective flows specifically designed for imaging problems, which greatly diminish these challenges. Injectivity reduces memory footprint and training time while low-dimensional latent space together with architectural innovations like fixed-volume-change layers and skip-connection revnet layers, C-Trumpets outperform regular conditional flow models on a variety of imaging and image restoration tasks, including limited-view CT and nonlinear inverse scattering, with a lower compute and memory budget. C-Trumpets enable fast approximation of point estimates like MMSE or MAP as well as physically-meaningful uncertainty quantification.

LGSep 14, 2022
Small Transformers Compute Universal Metric Embeddings

Anastasis Kratsios, Valentin Debarnot, Ivan Dokmanić

We study representations of data from an arbitrary metric space $\mathcal{X}$ in the space of univariate Gaussian mixtures with a transport metric (Delon and Desolneux 2020). We derive embedding guarantees for feature maps implemented by small neural networks called \emph{probabilistic transformers}. Our guarantees are of memorization type: we prove that a probabilistic transformer of depth about $n\log(n)$ and width about $n^2$ can bi-Hölder embed any $n$-point dataset from $\mathcal{X}$ with low metric distortion, thus avoiding the curse of dimensionality. We further derive probabilistic bi-Lipschitz guarantees, which trade off the amount of distortion and the probability that a randomly chosen pair of points embeds with that distortion. If $\mathcal{X}$'s geometry is sufficiently regular, we obtain stronger, bi-Lipschitz guarantees for all points in the dataset. As applications, we derive neural embedding guarantees for datasets from Riemannian manifolds, metric trees, and certain types of combinatorial graphs. When instead embedding into multivariate Gaussian mixtures, we show that probabilistic transformers can compute bi-Hölder embeddings with arbitrarily small distortion.

IVNov 18, 2022
Differentiable Uncalibrated Imaging

Sidharth Gupta, Konik Kothari, Valentin Debarnot et al.

We propose a differentiable imaging framework to address uncertainty in measurement coordinates such as sensor locations and projection angles. We formulate the problem as measurement interpolation at unknown nodes supervised through the forward operator. To solve it we apply implicit neural networks, also known as neural fields, which are naturally differentiable with respect to the input coordinates. We also develop differentiable spline interpolators which perform as well as neural networks, require less time to optimize and have well-understood properties. Differentiability is key as it allows us to jointly fit a measurement representation, optimize over the uncertain measurement coordinates, and perform image reconstruction which in turn ensures consistent calibration. We apply our approach to 2D and 3D computed tomography, and show that it produces improved reconstructions compared to baselines that do not account for the lack of calibration. The flexibility of the proposed framework makes it easy to extend to almost arbitrary imaging problems.

LGJan 8, 2023
Deep Injective Prior for Inverse Scattering

AmirEhsan Khorashadizadeh, Vahid Khorashadizadeh, Sepehr Eskandari et al.

In electromagnetic inverse scattering, the goal is to reconstruct object permittivity using scattered waves. While deep learning has shown promise as an alternative to iterative solvers, it is primarily used in supervised frameworks which are sensitive to distribution drift of the scattered fields, common in practice. Moreover, these methods typically provide a single estimate of the permittivity pattern, which may be inadequate or misleading due to noise and the ill-posedness of the problem. In this paper, we propose a data-driven framework for inverse scattering based on deep generative models. Our approach learns a low-dimensional manifold as a regularizer for recovering target permittivities. Unlike supervised methods that necessitate both scattered fields and target permittivities, our method only requires the target permittivities for training; it can then be used with any experimental setup. We also introduce a Bayesian framework for approximating the posterior distribution of the target permittivity, enabling multiple estimates and uncertainty quantification. Extensive experiments with synthetic and experimental data demonstrate that our framework outperforms traditional iterative solvers, particularly for strong scatterers, while achieving comparable reconstruction quality to state-of-the-art supervised learning methods like the U-Net.

CVJun 4, 2022
Implicit Neural Representation for Mesh-Free Inverse Obstacle Scattering

Tin Vlašić, Hieu Nguyen, AmirEhsan Khorashadizadeh et al.

Implicit representation of shapes as level sets of multilayer perceptrons has recently flourished in different shape analysis, compression, and reconstruction tasks. In this paper, we introduce an implicit neural representation-based framework for solving the inverse obstacle scattering problem in a mesh-free fashion. We express the obstacle shape as the zero-level set of a signed distance function which is implicitly determined by network parameters. To solve the direct scattering problem, we implement the implicit boundary integral method. It uses projections of the grid points in the tubular neighborhood onto the boundary to compute the PDE solution directly in the level-set framework. The proposed implicit representation conveniently handles the shape perturbation in the optimization process. To update the shape, we use PyTorch's automatic differentiation to backpropagate the loss function w.r.t. the network parameters, allowing us to avoid complex and error-prone manual derivation of the shape derivative. Additionally, we propose a deep generative model of implicit neural shape representations that can fit into the framework. The deep generative model effectively regularizes the inverse obstacle scattering problem, making it more tractable and robust, while yielding high-quality reconstruction results even in noise-corrupted setups.

IVDec 20, 2022
FunkNN: Neural Interpolation for Functional Generation

AmirEhsan Khorashadizadeh, Anadi Chaman, Valentin Debarnot et al.

Can we build continuous generative models which generalize across scales, can be evaluated at any coordinate, admit calculation of exact derivatives, and are conceptually simple? Existing MLP-based architectures generate worse samples than the grid-based generators with favorable convolutional inductive biases. Models that focus on generating images at different scales do better, but employ complex architectures not designed for continuous evaluation of images and derivatives. We take a signal-processing perspective and treat continuous image generation as interpolation from samples. Indeed, correctly sampled discrete images contain all information about the low spatial frequencies. The question is then how to extrapolate the spectrum in a data-driven way while meeting the above design criteria. Our answer is FunkNN -- a new convolutional network which learns how to reconstruct continuous images at arbitrary coordinates and can be applied to any image dataset. Combined with a discrete generative model it becomes a functional generator which can act as a prior in continuous ill-posed inverse problems. We show that FunkNN generates high-quality continuous images and exhibits strong out-of-distribution performance thanks to its patch-based design. We further showcase its performance in several stylized inverse problems with exact spatial derivatives.

LGMay 18
Function graph transformers universally approximate operators between function spaces

Takashi Furuya, David Mis, Ivan Dokmanić et al.

We study the approximation of nonlinear operators between function spaces by transformers. Our approach is to lift functions to measures supported on their graphs and leverage a recently introduced measure-theoretic view of transformers. A function $h$ is represented by its graph measure $γ_h$, with finite tokens $\{(x_j,h(x_j))\}_{j=1}^N$ being its empirical approximations. We show that this framework elegantly models discretization refinement via convergence of measures and provides a natural setting for operator learning. Within this framework, we introduce function graph transformers, a graph-preserving subclass of measure-theoretic transformers that maps graph measures to graph measures, which is to say that outputs remain single-valued functions. Crucially, this additional structure does not reduce generality: we prove that the resulting graph-preserving maps can be approximated by finite compositions of standard softmax self-attention layers and pointwise MLPs, yielding universal approximation results for broad classes of nonlinear operators. Unlike existing theoretical approaches to operator learning with transformers, the measure-theoretic framework also accommodates regularized negative-order Sobolev inputs for which discretization invariance is particularly challenging, as well as query points on different output domains. Overall, function graph transformers provide a continuum viewpoint and mathematical toolkit for transformer-based operator learning, clarifying the roles of positional encodings, graph structure, regularization, and ensuring consistency across discretizations.

GEO-PHOct 21, 2024Code
SeisLM: a Foundation Model for Seismic Waveforms

Tianlin Liu, Jannes Münchmeyer, Laura Laurenti et al.

We introduce the Seismic Language Model (SeisLM), a foundational model designed to analyze seismic waveforms -- signals generated by Earth's vibrations such as the ones originating from earthquakes. SeisLM is pretrained on a large collection of open-source seismic datasets using a self-supervised contrastive loss, akin to BERT in language modeling. This approach allows the model to learn general seismic waveform patterns from unlabeled data without being tied to specific downstream tasks. When fine-tuned, SeisLM excels in seismological tasks like event detection, phase-picking, onset time regression, and foreshock-aftershock classification. The code has been made publicly available on https://github.com/liutianlin0121/seisLM.

LGDec 10, 2025
Rates and architectures for learning geometrically non-trivial operators

T. Mitchell Roddenberry, Leo Tzou, Ivan Dokmanić et al.

Deep learning methods have proven capable of recovering operators between high-dimensional spaces, such as solution maps of PDEs and similar objects in mathematical physics, from very few training samples. This phenomenon of data-efficiency has been proven for certain classes of elliptic operators with simple geometry, i.e., operators that do not change the domain of the function or propagate singularities. However, scientific machine learning is commonly used for problems that do involve the propagation of singularities in a priori unknown ways, such as waves, advection, and fluid dynamics. In light of this, we expand the learning theory to include double fibration transforms--geometric integral operators that include generalized Radon and geodesic ray transforms. We prove that this class of operators does not suffer from the curse of dimensionality: the error decays superalgebraically, that is, faster than any fixed power of the reciprocal of the number of training samples. Furthermore, we investigate architectures that explicitly encode the geometry of these transforms, demonstrating that an architecture reminiscent of cross-attention based on levelset methods yields a parameterization that is universal, stable, and learns double fibration transforms from very few training examples. Our results contribute to a rapidly-growing line of theoretical work on learning operators for scientific machine learning.

LGAug 13, 2024
Joint Graph Rewiring and Feature Denoising via Spectral Resonance

Jonas Linkerhägner, Cheng Shi, Ivan Dokmanić

When learning from graph data, the graph and the node features both give noisy information about the node labels. In this paper we propose an algorithm to jointly denoise the features and rewire the graph (JDR), which improves the performance of downstream node classification graph neural nets (GNNs). JDR works by aligning the leading spectral spaces of graph and feature matrices. It approximately solves the associated non-convex optimization problem in a way that handles graphs with multiple classes and different levels of homophily or heterophily. We theoretically justify JDR in a stylized setting and show that it consistently outperforms existing rewiring methods on a wide range of synthetic and real-world node classification tasks.

LGDec 8, 2022
Deep Variational Inverse Scattering

AmirEhsan Khorashadizadeh, Ali Aghababaei, Tin Vlašić et al.

Inverse medium scattering solvers generally reconstruct a single solution without an associated measure of uncertainty. This is true both for the classical iterative solvers and for the emerging deep learning methods. But ill-posedness and noise can make this single estimate inaccurate or misleading. While deep networks such as conditional normalizing flows can be used to sample posteriors in inverse problems, they often yield low-quality samples and uncertainty estimates. In this paper, we propose U-Flow, a Bayesian U-Net based on conditional normalizing flows, which generates high-quality posterior samples and estimates physically-meaningful uncertainty. We show that the proposed model significantly outperforms the recent normalizing flows in terms of posterior sample quality while having comparable performance with the U-Net in point estimation.

DIS-NNJul 28, 2024
Spring-block theory of feature learning in deep neural networks

Cheng Shi, Liming Pan, Ivan Dokmanić

Feature-learning deep nets progressively collapse data to a regular low-dimensional geometry. How this emerges from the collective action of nonlinearity, noise, learning rate, and other factors, has eluded first-principles theories built from microscopic neuronal dynamics. We exhibit a noise-nonlinearity phase diagram that identifies regimes where shallow or deep layers learn more effectively and propose a macroscopic mechanical theory that reproduces the diagram and links feature learning across layers to generalization.

GEO-PHJul 14, 2023
High-Rate Phase Association with Travel Time Neural Fields

Cheng Shi, Giulio Poggiali, Chris Marone et al.

Earthquake science and seismology rely on the ability to associate seismic waves with their originating earthquakes. Earthquake detection algorithms based on deep learning have progressed rapidly and now routinely detect microearthquakes with unprecedented clarity, providing information about fault dynamics on increasingly finer spatiotemporal scales. However, this densification of detections can overwhelm existing techniques for phase association which rely on fixed wave speed models and associate events one by one. These methods fail when the event rates become high or where the 4D complexity of elastic wave speeds cannot be ignored. Here, we introduce HARPA, a deep learning solution to this problem. HARPA is a high-rate association framework which incorporates wave physics by leveraging deep generative models and travel time neural fields. Instead of associating events one by one, it lifts arrival sequences to probability distributions and compares them using an optimal transport metric. The generative travel time neural fields are used to estimate the wave speed simultaneously with association. HARPA outperforms state-of-the-art association methods for both real seismic data and complex synthetic models and paves the way for improved understanding of seismicity while establishing a new seismic data analysis paradigm.

CVJan 1, 2024Code
Glimpse: Generalized Locality for Scalable and Robust CT

AmirEhsan Khorashadizadeh, Valentin Debarnot, Tianlin Liu et al.

Deep learning has become the state-of-the-art approach to medical tomographic imaging. A common approach is to feed the result of a simple inversion, for example the backprojection, to a multiscale convolutional neural network (CNN) which computes the final reconstruction. Despite good results on in-distribution test data, this often results in overfitting certain large-scale structures and poor generalization on out-of-distribution (OOD) samples. Moreover, the memory and computational complexity of multiscale CNNs scale unfavorably with image resolution, making them impractical for application at realistic clinical resolutions. In this paper, we introduce Glimpse, a local coordinate-based neural network for computed tomography which reconstructs a pixel value by processing only the measurements associated with the neighborhood of the pixel. Glimpse significantly outperforms successful CNNs on OOD samples, while achieving comparable or better performance on in-distribution test data and maintaining a memory footprint almost independent of image resolution; 5GB memory suffices to train on 1024x1024 images which is orders of magnitude less than CNNs. Glimpse is fully differentiable and can be used plug-and-play in arbitrary deep learning architectures, enabling feats such as correcting miscalibrated projection orientations. Our implementation and Google Colab demo can be accessed at https://github.com/swing-research/Glimpse.

STMay 8
On Observation Time for Recovering Latent Hawkes Networks

Jonas Linkerhägner, Michele Bortolasi, Lorenzo Baldassari et al.

Dynamics of interacting systems in engineering, society, and nature often evolve over latent networks that govern which entities can interact. We study the problem of inferring these networks from event-based observations, which arise naturally in finance, seismology, and neuroscience. While there is substantial algorithmic work addressing this important problem, theoretical results are scarce. In this paper we ask the following fundamental question: what is the minimum time that one must observe the dynamics in order to exactly recover the underlying network, as a function of the number $d$ of interacting entities? For a class of stationary Hawkes processes with sparse, weak interactions, we prove that an observation time of order $\log d$ is sufficient and necessary. For the upper bound we construct a two-stage estimator that uses clipped and binned event data for screening, followed by a least-squares refinement, and apply concentration bounds derived from the Poisson cluster representation. For the lower bound we combine Fano's inequality with Jacod's Girsanov formula for point processes on a suitable subclass of networks.

LGJan 12, 2025
Neural equilibria for long-term prediction of nonlinear conservation laws

J. Antonio Lara Benitez, Junyi Guo, Kareem Hegazy et al.

We introduce Neural Discrete Equilibrium (NeurDE), a machine learning (ML) approach for long-term forecasting of flow phenomena that relies on a "lifting" of physical conservation laws into the framework of kinetic theory. The kinetic formulation provides an excellent structure for ML algorithms by separating nonlinear, non-local physics into a nonlinear but local relaxation to equilibrium and a linear non-local transport. This separation allows the ML to focus on the local nonlinear components while addressing the simpler linear transport with efficient classical numerical algorithms. To accomplish this, we design an operator network that maps macroscopic observables to equilibrium states in a manner that maximizes entropy, yielding expressive BGK-type collisions. By incorporating our surrogate equilibrium into the lattice Boltzmann (LB) algorithm, we achieve accurate flow forecasts for a wide range of challenging flows. We show that NeurDE enables accurate prediction of compressible flows, including supersonic flows, while tracking shocks over hundreds of time steps, using a small velocity lattice-a heretofore unattainable feat without expensive numerical root finding.

CVNov 7, 2024
LoFi: Neural Local Fields for Scalable Image Reconstruction

AmirEhsan Khorashadizadeh, Tobías I. Liaudat, Tianlin Liu et al.

Neural fields or implicit neural representations (INRs) have attracted significant attention in computer vision and imaging due to their efficient coordinate-based representation of images and 3D volumes. In this work, we introduce a coordinate-based framework for solving imaging inverse problems, termed LoFi (Local Field). Unlike conventional methods for image reconstruction, LoFi processes local information at each coordinate separately by multi-layer perceptrons (MLPs), recovering the object at that specific coordinate. Similar to INRs, LoFi can recover images at any continuous coordinate, enabling image reconstruction at multiple resolutions. With comparable or better performance than standard deep learning models like convolutional neural networks (CNNs) and vision transformers (ViTs), LoFi achieves excellent generalization to out-of-distribution data with memory usage almost independent of image resolution. Remarkably, training on 1024x1024 images requires less than 200MB of memory -- much below standard CNNs and ViTs. Additionally, LoFi's local design allows it to train on extremely small datasets with 10 samples or fewer, without overfitting and without the need for explicit regularization or early stopping.

LGOct 8, 2021
Neural Link Prediction with Walk Pooling

Liming Pan, Cheng Shi, Ivan Dokmanić

Graph neural networks achieve high accuracy in link prediction by jointly leveraging graph topology and node attributes. Topology, however, is represented indirectly; state-of-the-art methods based on subgraph classification label nodes with distance to the target link, so that, although topological information is present, it is tempered by pooling. This makes it challenging to leverage features like loops and motifs associated with network formation mechanisms. We propose a link prediction algorithm based on a new pooling scheme called WalkPool. WalkPool combines the expressivity of topological heuristics with the feature-learning ability of neural networks. It summarizes a putative link by random walk probabilities of adjacent paths. Instead of extracting transition probabilities from the original graph, it computes the transition matrix of a "predictive" latent graph by applying attention to learned features; this may be interpreted as feature-sensitive topology fingerprinting. WalkPool can leverage unsupervised node features or be combined with GNNs and trained end-to-end. It outperforms state-of-the-art methods on all common link prediction benchmarks, both homophilic and heterophilic, with and without node attributes. Applying WalkPool to a set of unsupervised GNNs significantly improves prediction accuracy, suggesting that it may be used as a general-purpose graph pooling scheme.

LGOct 8, 2021
Universal Joint Approximation of Manifolds and Densities by Simple Injective Flows

Michael Puthawala, Matti Lassas, Ivan Dokmanić et al.

We study approximation of probability measures supported on $n$-dimensional manifolds embedded in $\mathbb{R}^m$ by injective flows -- neural networks composed of invertible flows and injective layers. We show that in general, injective flows between $\mathbb{R}^n$ and $\mathbb{R}^m$ universally approximate measures supported on images of extendable embeddings, which are a subset of standard embeddings: when the embedding dimension m is small, topological obstructions may preclude certain manifolds as admissible targets. When the embedding dimension is sufficiently large, $m \ge 3n+1$, we use an argument from algebraic topology known as the clean trick to prove that the topological obstructions vanish and injective flows universally approximate any differentiable embedding. Along the way we show that the studied injective flows admit efficient projections on the range, and that their optimality can be established "in reverse," resolving a conjecture made in Brehmer and Cranmer 2020.

LGOct 7, 2021
Universal Approximation Under Constraints is Possible with Transformers

Anastasis Kratsios, Behnoosh Zamanlooy, Tianlin Liu et al.

Many practical problems need the output of a machine learning model to satisfy a set of constraints, $K$. Nevertheless, there is no known guarantee that classical neural network architectures can exactly encode constraints while simultaneously achieving universality. We provide a quantitative constrained universal approximation theorem which guarantees that for any non-convex compact set $K$ and any continuous function $f:\mathbb{R}^n\rightarrow K$, there is a probabilistic transformer $\hat{F}$ whose randomized outputs all lie in $K$ and whose expected output uniformly approximates $f$. Our second main result is a "deep neural version" of Berge's Maximum Theorem (1963). The result guarantees that given an objective function $L$, a constraint set $K$, and a family of soft constraint sets, there is a probabilistic transformer $\hat{F}$ that approximately minimizes $L$ and whose outputs belong to $K$; moreover, $\hat{F}$ approximately satisfies the soft constraints. Our results imply the first universal approximation theorem for classical transformers with exact convex constraint satisfaction. They also yield that a chart-free universal approximation theorem for Riemannian manifold-valued functions subject to suitable geodesically convex constraints.

CVMay 9, 2021
Truly shift-equivariant convolutional neural networks with adaptive polyphase upsampling

Anadi Chaman, Ivan Dokmanić

Convolutional neural networks lack shift equivariance due to the presence of downsampling layers. In image classification, adaptive polyphase downsampling (APS-D) was recently proposed to make CNNs perfectly shift invariant. However, in networks used for image reconstruction tasks, it can not by itself restore shift equivariance. We address this problem by proposing adaptive polyphase upsampling (APS-U), a non-linear extension of conventional upsampling, which allows CNNs with symmetric encoder-decoder architecture (for example U-Net) to exhibit perfect shift equivariance. With MRI and CT reconstruction experiments, we show that networks containing APS-D/U layers exhibit state of the art equivariance performance without sacrificing on image reconstruction quality. In addition, unlike prior methods like data augmentation and anti-aliasing, the gains in equivariance obtained from APS-D/U also extend to images outside the training distribution.

LGFeb 20, 2021
Trumpets: Injective Flows for Inference and Inverse Problems

Konik Kothari, AmirEhsan Khorashadizadeh, Maarten de Hoop et al.

We propose injective generative models called Trumpets that generalize invertible normalizing flows. The proposed generators progressively increase dimension from a low-dimensional latent space. We demonstrate that Trumpets can be trained orders of magnitudes faster than standard flows while yielding samples of comparable or better quality. They retain many of the advantages of the standard flows such as training based on maximum likelihood and a fast, exact inverse of the generator. Since Trumpets are injective and have fast inverses, they can be effectively used for downstream Bayesian inference. To wit, we use Trumpet priors for maximum a posteriori estimation in the context of image reconstruction from compressive measurements, outperforming competitive baselines in terms of reconstruction quality and speed. We then propose an efficient method for posterior characterization and uncertainty quantification with Trumpets by taking advantage of the low-dimensional latent space.

SPFeb 1, 2021
Total Least Squares Phase Retrieval

Sidharth Gupta, Ivan Dokmanić

We address the phase retrieval problem with errors in the sensing vectors. A number of recent methods for phase retrieval are based on least squares (LS) formulations which assume errors in the quadratic measurements. We extend this approach to handle errors in the sensing vectors by adopting the total least squares (TLS) framework that is used in linear inverse problems with operator errors. We show how gradient descent and the specific geometry of the phase retrieval problem can be used to obtain a simple and efficient TLS solution. Additionally, we derive the gradients of the TLS and LS solutions with respect to the sensing vectors and measurements which enables us to calculate the solution errors. By analyzing these error expressions we determine conditions under which each method should outperform the other. We run simulations to demonstrate that our method can lead to more accurate solutions. We further demonstrate the effectiveness of our approach by performing phase retrieval experiments on real optical hardware which naturally contains both sensing vector and measurement errors.

CVNov 28, 2020
Truly shift-invariant convolutional neural networks

Anadi Chaman, Ivan Dokmanić

Thanks to the use of convolution and pooling layers, convolutional neural networks were for a long time thought to be shift-invariant. However, recent works have shown that the output of a CNN can change significantly with small shifts in input: a problem caused by the presence of downsampling (stride) layers. The existing solutions rely either on data augmentation or on anti-aliasing, both of which have limitations and neither of which enables perfect shift invariance. Additionally, the gains obtained from these methods do not extend to image patterns not seen during training. To address these challenges, we propose adaptive polyphase sampling (APS), a simple sub-sampling scheme that allows convolutional neural networks to achieve 100% consistency in classification performance under shifts, without any loss in accuracy. With APS, the networks exhibit perfect consistency to shifts even before training, making it the first approach that makes convolutional neural networks truly shift-invariant.

CVNov 25, 2020
Learning Multiscale Convolutional Dictionaries for Image Reconstruction

Tianlin Liu, Anadi Chaman, David Belius et al.

Convolutional neural networks (CNNs) have been tremendously successful in solving imaging inverse problems. To understand their success, an effective strategy is to construct simpler and mathematically more tractable convolutional sparse coding (CSC) models that share essential ingredients with CNNs. Existing CSC methods, however, underperform leading CNNs in challenging inverse problems. We hypothesize that the performance gap may be attributed in part to how they process images at different spatial scales: While many CNNs use multiscale feature representations, existing CSC models mostly rely on single-scale dictionaries. To close the performance gap, we thus propose a multiscale convolutional dictionary structure. The proposed dictionary structure is derived from the U-Net, arguably the most versatile and widely used CNN for image-to-image learning problems. We show that incorporating the proposed multiscale dictionary in an otherwise standard CSC framework yields performance competitive with state-of-the-art CNNs across a range of challenging inverse problems including CT and MRI reconstruction. Our work thus demonstrates the effectiveness and scalability of the multiscale CSC approach in solving challenging inverse problems.

LGJun 17, 2020
Geometry of Similarity Comparisons

Puoya Tabaghi, Jianhao Peng, Olgica Milenkovic et al.

Many data analysis problems can be cast as distance geometry problems in \emph{space forms} -- Euclidean, spherical, or hyperbolic spaces. Often, absolute distance measurements are often unreliable or simply unavailable and only proxies to absolute distances in the form of similarities are available. Hence we ask the following: Given only \emph{comparisons} of similarities amongst a set of entities, what can be said about the geometry of the underlying space form? To study this question, we introduce the notions of the \textit{ordinal capacity} of a target space form and \emph{ordinal spread} of the similarity measurements. The latter is an indicator of complex patterns in the measurements, while the former quantifies the capacity of a space form to accommodate a set of measurements with a specific ordinal spread profile. We prove that the ordinal capacity of a space form is related to its dimension and the sign of its curvature. This leads to a lower bound on the Euclidean and spherical embedding dimension of what we term similarity graphs. More importantly, we show that the statistical behavior of the ordinal spread random variables defined on a similarity graph can be used to identify its underlying space form. We support our theoretical claims with experiments on weighted trees, single-cell RNA expression data and spherical cartographic measurements.

LGJun 15, 2020
Globally Injective ReLU Networks

Michael Puthawala, Konik Kothari, Matti Lassas et al.

Injectivity plays an important role in generative models where it enables inference; in inverse problems and compressed sensing with generative priors it is a precursor to well posedness. We establish sharp characterizations of injectivity of fully-connected and convolutional ReLU layers and networks. First, through a layerwise analysis, we show that an expansivity factor of two is necessary and sufficient for injectivity by constructing appropriate weight matrices. We show that global injectivity with iid Gaussian matrices, a commonly used tractable model, requires larger expansivity between 3.4 and 10.5. We also characterize the stability of inverting an injective network via worst-case Lipschitz constants of the inverse. We then use arguments from differential topology to study injectivity of deep networks and prove that any Lipschitz map can be approximated by an injective ReLU network. Finally, using an argument based on random projections, we show that an end-to-end -- rather than layerwise -- doubling of the dimension suffices for injectivity. Our results establish a theoretical basis for the study of nonlinear inverse and inference problems using neural networks.

IVJun 10, 2020
Learning the geometry of wave-based imaging

Konik Kothari, Maarten de Hoop, Ivan Dokmanić

We propose a general physics-based deep learning architecture for wave-based imaging problems. A key difficulty in imaging problems with a varying background wave speed is that the medium "bends" the waves differently depending on their position and direction. This space-bending geometry makes the equivariance to translations of convolutional networks an undesired inductive bias. We build an interpretable neural architecture inspired by Fourier integral operators (FIOs) which approximate the wave physics. FIOs model a wide range of imaging modalities, from seismology and radar to Doppler and ultrasound. We focus on learning the geometry of wave propagation captured by FIOs, which is implicit in the data, via a loss based on optimal transport. The proposed FIONet performs significantly better than the usual baselines on a number of imaging inverse problems, especially in out-of-distribution tests.

LGMay 18, 2020
Hyperbolic Distance Matrices

Puoya Tabaghi, Ivan Dokmanić

Hyperbolic space is a natural setting for mining and visualizing data with hierarchical structure. In order to compute a hyperbolic embedding from comparison or similarity information, one has to solve a hyperbolic distance geometry problem. In this paper, we propose a unified framework to compute hyperbolic embeddings from an arbitrary mix of noisy metric and non-metric data. Our algorithms are based on semidefinite programming and the notion of a hyperbolic distance matrix, in many ways parallel to its famous Euclidean counterpart. A central ingredient we put forward is a semidefinite characterization of the hyperbolic Gramian -- a matrix of Lorentzian inner products. This characterization allows us to formulate a semidefinite relaxation to efficiently compute hyperbolic embeddings in two stages: first, we complete and denoise the observed hyperbolic distance matrix; second, we propose a spectral factorization method to estimate the embedded points from the hyperbolic distance matrix. We show through numerical experiments how the flexibility to mix metric and non-metric constraints allows us to efficiently compute embeddings from arbitrary data.

LGOct 9, 2019
The fastest $\ell_{1,\infty}$ prox in the west

Benjamín Béjar, Ivan Dokmanić, René Vidal

Proximal operators are of particular interest in optimization problems dealing with non-smooth objectives because in many practical cases they lead to optimization algorithms whose updates can be computed in closed form or very efficiently. A well-known example is the proximal operator of the vector $\ell_1$ norm, which is given by the soft-thresholding operator. In this paper we study the proximal operator of the mixed $\ell_{1,\infty}$ matrix norm and show that it can be computed in closed form by applying the well-known soft-thresholding operator to each column of the matrix. However, unlike the vector $\ell_1$ norm case where the threshold is constant, in the mixed $\ell_{1,\infty}$ norm case each column of the matrix might require a different threshold and all thresholds depend on the given matrix. We propose a general iterative algorithm for computing these thresholds, as well as two efficient implementations that further exploit easy to compute lower bounds for the mixed norm of the optimal solution. Experiments on large-scale synthetic and real data indicate that the proposed methods can be orders of magnitude faster than state-of-the-art methods.

LGJul 3, 2019
Don't take it lightly: Phasing optical random projections with unknown operators

Sidharth Gupta, Rémi Gribonval, Laurent Daudet et al.

In this paper we tackle the problem of recovering the phase of complex linear measurements when only magnitude information is available and we control the input. We are motivated by the recent development of dedicated optics-based hardware for rapid random projections which leverages the propagation of light in random media. A signal of interest $\mathbfξ \in \mathbb{R}^N$ is mixed by a random scattering medium to compute the projection $\mathbf{y} = \mathbf{A} \mathbfξ$, with $\mathbf{A} \in \mathbb{C}^{M \times N}$ being a realization of a standard complex Gaussian iid random matrix. Such optics-based matrix multiplications can be much faster and energy-efficient than their CPU or GPU counterparts, yet two difficulties must be resolved: only the intensity ${|\mathbf{y}|}^2$ can be recorded by the camera, and the transmission matrix $\mathbf{A}$ is unknown. We show that even without knowing $\mathbf{A}$, we can recover the unknown phase of $\mathbf{y}$ for some equivalent transmission matrix with the same distribution as $\mathbf{A}$. Our method is based on two observations: first, conjugating or changing the phase of any row of $\mathbf{A}$ does not change its distribution; and second, since we control the input we can interfere $\mathbfξ$ with arbitrary reference signals. We show how to leverage these observations to cast the measurement phase retrieval problem as a Euclidean distance geometry problem. We demonstrate appealing properties of the proposed algorithm in both numerical simulations and real hardware experiments. Not only does our algorithm accurately recover the missing phase, but it mitigates the effects of quantization and the sensitivity threshold, thus improving the measured magnitudes.

MLJan 29, 2019
Learning Schatten--von Neumann Operators

Puoya Tabaghi, Maarten de Hoop, Ivan Dokmanić

We study the learnability of a class of compact operators known as Schatten--von Neumann operators. These operators between infinite-dimensional function spaces play a central role in a variety of applications in learning theory and inverse problems. We address the question of sample complexity of learning Schatten-von Neumann operators and provide an upper bound on the number of measurements required for the empirical risk minimizer to generalize with arbitrary precision and probability, as a function of class parameter $p$. Our results give generalization guarantees for regression of infinite-dimensional signals from infinite-dimensional data. Next, we adapt the representer theorem of Abernethy \emph{et al.} to show that empirical risk minimization over an a priori infinite-dimensional, non-compact set, can be converted to a convex finite dimensional optimization problem over a compact set. In summary, the class of $p$-Schatten--von Neumann operators is probably approximately correct (PAC)-learnable via a practical convex program for any $p < \infty$.

ASNov 17, 2018
Multipath-enabled private audio with noise

Anadi Chaman, Yu-Jeh Liu, Jonah Casebeer et al.

We address the problem of privately communicating audio messages to multiple listeners in a reverberant room using a set of loudspeakers. We propose two methods based on emitting noise. In the first method, the loudspeakers emit noise signals that are appropriately filtered so that after echoing along multiple paths in the room, they sum up and descramble to yield distinct meaningful audio messages only at specific focusing spots, while being incoherent everywhere else. In the second method, adapted from wireless communications, we project noise signals onto the nullspace of the MIMO channel matrix between the loudspeakers and listeners. Loudspeakers reproduce a sum of the projected noise signals and intended messages. Again because of echoes, the MIMO nullspace changes across different locations in the room. Thus, the listeners at focusing spots hear intended messages, while the acoustic channel of an eavesdropper at any other location is jammed. We show, using both numerical and real experiments, that with a small number of speakers and a few impulse response measurements, audio messages can indeed be communicated to a set of listeners while ensuring negligible intelligibility elsewhere.

SDSep 16, 2018
Cocktails, but no party: multipath-enabled private audio

Yu-Jeh Liu, Jonah Casebeer, Ivan Dokmanić

We describe a private audio messaging system that uses echoes to unscramble messages at a few predetermined locations in a room. The system works by splitting the audio into short chunks and emitting them from different loudspeakers. The chunks are filtered so that as they echo around the room, they sum to noise everywhere except at a few chosen focusing spots where they exactly reproduce the intended messages. Unlike in the case of standard personal audio zones, the proposed method renders sound outside the focusing spots unintelligible. Our method essentially depends on echoes: the room acts as a mixing system such that at given points we get the desired output. Finally, we only require a modest number of loudspeakers and only a few impulse response measurements at points where the messages should be delivered. We demonstrate the effectiveness of the proposed method via objective quantitative metrics as well as informal listening experiments in a real room.

CVMay 29, 2018
Random mesh projectors for inverse problems

Sidharth Gupta, Konik Kothari, Maarten V. de Hoop et al.

We propose a new learning-based approach to solve ill-posed inverse problems in imaging. We address the case where ground truth training samples are rare and the problem is severely ill-posed - both because of the underlying physics and because we can only get few measurements. This setting is common in geophysical imaging and remote sensing. We show that in this case the common approach to directly learn the mapping from the measured data to the reconstruction becomes unstable. Instead, we propose to first learn an ensemble of simpler mappings from the data to projections of the unknown image into random piecewise-constant subspaces. We then combine the projections to form a final reconstruction by solving a deconvolution-like problem. We show experimentally that the proposed method is more robust to measurement noise and corruptions not seen during training than a directly learned inverse.

DSApr 6, 2018
Reconstructing Point Sets from Distance Distributions

Shuai Huang, Ivan Dokmanić

We address the problem of reconstructing a set of points on a line or a loop from their unassigned noisy pairwise distances. When the points lie on a line, the problem is known as the turnpike; when they are on a loop, it is known as the beltway. We approximate the problem by discretizing the domain and representing the $N$ points via an $N$-hot encoding, which is a density supported on the discretized domain. We show how the distance distribution is then simply a collection of quadratic functionals of this density and propose to recover the point locations so that the estimated distance distribution matches the measured distance distribution. This can be cast as a constrained nonconvex optimization problem which we solve using projected gradient descent with a suitable spectral initializer. We derive conditions under which the proposed distance distribution matching approach locally converges to a global optimizer at a linear rate. Compared to the conventional backtracking approach, our method jointly reconstructs all the point locations and is robust to noise in the measurements. We substantiate these claims with state-of-the-art performance across a number of numerical experiments. Our method is the first practical approach to solve the large-scale noisy beltway problem where the points lie on a loop.

ASJan 11, 2018
Direction of Arrival with One Microphone, a few LEGOs, and Non-Negative Matrix Factorization

Dalia El Badawy, Ivan Dokmanić

Conventional approaches to sound source localization require at least two microphones. It is known, however, that people with unilateral hearing loss can also localize sounds. Monaural localization is possible thanks to the scattering by the head, though it hinges on learning the spectra of the various sources. We take inspiration from this human ability to propose algorithms for accurate sound source localization using a single microphone embedded in an arbitrary scattering structure. The structure modifies the frequency response of the microphone in a direction-dependent way giving each direction a signature. While knowing those signatures is sufficient to localize sources of white noise, localizing speech is much more challenging: it is an ill-posed inverse problem which we regularize by prior knowledge in the form of learned non-negative dictionaries. We demonstrate a monaural speech localization algorithm based on non-negative matrix factorization that does not depend on sophisticated, designed scatterers. In fact, we show experimental results with ad hoc scatterers made of LEGO bricks. Even with these rudimentary structures we can accurately localize arbitrary speakers; that is, we do not need to learn the dictionary for the particular speaker to be localized. Finally, we discuss multi-source localization and the related limitations of our approach.

SDNov 18, 2017
Separake: Source Separation with a Little Help From Echoes

Robin Scheibler, Diego Di Carlo, Antoine Deleforge et al.

It is commonly believed that multipath hurts various audio processing algorithms. At odds with this belief, we show that multipath in fact helps sound source separation, even with very simple propagation models. Unlike most existing methods, we neither ignore the room impulse responses, nor we attempt to estimate them fully. We rather assume that we know the positions of a few virtual microphones generated by echoes and we show how this gives us enough spatial diversity to get a performance boost over the anechoic case. We show improvements for two standard algorithms---one that uses only magnitudes of the transfer functions, and one that also uses the phases. Concretely, we show that multichannel non-negative matrix factorization aided with a small number of echoes beats the vanilla variant of the same algorithm, and that with magnitude information only, echoes enable separation where it was previously impossible.

SDOct 11, 2017
Pyroomacoustics: A Python package for audio room simulations and array processing algorithms

Robin Scheibler, Eric Bezzam, Ivan Dokmanić

We present pyroomacoustics, a software package aimed at the rapid development and testing of audio array processing algorithms. The content of the package can be divided into three main components: an intuitive Python object-oriented interface to quickly construct different simulation scenarios involving multiple sound sources and microphones in 2D and 3D rooms; a fast C implementation of the image source model for general polyhedral rooms to efficiently generate room impulse responses and simulate the propagation between sources and receivers; and finally, reference implementations of popular algorithms for beamforming, direction finding, and adaptive filtering. Together, they form a package with the potential to speed up the time to market of new algorithms by significantly reducing the implementation overhead in the performance evaluation step.

ROSep 18, 2016
Omnidirectional Bats, Point-to-Plane Distances, and the Price of Uniqueness

Miranda Kreković, Ivan Dokmanić, Martin Vetterli

We study simultaneous localization and mapping with a device that uses reflections to measure its distance from walls. Such a device can be realized acoustically with a synchronized collocated source and receiver; it behaves like a bat with no capacity for directional hearing or vocalizing. In this paper we generalize our previous work in 2D, and show that the 3D case is not just a simple extension, but rather a fundamentally different inverse problem. While generically the 2D problem has a unique solution, in 3D uniqueness is always absent in rooms with fewer than nine walls. In addition to the complete characterization of ambiguities which arise due to this non-uniqueness, we propose a robust solution for inexact measurements similar to analogous results for Euclidean Distance Matrices. Our theoretical results have important consequences for the design of collocated range-only SLAM systems, and we support them with an array of computer experiments.

SDJul 21, 2014
Raking the Cocktail Party

Ivan Dokmanić, Robin Scheibler, Martin Vetterli

We present the concept of an acoustic rake receiver---a microphone beamformer that uses echoes to improve the noise and interference suppression. The rake idea is well-known in wireless communications; it involves constructively combining different multipath components that arrive at the receiver antennas. Unlike spread-spectrum signals used in wireless communications, speech signals are not orthogonal to their shifts. Therefore, we focus on the spatial structure, rather than temporal. Instead of explicitly estimating the channel, we create correspondences between early echoes in time and image sources in space. These multiple sources of the desired and the interfering signal offer additional spatial diversity that we can exploit in the beamformer design. We present several "intuitive" and optimal formulations of acoustic rake receivers, and show theoretically and numerically that the rake formulation of the maximum signal-to-interference-and-noise beamformer offers significant performance boosts in terms of noise and interference suppression. Beyond signal-to-noise ratio, we observe gains in terms of the \emph{perceptual evaluation of speech quality} (PESQ) metric for the speech quality. We accompany the paper by the complete simulation and processing chain written in Python. The code and the sound samples are available online at \url{http://lcav.github.io/AcousticRakeReceiver/}.