Tamir Bendory

h-index11

9papers

56citations

Novelty51%

AI Score46

Ranked #36,426 of 194,257 authors (top 19%)#269 in IV (top 6%)

9 Papers

5.5MLAug 19, 2024

Confirmation Bias in Gaussian Mixture Models

Amnon Balanov, Tamir Bendory, Wasim Huleihel

Confirmation bias, the tendency to interpret information in a way that aligns with one's preconceptions, can profoundly impact scientific research, leading to conclusions that reflect the researcher's hypotheses even when the observational data do not support them. This issue is especially critical in scientific fields involving highly noisy observations, such as cryo-electron microscopy. This study investigates confirmation bias in Gaussian mixture models. We consider the following experiment: A team of scientists assumes they are analyzing data drawn from a Gaussian mixture model with known signals (hypotheses) as centroids. However, in reality, the observations consist entirely of noise without any informative structure. The researchers use a single iteration of the K-means or expectation-maximization algorithms, two popular algorithms to estimate the centroids. Despite the observations being pure noise, we show that these algorithms yield biased estimates that resemble the initial hypotheses, contradicting the unbiased expectation that averaging these noise observations would converge to zero. Namely, the algorithms generate estimates that mirror the postulated model, although the hypotheses (the presumed centroids of the Gaussian mixture) are not evident in the observations. Specifically, among other results, we prove a positive correlation between the estimates produced by the algorithms and the corresponding hypotheses. We also derive explicit closed-form expressions of the estimates for a finite and infinite number of hypotheses. This study underscores the risks of confirmation bias in low signal-to-noise environments, provides insights into potential pitfalls in scientific methodologies, and highlights the importance of prudent data interpretation.

6.8STMay 28

The generalized method of moments is (almost) statistically efficient in low-SNR Gaussian latent-variable models

Amnon Balanov, Tamir Bendory, Dan Edidin

We study estimation in the low signal-to-noise ratio (SNR) regime for a broad class of Gaussian latent-variable models, including Gaussian mixtures and orbit recovery problems. We show that, in this regime, the generalized method-of-moments (GMoM) matches the first-order asymptotic efficiency of maximum likelihood. In particular, if the moment features are chosen up to the minimal local order required for identification and are weighted optimally, then the resulting GMoM estimator has the same leading asymptotic covariance as the maximum-likelihood estimator. Our analysis shows that, in low SNR, this equivalence is governed by a layered local geometry: different directions become informative at different moment orders, partitioning the space into layers with distinct SNR scalings. We prove that the observed Fisher information and the GMoM information operator admit matching layerwise expansions across these layers. As a consequence, in the low-SNR regime, GMoM provides a statistically efficient alternative to maximum likelihood, while preserving the computational advantages of moment-based estimation.

8.1SPMar 30

Sample Complexity Analysis of Multi-Target Detection via Markovian and Hard-Core Multi-Reference Alignment

Kweku Abraham, Amnon Balanov, Tamir Bendory et al.

Motivated by single-particle cryo-electron microscopy, we study the sample complexity of the multi-target detection (MTD) problem, in which an unknown signal appears multiple times at unknown locations within a long, noisy observation. We propose a patching scheme that reduces MTD to a non-i.i.d. multi-reference alignment (MRA) model. In the one-dimensional setting, the latent group elements form a Markov chain, and we show that the convergence rate of any estimator matches that of the corresponding i.i.d. MRA model, up to a logarithmic factor in the number of patches. Moreover, for estimators based on empirical averaging, such as the method of moments, the convergence rates are identical in both settings. We further establish an analogous result in two dimensions, where the latent structure arises from an exponentially mixing random field generated by a hard-core placement model. As a consequence, if the signal in the corresponding i.i.d. MRA model is determined by moments up to order $n_{\min}$, then in the low-SNR regime the number of patches required to estimate the signal in the MTD model scales as $Ï^{2n_{\min}}$, where $Ï^2$ denotes the noise variance.

3.3ITFeb 15, 2022

On the Role of Channel Capacity in Learning Gaussian Mixture Models

Elad Romanov, Tamir Bendory, Or Ordentlich

This paper studies the sample complexity of learning the $k$ unknown centers of a balanced Gaussian mixture model (GMM) in $\mathbb{R}^d$ with spherical covariance matrix $σ^2\mathbf{I}$. In particular, we are interested in the following question: what is the maximal noise level $σ^2$, for which the sample complexity is essentially the same as when estimating the centers from labeled measurements? To that end, we restrict attention to a Bayesian formulation of the problem, where the centers are uniformly distributed on the sphere $\sqrt{d}\mathcal{S}^{d-1}$. Our main results characterize the exact noise threshold $σ^2$ below which the GMM learning problem, in the large system limit $d,k\to\infty$, is as easy as learning from labeled observations, and above which it is substantially harder. The threshold occurs at $\frac{\log k}{d} = \frac12\log\left( 1+\frac{1}{σ^2} \right)$, which is the capacity of the additive white Gaussian noise (AWGN) channel. Thinking of the set of $k$ centers as a code, this noise threshold can be interpreted as the largest noise level for which the error probability of the code over the AWGN channel is small. Previous works on the GMM learning problem have identified the minimum distance between the centers as a key parameter in determining the statistical difficulty of learning the corresponding GMM. While our results are only proved for GMMs whose centers are uniformly distributed over the sphere, they hint that perhaps it is the decoding error probability associated with the center constellation as a channel code that determines the statistical difficulty of learning the corresponding GMM, rather than just the minimum distance.

2.3SPMay 16, 2021Code

An accelerated expectation-maximization algorithm for multi-reference alignment

Noam Janco, Tamir Bendory

The multi-reference alignment (MRA) problem entails estimating an image from multiple noisy and rotated copies of itself. If the noise level is low, one can reconstruct the image by estimating the missing rotations, aligning the images, and averaging out the noise. While accurate rotation estimation is impossible if the noise level is high, the rotations can still be approximated, and thus can provide indispensable information. In particular, learning the approximation error can be harnessed for efficient image estimation. In this paper, we propose a new computational framework, called Synch-EM, that consists of angular synchronization followed by expectation-maximization (EM). The synchronization step results in a concentrated distribution of rotations; this distribution is learned and then incorporated into the EM as a Bayesian prior. The learned distribution also dramatically reduces the search space, and thus the computational load, of the EM iterations. We show by extensive numerical experiments that the proposed framework can significantly accelerate EM for MRA in high noise levels, occasionally by a few orders of magnitude, without degrading the reconstruction quality.

2.4IVJan 19, 2021

Multi-target detection with rotations

Tamir Bendory, Ti-Yen Lan, Nicholas F. Marshall et al.

We consider the multi-target detection problem of estimating a two-dimensional target image from a large noisy measurement image that contains many randomly rotated and translated copies of the target image. Motivated by single-particle cryo-electron microscopy, we focus on the low signal-to-noise regime, where it is difficult to estimate the locations and orientations of the target images in the measurement. Our approach uses autocorrelation analysis to estimate rotationally and translationally invariant features of the target image. We demonstrate that, regardless of the level of noise, our technique can be used to recover the target image when the measurement is sufficiently large.

5.1IVOct 22, 2019

Image recovery from rotational and translational invariants

Nicholas F. Marshall, Ti-Yen Lan, Tamir Bendory et al.

We introduce a framework for recovering an image from its rotationally and translationally invariant features based on autocorrelation analysis. This work is an instance of the multi-target detection statistical model, which is mainly used to study the mathematical and computational properties of single-particle reconstruction using cryo-electron microscopy (cryo-EM) at low signal-to-noise ratios. We demonstrate with synthetic numerical experiments that an image can be reconstructed from rotationally and translationally invariant features and show that the reconstruction is robust to noise. These results constitute an important step towards the goal of structure determination of small biomolecules using cryo-EM.

7.6IVOct 12, 2018Code

Heterogeneous multireference alignment for images with application to 2-D classification in single particle reconstruction

Chao Ma, Tamir Bendory, Nicolas Boumal et al.

Motivated by the task of 2-D classification in single particle reconstruction by cryo-electron microscopy (cryo-EM), we consider the problem of heterogeneous multireference alignment of images. In this problem, the goal is to estimate a (typically small) set of target images from a (typically large) collection of observations. Each observation is a rotated, noisy version of one of the target images. For each individual observation, neither the rotation nor which target image has been rotated are known. As the noise level in cryo-EM data is high, clustering the observations and estimating individual rotations is challenging. We propose a framework to estimate the target images directly from the observations, completely bypassing the need to cluster or register the images. The framework consists of two steps. First, we estimate rotation-invariant features of the images, such as the bispectrum. These features can be estimated to any desired accuracy, at any noise level, provided sufficiently many observations are collected. Then, we estimate the images from the invariant features. Numerical experiments on synthetic cryo-EM datasets demonstrate the effectiveness of the method. Ultimately, we outline future developments required to apply this method to experimental data.

1.9CVDec 12, 2014

Edge Preserving Multi-Modal Registration Based On Gradient Intensity Self-Similarity

Tamar Rott, Dorin Shriki, Tamir Bendory

Image registration is a challenging task in the world of medical imaging. Particularly, accurate edge registration plays a central role in a variety of clinical conditions. The Modality Independent Neighbourhood Descriptor (MIND) demonstrates state of the art alignment, based on the image self-similarity. However, this method appears to be less accurate regarding edge registration. In this work, we propose a new registration method, incorporating gradient intensity and MIND self-similarity metric. Experimental results show the superiority of this method in edge registration tasks, while preserving the original MIND performance for other image features and textures.