Roi Weiss

LG
h-index10
13papers
239citations
Novelty61%
AI Score31

13 Papers

NAMar 4, 2018
Newton correction methods for computing real eigenpairs of symmetric tensors

Ariel Jaffe, Roi Weiss, Boaz Nadler

Real eigenpairs of symmetric tensors play an important role in multiple applications. In this paper we propose and analyze a fast iterative Newton-based method to compute real eigenpairs of symmetric tensors. We derive sufficient conditions for a real eigenpair to be a stable fixed point for our method, and prove that given a sufficiently close initial guess, the convergence rate is quadratic. Empirically, our method converges to a significantly larger number of eigenpairs compared to previously proposed iterative methods, and with enough random initializations typically finds all real eigenpairs. In particular, for a generic symmetric tensor, the sufficient conditions for local convergence of our Newton-based method hold simultaneously for all its real eigenpairs.

LGOct 24, 2023
Weighted Distance Nearest Neighbor Condensing

Lee-Ad Gottlieb, Timor Sharabi, Roi Weiss

The problem of nearest neighbor condensing has enjoyed a long history of study, both in its theoretical and practical aspects. In this paper, we introduce the problem of weighted distance nearest neighbor condensing, where one assigns weights to each point of the condensed set, and then new points are labeled based on their weighted distance nearest neighbor in the condensed set. We study the theoretical properties of this new model, and show that it can produce dramatically better condensing than the standard nearest neighbor rule, yet is characterized by generalization bounds almost identical to the latter. We then suggest a condensing heuristic for our new problem. We demonstrate Bayes consistency for this heuristic, and also show promising empirical results.

LGJun 16, 2022
On Error and Compression Rates for Prototype Rules

Omer Kerem, Roi Weiss

We study the close interplay between error and compression in the non-parametric multiclass classification setting in terms of prototype learning rules. We focus in particular on a recently proposed compression-based learning rule termed OptiNet (Kontorovich, Sabato, and Urner 2016; Kontorovich, Sabato, and Weiss 2017; Hanneke et al. 2021). Beyond its computational merits, this rule has been recently shown to be universally consistent in any metric instance space that admits a universally consistent rule--the first learning algorithm known to enjoy this property. However, its error and compression rates have been left open. Here we derive such rates in the case where instances reside in Euclidean space under commonly posed smoothness and tail conditions on the data distribution. We first show that OptiNet achieves non-trivial compression rates while enjoying near minimax-optimal error rates. We then proceed to study a novel general compression scheme for further compressing prototype rules that locally adapts to the noise level without sacrificing accuracy. Applying it to OptiNet, we show that under a geometric margin condition, further gain in the compression rate is achieved. Experimental results comparing the performance of the various methods are presented.

PRFeb 16, 2024
Resilience of Rademacher chaos of low degree

Elad Aigner-Horev, Daniel Rosenberg, Roi Weiss

The resilience of a Rademacher chaos is the maximum number of adversarial sign-flips that the chaos can sustain without having its largest atom probability significantly altered. Inspired by probabilistic lower-bound guarantees for the resilience of linear Rademacher chaos, obtained by Bandeira, Ferber, and Kwan (Advances in Mathematics, Vol. $319$, $2017$), we provide probabilistic lower-bound guarantees for the resilience of Rademacher chaos of arbitrary yet sufficiently low degree. Our main results distinguish between Rademacher chaos of order two and those of higher order. In that, our first main result pertains to the resilience of decoupled bilinear Rademacher forms where different asymptotic behaviour is observed for sparse and dense matrices. For our second main result, we bootstrap our first result in order to provide resilience guarantees for quadratic Rademacher chaos. Our third main result, generalises the first and handles the resilience of decoupled Rademacher chaos of arbitrary yet sufficiently low order. Our results for decoupled Rademacher chaos of order two and that of higher order whilst are established through the same conceptual framework, differ substantially. A difference incurred due to the implementation of the same conceptual argument. The order two result is established using Dudley's maximal inequality for sub-Gaussian processes, the Hanson-Wright inequality, as well as the Kolmogorov-Rogozin inequality. To handle higher order chaos, appeals to Dudley's inequality as well as the Hanson-Wright inequality are replaced with tools suited for random tensors. Appeals to the Hanson-Wright inequality are replaced with appeals to a concentration result for random tensors put forth by Adamczak and Wolff. Our results are instance-dependent and thus allow for the efficient computation of resilience guarantees provided the order of the chaos is constant.

STNov 23, 2021
Tree density estimation

László Györfi, Aryeh Kontorovich, Roi Weiss

We study the problem of estimating the density $f(\boldsymbol x)$ of a random vector ${\boldsymbol X}$ in $\mathbb R^d$. For a spanning tree $T$ defined on the vertex set $\{1,\dots ,d\}$, the tree density $f_{T}$ is a product of bivariate conditional densities. An optimal spanning tree minimizes the Kullback-Leibler divergence between $f$ and $f_{T}$. From i.i.d. data we identify an optimal tree $T^*$ and efficiently construct a tree density estimate $f_n$ such that, without any regularity conditions on the density $f$, one has $\lim_{n\to \infty} \int |f_n(\boldsymbol x)-f_{T^*}(\boldsymbol x)|d\boldsymbol x=0$ a.s. For Lipschitz $f$ with bounded support, $\mathbb E \left\{ \int |f_n(\boldsymbol x)-f_{T^*}(\boldsymbol x)|d\boldsymbol x\right\}=O\big(n^{-1/4}\big)$, a dimension-free rate.

LGOct 1, 2020
Universal consistency and rates of convergence of multiclass prototype algorithms in metric spaces

László Györfi, Roi Weiss

We study universal consistency and convergence rates of simple nearest-neighbor prototype rules for the problem of multiclass classification in metric paces. We first show that a novel data-dependent partitioning rule, named Proto-NN, is universally consistent in any metric space that admits a universally consistent rule. Proto-NN is a significant simplification of OptiNet, a recently proposed compression-based algorithm that, to date, was the only algorithm known to be universally consistent in such a general setting. Practically, Proto-NN is simpler to implement and enjoys reduced computational complexity. We then proceed to study convergence rates of the excess error probability. We first obtain rates for the standard $k$-NN rule under a margin condition and a new generalized-Lipschitz condition. The latter is an extension of a recently proposed modified-Lipschitz condition from $\mathbb R^d$ to metric spaces. Similarly to the modified-Lipschitz condition, the new condition avoids any boundness assumptions on the data distribution. While obtaining rates for Proto-NN is left open, we show that a second prototype rule that hybridizes between $k$-NN and Proto-NN achieves the same rates as $k$-NN while enjoying similar computational advantages as Proto-NN. However, as $k$-NN, this hybrid rule is not consistent in general.

LGJun 24, 2019
Universal Bayes consistency in metric spaces

Steve Hanneke, Aryeh Kontorovich, Sivan Sabato et al.

We extend a recently proposed 1-nearest-neighbor based multiclass learning algorithm and prove that our modification is universally strongly Bayes-consistent in all metric spaces admitting any such learner, making it an "optimistically universal" Bayes-consistent learner. This is the first learning algorithm known to enjoy this property; by comparison, the $k$-NN classifier and its variants are not generally universally Bayes-consistent, except under additional structural assumptions, such as an inner product, a norm, finite dimension, or a Besicovitch-type property. The metric spaces in which universal Bayes consistency is possible are the "essentially separable" ones -- a notion that we define, which is more general than standard separability. The existence of metric spaces that are not essentially separable is widely believed to be independent of the ZFC axioms of set theory. We prove that essential separability exactly characterizes the existence of a universal Bayes-consistent learner for the given metric space. In particular, this yields the first impossibility result for universal Bayes consistency. Taken together, our results completely characterize strong and weak universal Bayes consistency in metric spaces.

MLFeb 27, 2018
Learning Binary Latent Variable Models: A Tensor Eigenpair Approach

Ariel Jaffe, Roi Weiss, Shai Carmi et al.

Latent variable models with hidden binary units appear in various applications. Learning such models, in particular in the presence of noise, is a challenging computational problem. In this paper we propose a novel spectral approach to this problem, based on the eigenvectors of both the second order moment matrix and third order moment tensor of the observed data. We prove that under mild non-degeneracy conditions, our method consistently estimates the model parameters at the optimal parametric rate. Our tensor-based method generalizes previous orthogonal tensor decomposition approaches, where the hidden units were assumed to be either statistically independent or mutually exclusive. We illustrate the consistency of our method on simulated data and demonstrate its usefulness in learning a common model for population mixtures in genetics.

LGMay 23, 2017
Nearest-Neighbor Sample Compression: Efficiency, Consistency, Infinite Dimensions

Aryeh Kontorovich, Sivan Sabato, Roi Weiss

We examine the Bayes-consistency of a recently proposed 1-nearest-neighbor-based multiclass learning algorithm. This algorithm is derived from sample compression bounds and enjoys the statistical advantages of tight, fully empirical generalization bounds, as well as the algorithmic advantages of a faster runtime and memory savings. We prove that this algorithm is strongly Bayes-consistent in metric spaces with finite doubling dimension --- the first consistency result for an efficient nearest-neighbor sample compression scheme. Rather surprisingly, we discover that this algorithm continues to be Bayes-consistent even in a certain infinite-dimensional setting, in which the basic measure-theoretic conditions on which classic consistency proofs hinge are violated. This is all the more surprising, since it is known that $k$-NN is not Bayes-consistent in this setting. We pose several challenging open problems for future research.

LGFeb 7, 2015
Learning Parametric-Output HMMs with Two Aliased States

Roi Weiss, Boaz Nadler

In various applications involving hidden Markov models (HMMs), some of the hidden states are aliased, having identical output distributions. The minimality, identifiability and learnability of such aliased HMMs have been long standing problems, with only partial solutions provided thus far. In this paper we focus on parametric-output HMMs, whose output distributions come from a parametric family, and that have exactly two aliased states. For this class, we present a complete characterization of their minimality and identifiability. Furthermore, for a large family of parametric output distributions, we derive computationally efficient and statistically consistent algorithms to detect the presence of aliasing and learn the aliased HMM transition and emission parameters. We illustrate our theoretical analysis by several simulations.

LGJul 1, 2014
A Bayes consistent 1-NN classifier

Aryeh Kontorovich, Roi Weiss

We show that a simple modification of the 1-nearest neighbor classifier yields a strongly Bayes consistent learner. Prior to this work, the only strongly Bayes consistent proximity-based method was the k-nearest neighbor classifier, for k growing appropriately with sample size. We will argue that a margin-regularized 1-NN enjoys considerable statistical and algorithmic advantages over the k-NN classifier. These include user-friendly finite-sample error bounds, as well as time- and memory-efficient learning and test-point evaluation algorithms with a principled speed-accuracy tradeoff. Encouraging empirical results are reported.

LGJan 30, 2014
Maximum Margin Multiclass Nearest Neighbors

Aryeh Kontorovich, Roi Weiss

We develop a general framework for margin-based multicategory classification in metric spaces. The basic work-horse is a margin-regularized version of the nearest-neighbor classifier. We prove generalization bounds that match the state of the art in sample size $n$ and significantly improve the dependence on the number of classes $k$. Our point of departure is a nearly Bayes-optimal finite-sample risk bound independent of $k$. Although $k$-free, this bound is unregularized and non-adaptive, which motivates our main result: Rademacher and scale-sensitive margin bounds with a logarithmic dependence on $k$. As the best previous risk estimates in this setting were of order $\sqrt k$, our bound is exponentially sharper. From the algorithmic standpoint, in doubling metric spaces our classifier may be trained on $n$ examples in $O(n^2\log n)$ time and evaluated on new points in $O(\log n)$ time.

LGFeb 25, 2013
On learning parametric-output HMMs

Aryeh Kontorovich, Boaz Nadler, Roi Weiss

We present a novel approach for learning an HMM whose outputs are distributed according to a parametric family. This is done by {\em decoupling} the learning task into two steps: first estimating the output parameters, and then estimating the hidden states transition probabilities. The first step is accomplished by fitting a mixture model to the output stationary distribution. Given the parameters of this mixture model, the second step is formulated as the solution of an easily solvable convex quadratic program. We provide an error analysis for the estimated transition probabilities and show they are robust to small perturbations in the estimates of the mixture parameters. Finally, we support our analysis with some encouraging empirical results.