Lee-Ad Gottlieb

12papers

332citations

Novelty62%

AI Score31

Ranked #143,970 of 201,326 authors (top 72%)#31,666 in LG (top 74%)

12 Papers

CCAug 9, 2010

Matrix sparsification and the sparse null space problem

Lee-Ad Gottlieb, Tyler Neylon

We revisit the matrix problems sparse null space and matrix sparsification, and show that they are equivalent. We then proceed to seek algorithms for these problems: We prove the hardness of approximation of these problems, and also give a powerful tool to extend algorithms and heuristics for sparse approximation theory to these problems.

LGOct 24, 2023

Weighted Distance Nearest Neighbor Condensing

Lee-Ad Gottlieb, Timor Sharabi, Roi Weiss

The problem of nearest neighbor condensing has enjoyed a long history of study, both in its theoretical and practical aspects. In this paper, we introduce the problem of weighted distance nearest neighbor condensing, where one assigns weights to each point of the condensed set, and then new points are labeled based on their weighted distance nearest neighbor in the condensed set. We study the theoretical properties of this new model, and show that it can produce dramatically better condensing than the standard nearest neighbor rule, yet is characterized by generalization bounds almost identical to the latter. We then suggest a condensing heuristic for our new problem. We demonstrate Bayes consistency for this heuristic, and also show promising empirical results.

LGMay 12, 2023

Using Deepfake Technologies for Word Emphasis Detection

Eran Kaufman, Lee-Ad Gottlieb

In this work, we consider the task of automated emphasis detection for spoken language. This problem is challenging in that emphasis is affected by the particularities of speech of the subject, for example the subject accent, dialect or voice. To address this task, we propose to utilize deep fake technology to produce an emphasis devoid speech for this speaker. This requires extracting the text of the spoken voice, and then using a voice sample from the same speaker to produce emphasis devoid speech for this task. By comparing the generated speech with the spoken voice, we are able to isolate patterns of emphasis which are relatively easy to detect.

STJul 13, 2020

Functions with average smoothness: structure, algorithms, and learning

Yair Ashlagi, Lee-Ad Gottlieb, Aryeh Kontorovich

We initiate a program of average smoothness analysis for efficiently learning real-valued functions on metric spaces. Rather than using the Lipschitz constant as the regularizer, we define a local slope at each point and gauge the function complexity as the average of these values. Since the mean can be dramatically smaller than the maximum, this complexity measure can yield considerably sharper generalization bounds -- assuming that these admit a refinement where the Lipschitz constant is replaced by our average of local slopes. Our first major contribution is to obtain just such distribution-sensitive bounds. This required overcoming a number of technical challenges, perhaps the most formidable of which was bounding the {\em empirical} covering numbers, which can be much worse-behaved than the ambient ones. Our combinatorial results are accompanied by efficient algorithms for smoothing the labels of the random sample, as well as guarantees that the extension from the sample to the whole space will continue to be, with high probability, smooth on average. Along the way we discover a surprisingly rich combinatorial and analytic structure in the function class we define.

LGFeb 5, 2020

Nested Barycentric Coordinate System as an Explicit Feature Map

Lee-Ad Gottlieb, Eran Kaufman, Aryeh Kontorovich et al.

We propose a new embedding method which is particularly well-suited for settings where the sample size greatly exceeds the ambient dimension. Our technique consists of partitioning the space into simplices and then embedding the data points into features corresponding to the simplices' barycentric coordinates. We then train a linear classifier in the rich feature space obtained from the simplices. The decision boundary may be highly non-linear, though it is linear within each simplex (and hence piecewise-linear overall). Further, our method can approximate any convex body. We give generalization bounds based on empirical margin and a novel hybrid sample compression technique. An extensive empirical evaluation shows that our method consistently outperforms a range of popular kernel embedding methods.

LGFeb 4, 2020

Apportioned Margin Approach for Cost Sensitive Large Margin Classifiers

Lee-Ad Gottlieb, Eran Kaufman, Aryeh Kontorovich

We consider the problem of cost sensitive multiclass classification, where we would like to increase the sensitivity of an important class at the expense of a less important one. We adopt an {\em apportioned margin} framework to address this problem, which enables an efficient margin shift between classes that share the same boundary. The decision boundary between all pairs of classes divides the margin between them in accordance to a given prioritization vector, which yields a tighter error bound for the important classes while also reducing the overall out-of-sample error. In addition to demonstrating an efficient implementation of our framework, we derive generalization bounds, demonstrate Fisher consistency, adapt the framework to Mercer's kernel and to neural networks, and report promising empirical results on all accounts.

LGSep 22, 2019

Classification in asymmetric spaces via sample compression

Lee-Ad Gottlieb, Shira Ozeri

We initiate the rigorous study of classification in quasi-metric spaces. These are point sets endowed with a distance function that is non-negative and also satisfies the triangle inequality, but is asymmetric. We develop and refine a learning algorithm for quasi-metrics based on sample compression and nearest neighbor, and prove that it has favorable statistical properties.

LGMay 24, 2018

Learning convex polyhedra with margin

Lee-Ad Gottlieb, Eran Kaufman, Aryeh Kontorovich et al.

We present an improved algorithm for {\em quasi-properly} learning convex polyhedra in the realizable PAC setting from data with a margin. Our learning algorithm constructs a consistent polyhedron as an intersection of about $t \log t$ halfspaces with constant-size margins in time polynomial in $t$ (where $t$ is the number of halfspaces forming an optimal polyhedron). We also identify distinct generalizations of the notion of margin from hyperplanes to polyhedra and investigate how they relate geometrically; this result may have ramifications beyond the learning setting.

LGFeb 22, 2015

Nearly optimal classification for semimetrics

Lee-Ad Gottlieb, Aryeh Kontorovich

We initiate the rigorous study of classification in semimetric spaces, which are point sets with a distance function that is non-negative and symmetric, but need not satisfy the triangle inequality. For metric spaces, the doubling dimension essentially characterizes both the runtime and sample complexity of classification algorithms --- yet we show that this is not the case for semimetrics. Instead, we define the {\em density dimension} and discover that it plays a central role in the statistical and algorithmic feasibility of learning in semimetric spaces. We present nearly optimal sample compression algorithms and use these to obtain generalization guarantees, including fast rates. The latter hold for general sample compression schemes and may be of independent interest.

LGApr 13, 2014

Near-optimal sample compression for nearest neighbors

Lee-Ad Gottlieb, Aryeh Kontorovich, Pinhas Nisnevitch

We present the first sample compression algorithm for nearest neighbors with non-trivial performance guarantees. We complement these guarantees by demonstrating almost matching hardness lower bounds, which show that our bound is nearly optimal. Our result yields new insight into margin-based nearest neighbor classification in metric spaces and allows us to significantly sharpen and simplify existing bounds. Some encouraging empirical results are also presented.

LGJun 11, 2013

Efficient Classification for Metric Data

Lee-Ad Gottlieb, Aryeh Kontorovich, Robert Krauthgamer

Recent advances in large-margin classification of data residing in general metric spaces (rather than Hilbert spaces) enable classification under various natural metrics, such as string edit and earthmover distance. A general framework developed for this purpose by von Luxburg and Bousquet [JMLR, 2004] left open the questions of computational efficiency and of providing direct bounds on generalization error. We design a new algorithm for classification in general metric spaces, whose runtime and accuracy depend on the doubling dimension of the data points, and can thus achieve superior classification performance in many common scenarios. The algorithmic core of our approach is an approximate (rather than exact) solution to the classical problems of Lipschitz extension and of Nearest Neighbor Search. The algorithm's generalization performance is guaranteed via the fat-shattering dimension of Lipschitz classifiers, and we present experimental evidence of its superiority to some common kernel methods. As a by-product, we offer a new perspective on the nearest neighbor classifier, which yields significantly sharper risk asymptotics than the classic analysis of Cover and Hart [IEEE Trans. Info. Theory, 1967].

LGFeb 12, 2013

Adaptive Metric Dimensionality Reduction

Lee-Ad Gottlieb, Aryeh Kontorovich, Robert Krauthgamer

We study adaptive data-dependent dimensionality reduction in the context of supervised learning in general metric spaces. Our main statistical contribution is a generalization bound for Lipschitz functions in metric spaces that are doubling, or nearly doubling. On the algorithmic front, we describe an analogue of PCA for metric spaces: namely an efficient procedure that approximates the data's intrinsic dimension, which is often much lower than the ambient dimension. Our approach thus leverages the dual benefits of low dimensionality: (1) more efficient algorithms, e.g., for proximity search, and (2) more optimistic generalization bounds.