h-index9
39papers
605citations
Novelty56%
AI Score49

39 Papers

LGMay 20, 2022
Towards Consistency in Adversarial Classification

Laurent Meunier, Raphaël Ettedgui, Rafael Pinot et al.

In this paper, we study the problem of consistency in the context of adversarial examples. Specifically, we tackle the following question: can surrogate losses still be used as a proxy for minimizing the $0/1$ loss in the presence of an adversary that alters the inputs at test-time? Different from the standard classification task, this question cannot be reduced to a point-wise minimization problem, and calibration needs not to be sufficient to ensure consistency. In this paper, we expose some pathological behaviors specific to the adversarial problem, and show that no convex surrogate loss can be consistent or calibrated in this context. It is therefore necessary to design another class of surrogate functions that can be used to solve the adversarial consistency issue. As a first step towards designing such a class, we identify sufficient and necessary conditions for a surrogate loss to be calibrated in both the adversarial and standard settings. Finally, we give some directions for building a class of losses that could be consistent in the adversarial framework.

LGJun 3, 2022
Towards Evading the Limits of Randomized Smoothing: A Theoretical Analysis

Raphael Ettedgui, Alexandre Araujo, Rafael Pinot et al.

Randomized smoothing is the dominant standard for provable defenses against adversarial examples. Nevertheless, this method has recently been proven to suffer from important information theoretic limitations. In this paper, we argue that these limitations are not intrinsic, but merely a byproduct of current certification methods. We first show that these certificates use too little information about the classifier, and are in particular blind to the local curvature of the decision boundary. This leads to severely sub-optimal robustness guarantees as the dimension of the problem increases. We then show that it is theoretically possible to bypass this issue by collecting more information about the classifier. More precisely, we show that it is possible to approximate the optimal certificate with arbitrary precision, by probing the decision boundary with several noise distributions. Since this process is executed at certification time rather than at test time, it entails no loss in natural accuracy while enhancing the quality of the certificates. This result fosters further research on classifier-specific certification and demonstrates that randomized smoothing is still worth investigating. Although classifier-specific certification may induce more computational cost, we also provide some theoretical insight on how to mitigate it.

MLFeb 17
Generalized Leverage Score for Scalable Assessment of Privacy Vulnerability

Valentin Dorseuil, Jamal Atif, Olivier Cappé

Can the privacy vulnerability of individual data points be assessed without retraining models or explicitly simulating attacks? We answer affirmatively by showing that exposure to membership inference attack (MIA) is fundamentally governed by a data point's influence on the learned model. We formalize this in the linear setting by establishing a theoretical correspondence between individual MIA risk and the leverage score, identifying it as a principled metric for vulnerability. This characterization explains how data-dependent sensitivity translates into exposure, without the computational burden of training shadow models. Building on this, we propose a computationally efficient generalization of the leverage score for deep learning. Empirical evaluations confirm a strong correlation between the proposed score and MIA success, validating this metric as a practical surrogate for individual privacy risk assessment.

LGFeb 17
Certified Per-Instance Unlearning Using Individual Sensitivity Bounds

Hanna Benarroch, Jamal Atif, Olivier Cappé

Certified machine unlearning can be achieved via noise injection leading to differential privacy guarantees, where noise is calibrated to worst-case sensitivity. Such conservative calibration often results in performance degradation, limiting practical applicability. In this work, we investigate an alternative approach based on adaptive per-instance noise calibration tailored to the individual contribution of each data point to the learned solution. This raises the following challenge: how can one establish formal unlearning guarantees when the mechanism depends on the specific point to be removed? To define individual data point sensitivities in noisy gradient dynamics, we consider the use of per-instance differential privacy. For ridge regression trained via Langevin dynamics, we derive high-probability per-instance sensitivity bounds, yielding certified unlearning with substantially less noise injection. We corroborate our theoretical findings through experiments in linear settings and provide further empirical evidence on the relevance of the approach in deep learning settings.

LGNov 4, 2024
Optimal Classification under Performative Distribution Shift

Edwige Cyffers, Muni Sreenivas Pydi, Jamal Atif et al.

Performative learning addresses the increasingly pervasive situations in which algorithmic decisions may induce changes in the data distribution as a consequence of their public deployment. We propose a novel view in which these performative effects are modelled as push-forward measures. This general framework encompasses existing models and enables novel performative gradient estimation methods, leading to more efficient and scalable learning strategies. For distribution shifts, unlike previous models which require full specification of the data distribution, we only assume knowledge of the shift operator that represents the performative changes. This approach can also be integrated into various change-of-variablebased models, such as VAEs or normalizing flows. Focusing on classification with a linear-in-parameters performative effect, we prove the convexity of the performative risk under a new set of assumptions. Notably, we do not limit the strength of performative effects but rather their direction, requiring only that classification becomes harder when deploying more accurate models. In this case, we also establish a connection with adversarially robust classification by reformulating the minimization of the performative risk as a min-max variational problem. Finally, we illustrate our approach on synthetic and real datasets.

MLDec 13, 2023
Differentially Private Gradient Flow based on the Sliced Wasserstein Distance

Ilana Sebag, Muni Sreenivas Pydi, Jean-Yves Franceschi et al.

Safeguarding privacy in sensitive training data is paramount, particularly in the context of generative modeling. This can be achieved through either differentially private stochastic gradient descent or a differentially private metric for training models or generators. In this paper, we introduce a novel differentially private generative modeling approach based on a gradient flow in the space of probability measures. To this end, we define the gradient flow of the Gaussian-smoothed Sliced Wasserstein Distance, including the associated stochastic differential equation (SDE). By discretizing and defining a numerical scheme for solving this SDE, we demonstrate the link between smoothing and differential privacy based on a Gaussian mechanism, due to a specific form of the SDE's drift term. We then analyze the differential privacy guarantee of our gradient flow, which accounts for both the smoothing and the Wiener process introduced by the SDE itself. Experiments show that our proposed model can generate higher-fidelity data at a low privacy budget compared to a generator-based model, offering a promising alternative.

LGSep 3, 2025
On the MIA Vulnerability Gap Between Private GANs and Diffusion Models

Ilana Sebag, Jean-Yves Franceschi, Alain Rakotomamonjy et al.

Generative Adversarial Networks (GANs) and diffusion models have emerged as leading approaches for high-quality image synthesis. While both can be trained under differential privacy (DP) to protect sensitive data, their sensitivity to membership inference attacks (MIAs), a key threat to data confidentiality, remains poorly understood. In this work, we present the first unified theoretical and empirical analysis of the privacy risks faced by differentially private generative models. We begin by showing, through a stability-based analysis, that GANs exhibit fundamentally lower sensitivity to data perturbations than diffusion models, suggesting a structural advantage in resisting MIAs. We then validate this insight with a comprehensive empirical study using a standardized MIA pipeline to evaluate privacy leakage across datasets and privacy budgets. Our results consistently reveal a marked privacy robustness gap in favor of GANs, even in strong DP regimes, highlighting that model type alone can critically shape privacy leakage.

LGDec 8, 2021
Non parametric estimation of causal populations in a counterfactual scenario

Celine Beji, Florian Yger, Jamal Atif

In causality, estimating the effect of a treatment without confounding inference remains a major issue because requires to assess the outcome in both case with and without treatment. Not being able to observe simultaneously both of them, the estimation of potential outcome remains a challenging task. We propose an innovative approach where the problem is reformulated as a missing data model. The aim is to estimate the hidden distribution of \emph{causal populations}, defined as a function of treatment and outcome. A Causal Auto-Encoder (CAE), enhanced by a prior dependent on treatment and outcome information, assimilates the latent space to the probability distribution of the target populations. The features are reconstructed after being reduced to a latent space and constrained by a mask introduced in the intermediate layer of the network, containing treatment and outcome information.

IROct 28, 2021
Two-sided fairness in rankings via Lorenz dominance

Virginie Do, Sam Corbett-Davies, Jamal Atif et al.

We consider the problem of generating rankings that are fair towards both users and item producers in recommender systems. We address both usual recommendation (e.g., of music or movies) and reciprocal recommendation (e.g., dating). Following concepts of distributive justice in welfare economics, our notion of fairness aims at increasing the utility of the worse-off individuals, which we formalize using the criterion of Lorenz efficiency. It guarantees that rankings are Pareto efficient, and that they maximally redistribute utility from better-off to worse-off, at a given level of overall utility. We propose to generate rankings by maximizing concave welfare functions, and develop an efficient inference procedure based on the Frank-Wolfe algorithm. We prove that unlike existing approaches based on fairness constraints, our approach always produces fair rankings. Our experiments also show that it increases the utility of the worse-off at lower costs in terms of overall utility.

AIMay 19, 2021
Online Selection of Diverse Committees

Virginie Do, Jamal Atif, Jérôme Lang et al.

Citizens' assemblies need to represent subpopulations according to their proportions in the general population. These large committees are often constructed in an online fashion by contacting people, asking for the demographic features of the volunteers, and deciding to include them or not. This raises a trade-off between the number of people contacted (and the incurring cost) and the representativeness of the committee. We study three methods, theoretically and experimentally: a greedy algorithm that includes volunteers as long as proportionality is not violated; a non-adaptive method that includes a volunteer with a probability depending only on their features, assuming that the joint feature distribution in the volunteer pool is known; and a reinforcement learning based approach when this distribution is not known a priori but learnt online.

LGApr 29, 2021
Online certification of preference-based fairness for personalized recommender systems

Virginie Do, Sam Corbett-Davies, Jamal Atif et al.

Recommender systems are facing scrutiny because of their growing impact on the opportunities we have access to. Current audits for fairness are limited to coarse-grained parity assessments at the level of sensitive groups. We propose to audit for envy-freeness, a more granular criterion aligned with individual preferences: every user should prefer their recommendations to those of other users. Since auditing for envy requires to estimate the preferences of users beyond their existing recommendations, we cast the audit as a new pure exploration problem in multi-armed bandits. We propose a sample-efficient algorithm with theoretical guarantees that it does not deteriorate user experience. We also study the trade-offs achieved on real-world recommendation datasets.

LGFeb 22, 2021
On the robustness of randomized classifiers to adversarial examples

Rafael Pinot, Laurent Meunier, Florian Yger et al.

This paper investigates the theory of robustness against adversarial attacks. We focus on randomized classifiers (\emph{i.e.} classifiers that output random variables) and provide a thorough analysis of their behavior through the lens of statistical learning theory and information theory. To this aim, we introduce a new notion of robustness for randomized classifiers, enforcing local Lipschitzness using probability metrics. Equipped with this definition, we make two new contributions. The first one consists in devising a new upper bound on the adversarial generalization gap of randomized classifiers. More precisely, we devise bounds on the generalization gap and the adversarial gap (\emph{i.e.} the gap between the risk and the worst-case risk under attack) of randomized classifiers. The second contribution presents a yet simple but efficient noise injection method to design robust randomized classifiers. We show that our results are applicable to a wide range of machine learning models under mild hypotheses. We further corroborate our findings with experimental results using deep neural networks on standard image datasets, namely CIFAR-10 and CIFAR-100. All robust models we trained models can simultaneously achieve state-of-the-art accuracy (over $0.82$ clean accuracy on CIFAR-10) and enjoy \emph{guaranteed} robust accuracy bounds ($0.45$ against $\ell_2$ adversaries with magnitude $0.5$ on CIFAR-10).

GTFeb 13, 2021
Mixed Nash Equilibria in the Adversarial Examples Game

Laurent Meunier, Meyer Scetbon, Rafael Pinot et al.

This paper tackles the problem of adversarial examples from a game theoretic point of view. We study the open question of the existence of mixed Nash equilibria in the zero-sum game formed by the attacker and the classifier. While previous works usually allow only one player to use randomized strategies, we show the necessity of considering randomization for both the classifier and the attacker. We demonstrate that this game has no duality gap, meaning that it always admits approximate Nash equilibria. We also provide the first optimization algorithms to learn a mixture of classifiers that approximately realizes the value of this game, \emph{i.e.} procedures to build an optimally robust randomized classifier.

LGSep 30, 2020
AAMDRL: Augmented Asset Management with Deep Reinforcement Learning

Eric Benhamou, David Saltiel, Sandrine Ungari et al.

Can an agent learn efficiently in a noisy and self adapting environment with sequential, non-stationary and non-homogeneous observations? Through trading bots, we illustrate how Deep Reinforcement Learning (DRL) can tackle this challenge. Our contributions are threefold: (i) the use of contextual information also referred to as augmented state in DRL, (ii) the impact of a one period lag between observations and actions that is more realistic for an asset management environment, (iii) the implementation of a new repetitive train test method called walk forward analysis, similar in spirit to cross validation for time series. Although our experiment is on trading bots, it can easily be translated to other bot environments that operate in sequential environment with regime changes and noisy data. Our experiment for an augmented asset manager interested in finding the best portfolio for hedging strategies shows that AAMDRL achieves superior returns and lower risk.

PMSep 7, 2020
Detecting and adapting to crisis pattern with context based Deep Reinforcement Learning

Eric Benhamou, David Saltiel, Jean-Jacques Ohana et al.

Deep reinforcement learning (DRL) has reached super human levels in complex tasks like game solving (Go and autonomous driving). However, it remains an open question whether DRL can reach human level in applications to financial problems and in particular in detecting pattern crisis and consequently dis-investing. In this paper, we present an innovative DRL framework consisting in two sub-networks fed respectively with portfolio strategies past performances and standard deviations as well as additional contextual features. The second sub network plays an important role as it captures dependencies with common financial indicators features like risk aversion, economic surprise index and correlations between assets that allows taking into account context based information. We compare different network architectures either using layers of convolutions to reduce network's complexity or LSTM block to capture time dependency and whether previous allocations is important in the modeling. We also use adversarial training to make the final model more robust. Results on test set show this approach substantially over-performs traditional portfolio optimization methods like Markowitz and is able to detect and anticipate crisis like the current Covid one.

LGJun 15, 2020
On Lipschitz Regularization of Convolutional Layers using Toeplitz Matrix Theory

Alexandre Araujo, Benjamin Negrevergne, Yann Chevaleyre et al.

This paper tackles the problem of Lipschitz regularization of Convolutional Neural Networks. Lipschitz regularity is now established as a key property of modern deep learning with implications in training stability, generalization, robustness against adversarial examples, etc. However, computing the exact value of the Lipschitz constant of a neural network is known to be NP-hard. Recent attempts from the literature introduce upper bounds to approximate this constant that are either efficient but loose or accurate but computationally expensive. In this work, by leveraging the theory of Toeplitz matrices, we introduce a new upper bound for convolutional layers that is both tight and easy to compute. Based on this result we devise an algorithm to train Lipschitz regularized Convolutional Neural Networks.

MLJun 12, 2020
Equitable and Optimal Transport with Multiple Agents

Meyer Scetbon, Laurent Meunier, Jamal Atif et al.

We introduce an extension of the Optimal Transport problem when multiple costs are involved. Considering each cost as an agent, we aim to share equally between agents the work of transporting one distribution to another. To do so, we minimize the transportation cost of the agent who works the most. Another point of view is when the goal is to partition equitably goods between agents according to their heterogeneous preferences. Here we aim to maximize the utility of the least advantaged agent. This is a fair division problem. Like Optimal Transport, the problem can be cast as a linear optimization problem. When there is only one agent, we recover the Optimal Transport problem. When two agents are considered, we are able to recover Integral Probability Metrics defined by $α$-Hölder functions, which include the widely-known Dudley metric. To the best of our knowledge, this is the first time a link is given between the Dudley metric and Optimal Transport. We provide an entropic regularization of that problem which leads to an alternative algorithm faster than the standard linear program.

LGApr 10, 2020
Estimating Individual Treatment Effects through Causal Populations Identification

Céline Beji, Michaël Bon, Florian Yger et al.

Estimating the Individual Treatment Effect from observational data, defined as the difference between outcomes with and without treatment or intervention, while observing just one of both, is a challenging problems in causal learning. In this paper, we formulate this problem as an inference from hidden variables and enforce causal constraints based on a model of four exclusive causal populations. We propose a new version of the EM algorithm, coined as Expected-Causality-Maximization (ECM) algorithm and provide hints on its convergence under mild conditions. We compare our algorithm to baseline methods on synthetic and real-world data and discuss its performances.

LGFeb 26, 2020
Randomization matters. How to defend against strong adversarial attacks

Rafael Pinot, Raphael Ettedgui, Geovani Rizk et al.

Is there a classifier that ensures optimal robustness against all adversarial attacks? This paper answers this question by adopting a game-theoretic point of view. We show that adversarial attacks and defenses form an infinite zero-sum game where classical results (e.g. Sion theorem) do not apply. We demonstrate the non-existence of a Nash equilibrium in our game when the classifier and the Adversary are both deterministic, hence giving a negative answer to the above question in the deterministic regime. Nonetheless, the question remains open in the randomized regime. We tackle this problem by showing that, undermild conditions on the dataset distribution, any deterministic classifier can be outperformed by a randomized one. This gives arguments for using randomization, and leads us to a new algorithm for building randomized classifiers that are robust to strong adversarial attacks. Empirical results validate our theoretical analysis, and show that our defense method considerably outperforms Adversarial Training against state-of-the-art attacks.

LGOct 5, 2019
Yet another but more efficient black-box adversarial attack: tiling and evolution strategies

Laurent Meunier, Jamal Atif, Olivier Teytaud

We introduce a new black-box attack achieving state of the art performances. Our approach is based on a new objective function, borrowing ideas from $\ell_\infty$-white box attacks, and particularly designed to fit derivative-free optimization requirements. It only requires to have access to the logits of the classifier without any other information which is a more realistic scenario. Not only we introduce a new objective function, we extend previous works on black box adversarial attacks to a larger spectrum of evolution strategies and other derivative-free optimization methods. We also highlight a new intriguing property that deep neural networks are not robust to single shot tiled attacks. Our models achieve, with a budget limited to $10,000$ queries, results up to $99.2\%$ of success rate against InceptionV3 classifier with $630$ queries to the network on average in the untargeted attacks setting, which is an improvement by $90$ queries of the current state of the art. In the targeted setting, we are able to reach, with a limited budget of $100,000$, $100\%$ of success rate with a budget of $6,662$ queries on average, i.e. we need $800$ queries less than the current state of the art.

LGJun 19, 2019
A unified view on differential privacy and robustness to adversarial examples

Rafael Pinot, Florian Yger, Cédric Gouy-Pailler et al.

This short note highlights some links between two lines of research within the emerging topic of trustworthy machine learning: differential privacy and robustness to adversarial examples. By abstracting the definitions of both notions, we show that they build upon the same theoretical ground and hence results obtained so far in one domain can be transferred to the other. More precisely, our analysis is based on two key elements: probabilistic mappings (also called randomized algorithms in the differential privacy community), and the Renyi divergence which subsumes a large family of divergences. We first generalize the definition of robustness against adversarial examples to encompass probabilistic mappings. Then we observe that Renyi-differential privacy (a generalization of differential privacy recently proposed in~\cite{Mironov2017RenyiDP}) and our definition of robustness share several similarities. We finally discuss how can both communities benefit from this connection to transfer technical tools from one research field to the other.

LGMay 14, 2019
NGO-GM: Natural Gradient Optimization for Graphical Models

Eric Benhamou, Jamal Atif, Rida Laraki et al.

This paper deals with estimating model parameters in graphical models. We reformulate it as an information geometric optimization problem and introduce a natural gradient descent strategy that incorporates additional meta parameters. We show that our approach is a strong alternative to the celebrated EM approach for learning in graphical models. Actually, our natural gradient based strategy leads to learning optimal parameters for the final objective function without artificially trying to fit a distribution that may not correspond to the real one. We support our theoretical findings with the question of trend detection in financial markets and show that the learned model performs better than traditional practitioner methods and is less prone to overfitting.

LGFeb 4, 2019
Theoretical evidence for adversarial robustness through randomization

Rafael Pinot, Laurent Meunier, Alexandre Araujo et al.

This paper investigates the theory of robustness against adversarial attacks. It focuses on the family of randomization techniques that consist in injecting noise in the network at inference time. These techniques have proven effective in many contexts, but lack theoretical arguments. We close this gap by presenting a theoretical analysis of these approaches, hence explaining why they perform well in practice. More precisely, we make two new contributions. The first one relates the randomization rate to robustness to adversarial attacks. This result applies for the general family of exponential distributions, and thus extends and unifies the previous approaches. The second contribution consists in devising a new upper bound on the adversarial generalization gap of randomized neural networks. We support our theoretical claims with a set of experiments.

LGJan 29, 2019
Understanding and Training Deep Diagonal Circulant Neural Networks

Alexandre Araujo, Benjamin Negrevergne, Yann Chevaleyre et al.

In this paper, we study deep diagonal circulant neural networks, that is deep neural networks in which weight matrices are the product of diagonal and circulant ones. Besides making a theoretical analysis of their expressivity, we introduced principled techniques for training these models: we devise an initialization scheme and proposed a smart use of non-linearity functions in order to train deep diagonal circulant networks. Furthermore, we show that these networks outperform recently introduced deep networks with other types of structured layers. We conduct a thorough experimental study to compare the performance of deep diagonal circulant networks with state of the art models based on structured matrices and with dense models. We show that our models achieve better accuracy than other structured approaches while required 2x fewer weights as the next best approach. Finally we train deep diagonal circulant networks to build a compact and accurate models on a real world video classification dataset with over 3.8 million training examples.

LGDec 27, 2018
A discrete version of CMA-ES

Eric Benhamou, Jamal Atif, Rida Laraki

Modern machine learning uses more and more advanced optimization techniques to find optimal hyper parameters. Whenever the objective function is non-convex, non continuous and with potentially multiple local minima, standard gradient descent optimization methods fail. A last resource and very different method is to assume that the optimum(s), not necessarily unique, is/are distributed according to a distribution and iteratively to adapt the distribution according to tested points. These strategies originated in the early 1960s, named Evolution Strategy (ES) have culminated with the CMA-ES (Covariance Matrix Adaptation) ES. It relies on a multi variate normal distribution and is supposed to be state of the art for general optimization program. However, it is far from being optimal for discrete variables. In this paper, we extend the method to multivariate binomial correlated distributions. For such a distribution, we show that it shares similar features to the multi variate normal: independence and correlation is equivalent and correlation is efficiently modeled by interaction between different variables. We discuss this distribution in the framework of the exponential family. We prove that the model can estimate not only pairwise interactions among the two variables but also is capable of modeling higher order interactions. This allows creating a version of CMA ES that can accommodate efficiently discrete variables. We provide the corresponding algorithm and conclude.

CVOct 2, 2018
Training compact deep learning models for video classification using circulant matrices

Alexandre Araujo, Benjamin Negrevergne, Yann Chevaleyre et al.

In real world scenarios, model accuracy is hardly the only factor to consider. Large models consume more memory and are computationally more intensive, which makes them difficult to train and to deploy, especially on mobile devices. In this paper, we build on recent results at the crossroads of Linear Algebra and Deep Learning which demonstrate how imposing a structure on large weight matrices can be used to reduce the size of the model. We propose very compact models for video classification based on state-of-the-art network architectures such as Deep Bag-of-Frames, NetVLAD and NetFisherVectors. We then conduct thorough experiments using the large YouTube-8M video classification dataset. As we will show, the circulant DBoF embedding achieves an excellent trade-off between size and accuracy.

MLMar 14, 2018
Uplift Modeling from Separate Labels

Ikko Yamane, Florian Yger, Jamal Atif et al.

Uplift modeling is aimed at estimating the incremental impact of an action on an individual's behavior, which is useful in various application domains such as targeted marketing (advertisement campaigns) and personalized medicine (medical treatments). Conventional methods of uplift modeling require every instance to be jointly equipped with two types of labels: the taken action and its outcome. However, obtaining two labels for each instance at the same time is difficult or expensive in many real-world problems. In this paper, we propose a novel method of uplift modeling that is applicable to a more practical setting where only one type of labels is available for each instance. We show a mean squared error bound for the proposed estimator and demonstrate its effectiveness through experiments.

DSMar 10, 2018
Graph-based Clustering under Differential Privacy

Rafael Pinot, Anne Morvan, Florian Yger et al.

In this paper, we present the first differentially private clustering method for arbitrary-shaped node clusters in a graph. This algorithm takes as input only an approximate Minimum Spanning Tree (MST) $\mathcal{T}$ released under weight differential privacy constraints from the graph. Then, the underlying nonconvex clustering partition is successfully recovered from cutting optimal cuts on $\mathcal{T}$. As opposed to existing methods, our algorithm is theoretically well-motivated. Experiments support our theoretical findings.

AIMar 5, 2018
Explanatory relations in arbitrary logics based on satisfaction systems, cutting and retraction

Marc Aiguier, Jamal Atif, Isabelle Bloch et al.

The aim of this paper is to introduce a new framework for defining abductive reasoning operators based on a notion of retraction in arbitrary logics defined as satisfaction systems. We show how this framework leads to the design of explanatory relations satisfying properties of abductive reasoning, and discuss its application to several logics. This extends previous work on propositional logics where retraction was defined as a morphological erosion. Here weaker properties are required for retraction, leading to a larger set of suitable operators for abduction for different logics.

LGFeb 12, 2018
On the Needs for Rotations in Hypercubic Quantization Hashing

Anne Morvan, Antoine Souloumiac, Krzysztof Choromanski et al.

The aim of this paper is to endow the well-known family of hypercubic quantization hashing methods with theoretical guarantees. In hypercubic quantization, applying a suitable (random or learned) rotation after dimensionality reduction has been experimentally shown to improve the results accuracy in the nearest neighbors search problem. We prove in this paper that the use of these rotations is optimal under some mild assumptions: getting optimal binary sketches is equivalent to applying a rotation uniformizing the diagonal of the covariance matrix between data points. Moreover, for two closed points, the probability to have dissimilar binary sketches is upper bounded by a factor of the initial distance between the data points. Relaxing these assumptions, we obtain a general concentration result for random matrices. We also provide some experiments illustrating these theoretical points and compare a set of algorithms in both the batch and online settings.

LGMay 22, 2017
Streaming Binary Sketching based on Subspace Tracking and Diagonal Uniformization

Anne Morvan, Antoine Souloumiac, Cédric Gouy-Pailler et al.

In this paper, we address the problem of learning compact similarity-preserving embeddings for massive high-dimensional streams of data in order to perform efficient similarity search. We present a new online method for computing binary compressed representations -sketches- of high-dimensional real feature vectors. Given an expected code length $c$ and high-dimensional input data points, our algorithm provides a $c$-bits binary code for preserving the distance between the points from the original high-dimensional space. Our algorithm does not require neither the storage of the whole dataset nor a chunk, thus it is fully adaptable to the streaming setting. It also provides low time complexity and convergence guarantees. We demonstrate the quality of our binary sketches through experiments on real data for the nearest neighbors search task in the online setting.

LGMar 7, 2017
Graph sketching-based Space-efficient Data Clustering

Anne Morvan, Krzysztof Choromanski, Cédric Gouy-Pailler et al.

In this paper, we address the problem of recovering arbitrary-shaped data clusters from datasets while facing \emph{high space constraints}, as this is for instance the case in many real-world applications when analysis algorithms are directly deployed on resources-limited mobile devices collecting the data. We present DBMSTClu a new space-efficient density-based \emph{non-parametric} method working on a Minimum Spanning Tree (MST) recovered from a limited number of linear measurements i.e. a \emph{sketched} version of the dissimilarity graph $\mathcal{G}$ between the $N$ objects to cluster. Unlike $k$-means, $k$-medians or $k$-medoids algorithms, it does not fail at distinguishing clusters with particular forms thanks to the property of the MST for expressing the underlying structure of a graph. No input parameter is needed contrarily to DBSCAN or the Spectral Clustering method. An approximate MST is retrieved by following the dynamic \emph{semi-streaming} model in handling the dissimilarity graph $\mathcal{G}$ as a stream of edge weight updates which is sketched in one pass over the data into a compact structure requiring $O(N \operatorname{polylog}(N))$ space, far better than the theoretical memory cost $O(N^2)$ of $\mathcal{G}$. The recovered approximate MST $\mathcal{T}$ as input, DBMSTClu then successfully detects the right number of nonconvex clusters by performing relevant cuts on $\mathcal{T}$ in a time linear in $N$. We provide theoretical guarantees on the quality of the clustering partition and also demonstrate its advantage over the existing state-of-the-art on several datasets.

LGOct 19, 2016
Structured adaptive and random spinners for fast machine learning computations

Mariusz Bojarski, Anna Choromanska, Krzysztof Choromanski et al.

We consider an efficient computational framework for speeding up several machine learning algorithms with almost no loss of accuracy. The proposed framework relies on projections via structured matrices that we call Structured Spinners, which are formed as products of three structured matrix-blocks that incorporate rotations. The approach is highly generic, i.e. i) structured matrices under consideration can either be fully-randomized or learned, ii) our structured family contains as special cases all previously considered structured schemes, iii) the setting extends to the non-linear case where the projections are followed by non-linear functions, and iv) the method finds numerous applications including kernel approximations via random feature maps, dimensionality reduction algorithms, new fast cross-polytope LSH techniques, deep learning, convex optimization algorithms via Newton sketches, quantization with random projection trees, and more. The proposed framework comes with theoretical guarantees characterizing the capacity of the structured model in reference to its unstructured counterpart and is based on a general theoretical principle that we describe in the paper. As a consequence of our theoretical analysis, we provide the first theoretical guarantees for one of the most efficient existing LSH algorithms based on the HD3HD2HD1 structured matrix [Andoni et al., 2015]. The exhaustive experimental evaluation confirms the accuracy and efficiency of structured spinners for a variety of different applications.

DSSep 29, 2016
Multi-dimensional signal approximation with sparse structured priors using split Bregman iterations

Yoann Isaac, Quentin Barthélemy, Cédric Gouy-Pailler et al.

This paper addresses the structurally-constrained sparse decomposition of multi-dimensional signals onto overcomplete families of vectors, called dictionaries. The contribution of the paper is threefold. Firstly, a generic spatio-temporal regularization term is designed and used together with the standard $\ell_1$ regularization term to enforce a sparse decomposition preserving the spatio-temporal structure of the signal. Secondly, an optimization algorithm based on the split Bregman approach is proposed to handle the associated optimization problem, and its convergence is analyzed. Our well-founded approach yields same accuracy as the other algorithms at the state-of-the-art, with significant gains in terms of convergence speed. Thirdly, the empirical validation of the approach on artificial and real-world problems demonstrates the generality and effectiveness of the method. On artificial problems, the proposed regularization subsumes the Total Variation minimization and recovers the expected decomposition. On the real-world problem of electro-encephalography brainwave decomposition, the approach outperforms similar approaches in terms of P300 evoked potentials detection, using structured spatial priors to guide the decomposition.

LGMay 29, 2016
TripleSpin - a generic compact paradigm for fast machine learning computations

Krzysztof Choromanski, Francois Fagan, Cedric Gouy-Pailler et al.

We present a generic compact computational framework relying on structured random matrices that can be applied to speed up several machine learning algorithms with almost no loss of accuracy. The applications include new fast LSH-based algorithms, efficient kernel computations via random feature maps, convex optimization algorithms, quantization techniques and many more. Certain models of the presented paradigm are even more compressible since they apply only bit matrices. This makes them suitable for deploying on mobile devices. All our findings come with strong theoretical guarantees. In particular, as a byproduct of the presented techniques and by using relatively new Berry-Esseen-type CLT for random vectors, we give the first theoretical guarantees for one of the most efficient existing LSH algorithms based on the $\textbf{HD}_{3}\textbf{HD}_{2}\textbf{HD}_{1}$ structured matrix ("Practical and Optimal LSH for Angular Distance"). These guarantees as well as theoretical results for other aforementioned applications follow from the same general theoretical principle that we present in the paper. Our structured family contains as special cases all previously considered structured schemes, including the recently introduced $P$-model. Experimental evaluation confirms the accuracy and efficiency of TripleSpin matrices.

AIFeb 26, 2015
Relaxation-based revision operators in description logics

Marc Aiguier, Jamal Atif, Isabelle Bloch et al.

As ontologies and description logics (DLs) reach out to a broader audience, several reasoning services are developed in this context. Belief revision is one of them, of prime importance when knowledge is prone to change and inconsistency. In this paper we address both the generalization of the well-known AGM postulates, and the definition of concrete and well-founded revision operators in different DL families. We introduce a model-theoretic version of the AGM postulates with a general definition of inconsistency, hence enlarging their scope to a wide family of non-classical logics, in particular negation-free DL families. We propose a general framework for defining revision operators based on the notion of relaxation, introduced recently for defining dissimilarity measures between DL concepts. A revision operator in this framework amounts to relax the set of models of the old belief until it reaches the sets of models of the new piece of knowledge. We demonstrate that such a relaxation-based revision operator defines a faithful assignment and satisfies the generalized AGM postulates. Another important contribution concerns the definition of several concrete relaxation operators suited to the syntax of some DLs (ALC and its fragments EL and ELU).

AIFeb 8, 2015
Belief Revision, Minimal Change and Relaxation: A General Framework based on Satisfaction Systems, and Applications to Description Logics

Marc Aiguier, Jamal Atif, Isabelle Bloch et al.

Belief revision of knowledge bases represented by a set of sentences in a given logic has been extensively studied but for specific logics, mainly propositional, and also recently Horn and description logics. Here, we propose to generalize this operation from a model-theoretic point of view, by defining revision in an abstract model theory known under the name of satisfaction systems. In this framework, we generalize to any satisfaction systems the characterization of the well known AGM postulates given by Katsuno and Mendelzon for propositional logic in terms of minimal change among interpretations. Moreover, we study how to define revision, satisfying the AGM postulates, from relaxation notions that have been first introduced in description logics to define dissimilarity measures between concepts, and the consequence of which is to relax the set of models of the old belief until it becomes consistent with the new pieces of knowledge. We show how the proposed general framework can be instantiated in different logics such as propositional, first-order, description and Horn logics. In particular for description logics, we introduce several concrete relaxation operators tailored for the description logic $\ALC{}$ and its fragments $\EL{}$ and $\ELext{}$, discuss their properties and provide some illustrative examples.

DSMar 21, 2013
Multi-dimensional sparse structured signal approximation using split Bregman iterations

Yoann Isaac, Quentin Barthélemy, Jamal Atif et al.

The paper focuses on the sparse approximation of signals using overcomplete representations, such that it preserves the (prior) structure of multi-dimensional signals. The underlying optimization problem is tackled using a multi-dimensional split Bregman optimization approach. An extensive empirical evaluation shows how the proposed approach compares to the state of the art depending on the signal features.

LGFeb 18, 2013
Metrics for Multivariate Dictionaries

Sylvain Chevallier, Quentin Barthélemy, Jamal Atif

Overcomplete representations and dictionary learning algorithms kept attracting a growing interest in the machine learning community. This paper addresses the emerging problem of comparing multivariate overcomplete representations. Despite a recurrent need to rely on a distance for learning or assessing multivariate overcomplete representations, no metrics in their underlying spaces have yet been proposed. Henceforth we propose to study overcomplete representations from the perspective of frame theory and matrix manifolds. We consider distances between multivariate dictionaries as distances between their spans which reveal to be elements of a Grassmannian manifold. We introduce Wasserstein-like set-metrics defined on Grassmannian spaces and study their properties both theoretically and numerically. Indeed a deep experimental study based on tailored synthetic datasetsand real EEG signals for Brain-Computer Interfaces (BCI) have been conducted. In particular, the introduced metrics have been embedded in clustering algorithm and applied to BCI Competition IV-2a for dataset quality assessment. Besides, a principled connection is made between three close but still disjoint research fields, namely, Grassmannian packing, dictionary learning and compressed sensing.