IRApr 4, 2023Code
AToMiC: An Image/Text Retrieval Test Collection to Support Multimedia Content CreationJheng-Hong Yang, Carlos Lassance, Rafael Sampaio de Rezende et al. · apple-ml, cmu
This paper presents the AToMiC (Authoring Tools for Multimedia Content) dataset, designed to advance research in image/text cross-modal retrieval. While vision-language pretrained transformers have led to significant improvements in retrieval effectiveness, existing research has relied on image-caption datasets that feature only simplistic image-text relationships and underspecified user models of retrieval tasks. To address the gap between these oversimplified settings and real-world applications for multimedia content creation, we introduce a new approach for building retrieval test collections. We leverage hierarchical structures and diverse domains of texts, styles, and types of images, as well as large-scale image-document associations embedded in Wikipedia. We formulate two tasks based on a realistic user model and validate our dataset through retrieval experiments using baseline models. AToMiC offers a testbed for scalable, diverse, and reproducible multimedia retrieval research. Finally, the dataset provides the basis for a dedicated track at the 2023 Text Retrieval Conference (TREC), and is publicly available at https://github.com/TREC-AToMiC/AToMiC.
IRMay 10, 2022
From Distillation to Hard Negative Sampling: Making Sparse Neural IR Models More EffectiveThibault Formal, Carlos Lassance, Benjamin Piwowarski et al.
Neural retrievers based on dense representations combined with Approximate Nearest Neighbors search have recently received a lot of attention, owing their success to distillation and/or better sampling of examples for training -- while still relying on the same backbone architecture. In the meantime, sparse representation learning fueled by traditional inverted indexing techniques has seen a growing interest, inheriting from desirable IR priors such as explicit lexical matching. While some architectural variants have been proposed, a lesser effort has been put in the training of such models. In this work, we build on SPLADE -- a sparse expansion-based retriever -- and show to which extent it is able to benefit from the same training improvements as dense models, by studying the effect of distillation, hard-negative mining as well as the Pre-trained Language Model initialization. We furthermore study the link between effectiveness and efficiency, on in-domain and zero-shot settings, leading to state-of-the-art results in both scenarios for sufficiently expressive models.
IRJun 13, 2023
Resources for Brewing BEIR: Reproducible Reference Models and an Official LeaderboardEhsan Kamalloo, Nandan Thakur, Carlos Lassance et al.
BEIR is a benchmark dataset for zero-shot evaluation of information retrieval models across 18 different domain/task combinations. In recent years, we have witnessed the growing popularity of a representation learning approach to building retrieval models, typically using pretrained transformers in a supervised setting. This naturally begs the question: How effective are these models when presented with queries and documents that differ from the training data? Examples include searching in different domains (e.g., medical or legal text) and with different types of queries (e.g., keywords vs. well-formed questions). While BEIR was designed to answer these questions, our work addresses two shortcomings that prevent the benchmark from achieving its full potential: First, the sophistication of modern neural methods and the complexity of current software infrastructure create barriers to entry for newcomers. To this end, we provide reproducible reference implementations that cover the two main classes of approaches: learned dense and sparse models. Second, there does not exist a single authoritative nexus for reporting the effectiveness of different models on BEIR, which has led to difficulty in comparing different methods. To remedy this, we present an official self-service BEIR leaderboard that provides fair and consistent comparisons of retrieval models. By addressing both shortcomings, our work facilitates future explorations in a range of interesting research questions that BEIR enables.
IRApr 3, 2023
Simple Yet Effective Neural Ranking and Reranking Baselines for Cross-Lingual Information RetrievalJimmy Lin, David Alfonso-Hermelo, Vitor Jeronymo et al.
The advent of multilingual language models has generated a resurgence of interest in cross-lingual information retrieval (CLIR), which is the task of searching documents in one language with queries from another. However, the rapid pace of progress has led to a confusing panoply of methods and reproducibility has lagged behind the state of the art. In this context, our work makes two important contributions: First, we provide a conceptual framework for organizing different approaches to cross-lingual retrieval using multi-stage architectures for mono-lingual retrieval as a scaffold. Second, we implement simple yet effective reproducible baselines in the Anserini and Pyserini IR toolkits for test collections from the TREC 2022 NeuCLIR Track, in Persian, Russian, and Chinese. Our efforts are built on a collaboration of the two teams that submitted the most effective runs to the TREC evaluation. These contributions provide a firm foundation for future advances.
IRJul 8, 2022
An Efficiency Study for SPLADE ModelsCarlos Lassance, Stéphane Clinchant
Latency and efficiency issues are often overlooked when evaluating IR models based on Pretrained Language Models (PLMs) in reason of multiple hardware and software testing scenarios. Nevertheless, efficiency is an important part of such systems and should not be overlooked. In this paper, we focus on improving the efficiency of the SPLADE model since it has achieved state-of-the-art zero-shot performance and competitive results on TREC collections. SPLADE efficiency can be controlled via a regularization factor, but solely controlling this regularization has been shown to not be efficient enough. In order to reduce the latency gap between SPLADE and traditional retrieval systems, we propose several techniques including L1 regularization for queries, a separation of document/query encoders, a FLOPS-regularized middle-training, and the use of faster query encoders. Our benchmark demonstrates that we can drastically improve the efficiency of these models while increasing the performance metrics on in-domain data. To our knowledge, {we propose the first neural models that, under the same computing constraints, \textit{achieve similar latency (less than 4ms difference) as traditional BM25}, while having \textit{similar performance (less than 10\% MRR@10 reduction)} as the state-of-the-art single-stage neural rankers on in-domain data}.
IRJan 25, 2023
An Experimental Study on Pretraining Transformers from Scratch for IRCarlos Lassance, Hervé Déjean, Stéphane Clinchant
Finetuning Pretrained Language Models (PLM) for IR has been de facto the standard practice since their breakthrough effectiveness few years ago. But, is this approach well understood? In this paper, we study the impact of the pretraining collection on the final IR effectiveness. In particular, we challenge the current hypothesis that PLM shall be trained on a large enough generic collection and we show that pretraining from scratch on the collection of interest is surprisingly competitive with the current approach. We benchmark first-stage ranking rankers and cross-encoders for reranking on the task of general passage retrieval on MSMARCO, Mr-Tydi for Arabic, Japanese and Russian, and TripClick for specific domain. Contrary to popular belief, we show that, for finetuning first-stage rankers, models pretrained solely on their collection have equivalent or better effectiveness compared to more general models. However, there is a slight effectiveness drop for rerankers pretrained only on the target collection. Overall, our study sheds a new light on the role of the pretraining collection and should make our community ponder on building specialized models by pretraining from scratch. Last but not least, doing so could enable better control of efficiency, data bias and replicability, which are key research questions for the IR community.
IRApr 14, 2022
Composite Code Sparse Autoencoders for first stage retrievalCarlos Lassance, Thibault Formal, Stephane Clinchant
We propose a Composite Code Sparse Autoencoder (CCSA) approach for Approximate Nearest Neighbor (ANN) search of document representations based on Siamese-BERT models. In Information Retrieval (IR), the ranking pipeline is generally decomposed in two stages: the first stage focus on retrieving a candidate set from the whole collection. The second stage re-ranks the candidate set by relying on more complex models. Recently, Siamese-BERT models have been used as first stage ranker to replace or complement the traditional bag-of-word models. However, indexing and searching a large document collection require efficient similarity search on dense vectors and this is why ANN techniques come into play. Since composite codes are naturally sparse, we first show how CCSA can learn efficient parallel inverted index thanks to an uniformity regularizer. Second, CCSA can be used as a binary quantization method and we propose to combine it with the recent graph based ANN techniques. Our experiments on MSMARCO dataset reveal that CCSA outperforms IVF with product quantization. Furthermore, CCSA binary quantization is beneficial for the index size, and memory usage for the graph-based HNSW method, while maintaining a good level of recall and MRR. Third, we compare with recent supervised quantization methods for image retrieval and find that CCSA is able to outperform them.
IRMar 23, 2023
Parameter-Efficient Sparse Retrievers and Rerankers using AdaptersVaishali Pal, Carlos Lassance, Hervé Déjean et al.
Parameter-Efficient transfer learning with Adapters have been studied in Natural Language Processing (NLP) as an alternative to full fine-tuning. Adapters are memory-efficient and scale well with downstream tasks by training small bottle-neck layers added between transformer layers while keeping the large pretrained language model (PLMs) frozen. In spite of showing promising results in NLP, these methods are under-explored in Information Retrieval. While previous studies have only experimented with dense retriever or in a cross lingual retrieval scenario, in this paper we aim to complete the picture on the use of adapters in IR. First, we study adapters for SPLADE, a sparse retriever, for which adapters not only retain the efficiency and effectiveness otherwise achieved by finetuning, but are memory-efficient and orders of magnitude lighter to train. We observe that Adapters-SPLADE not only optimizes just 2\% of training parameters, but outperforms fully fine-tuned counterpart and existing parameter-efficient dense IR models on IR benchmark datasets. Secondly, we address domain adaptation of neural retrieval thanks to adapters on cross-domain BEIR datasets and TripClick. Finally, we also consider knowledge sharing between rerankers and first stage rankers. Overall, our study complete the examination of adapters for neural IR
IRMar 11, 2024
SPLADE-v3: New baselines for SPLADECarlos Lassance, Hervé Déjean, Thibault Formal et al.
A companion to the release of the latest version of the SPLADE library. We describe changes to the training structure and present our latest series of models -- SPLADE-v3. We compare this new version to BM25, SPLADE++, as well as re-rankers, and showcase its effectiveness via a meta-analysis over more than 40 query sets. SPLADE-v3 further pushes the limit of SPLADE models: it is statistically significantly more effective than both BM25 and SPLADE++, while comparing well to cross-encoder re-rankers. Specifically, it gets more than 40 MRR@10 on the MS MARCO dev set, and improves by 2% the out-of-domain results on the BEIR benchmark.
IRDec 13, 2021
A Study on Token Pruning for ColBERTCarlos Lassance, Maroua Maachou, Joohee Park et al.
The ColBERT model has recently been proposed as an effective BERT based ranker. By adopting a late interaction mechanism, a major advantage of ColBERT is that document representations can be precomputed in advance. However, the big downside of the model is the index size, which scales linearly with the number of tokens in the collection. In this paper, we study various designs for ColBERT models in order to attack this problem. While compression techniques have been explored to reduce the index size, in this paper we study token pruning techniques for ColBERT. We compare simple heuristics, as well as a single layer of attention mechanism to select the tokens to keep at indexing time. Our experiments show that ColBERT indexes can be pruned up to 30\% on the MS MARCO passage collection without a significant drop in performance. Finally, we experiment on MS MARCO documents, which reveal several challenges for such mechanism.
CVOct 18, 2021
TLDR: Twin Learning for Dimensionality ReductionYannis Kalantidis, Carlos Lassance, Jon Almazan et al.
Dimensionality reduction methods are unsupervised approaches which learn low-dimensional spaces where some properties of the initial space, typically the notion of "neighborhood", are preserved. Such methods usually require propagation on large k-NN graphs or complicated optimization solvers. On the other hand, self-supervised learning approaches, typically used to learn representations from scratch, rely on simple and more scalable frameworks for learning. In this paper, we propose TLDR, a dimensionality reduction method for generic input spaces that is porting the recent self-supervised learning framework of Zbontar et al. (2021) to the specific task of dimensionality reduction, over arbitrary representations. We propose to use nearest neighbors to build pairs from a training set and a redundancy reduction loss to learn an encoder that produces representations invariant across such pairs. TLDR is a method that is simple, easy to train, and of broad applicability; it consists of an offline nearest neighbor computation step that can be highly approximated, and a straightforward learning process. Aiming for scalability, we focus on improving linear dimensionality reduction, and show consistent gains on image and document retrieval tasks, e.g. gaining +4% mAP over PCA on ROxford for GeM- AP, improving the performance of DINO on ImageNet or retaining it with a 10x compression.
LGOct 8, 2021
Graphs as Tools to Improve Deep Learning MethodsCarlos Lassance, Myriam Bontonou, Mounia Hamidouche et al.
In recent years, deep neural networks (DNNs) have known an important rise in popularity. However, although they are state-of-the-art in many machine learning challenges, they still suffer from several limitations. For example, DNNs require a lot of training data, which might not be available in some practical applications. In addition, when small perturbations are added to the inputs, DNNs are prone to misclassification errors. DNNs are also viewed as black-boxes and as such their decisions are often criticized for their lack of interpretability. In this chapter, we review recent works that aim at using graphs as tools to improve deep learning methods. These graphs are defined considering a specific layer in a deep learning architecture. Their vertices represent distinct samples, and their edges depend on the similarity of the corresponding intermediate representations. These graphs can then be leveraged using various methodologies, many of which built on top of graph signal processing. This chapter is composed of four main parts: tools for visualizing intermediate layers in a DNN, denoising data representations, optimizing graph objective functions and regularizing the learning process.
IRSep 21, 2021
SPLADE v2: Sparse Lexical and Expansion Model for Information RetrievalThibault Formal, Carlos Lassance, Benjamin Piwowarski et al.
In neural Information Retrieval (IR), ongoing research is directed towards improving the first retriever in ranking pipelines. Learning dense embeddings to conduct retrieval using efficient approximate nearest neighbors methods has proven to work well. Meanwhile, there has been a growing interest in learning \emph{sparse} representations for documents and queries, that could inherit from the desirable properties of bag-of-words models such as the exact matching of terms and the efficiency of inverted indexes. Introduced recently, the SPLADE model provides highly sparse representations and competitive results with respect to state-of-the-art dense and sparse approaches. In this paper, we build on SPLADE and propose several significant improvements in terms of effectiveness and/or efficiency. More specifically, we modify the pooling mechanism, benchmark a model solely based on document expansion, and introduce models trained with distillation. We also report results on the BEIR benchmark. Overall, SPLADE is considerably improved with more than $9$\% gains on NDCG@10 on TREC DL 2019, leading to state-of-the-art results on the BEIR benchmark.
MLJan 12, 2021
Improving Classification Accuracy with Graph FilteringMounia Hamidouche, Carlos Lassance, Yuqing Hu et al.
In machine learning, classifiers are typically susceptible to noise in the training data. In this work, we aim at reducing intra-class noise with the help of graph filtering to improve the classification performance. Considered graphs are obtained by connecting samples of the training set that belong to a same class depending on the similarity of their representation in a latent space. We show that the proposed graph filtering methodology has the effect of asymptotically reducing intra-class variance, while maintaining the mean. While our approach applies to all classification problems in general, it is particularly useful in few-shot settings, where intra-class noise can have a huge impact due to the small sample selection. Using standardized benchmarks in the field of vision, we empirically demonstrate the ability of the proposed method to slightly improve state-of-the-art results in both cases of few-shot and standard classification.
LGDec 14, 2020
Graphs for deep learning representationsCarlos Lassance
In recent years, Deep Learning methods have achieved state of the art performance in a vast range of machine learning tasks, including image classification and multilingual automatic text translation. These architectures are trained to solve machine learning tasks in an end-to-end fashion. In order to reach top-tier performance, these architectures often require a very large number of trainable parameters. There are multiple undesirable consequences, and in order to tackle these issues, it is desired to be able to open the black boxes of deep learning architectures. Problematically, doing so is difficult due to the high dimensionality of representations and the stochasticity of the training process. In this thesis, we investigate these architectures by introducing a graph formalism based on the recent advances in Graph Signal Processing (GSP). Namely, we use graphs to represent the latent spaces of deep neural networks. We showcase that this graph formalism allows us to answer various questions including: ensuring generalization abilities, reducing the amount of arbitrary choices in the design of the learning process, improving robustness to small perturbations added to the inputs, and reducing computational complexity
LGDec 2, 2020
DecisiveNets: Training Deep Associative Memories to Solve Complex Machine Learning ProblemsVincent Gripon, Carlos Lassance, Ghouthi Boukli Hacene
Learning deep representations to solve complex machine learning tasks has become the prominent trend in the past few years. Indeed, Deep Neural Networks are now the golden standard in domains as various as computer vision, natural language processing or even playing combinatorial games. However, problematic limitations are hidden behind this surprising universal capability. Among other things, explainability of the decisions is a major concern, especially since deep neural networks are made up of a very large number of trainable parameters. Moreover, computational complexity can quickly become a problem, especially in contexts constrained by real time or limited resources. Therefore, understanding how information is stored and the impact this storage can have on the system remains a major and open issue. In this chapter, we introduce a method to transform deep neural network models into deep associative memories, with simpler, more explicable and less expensive operations. We show through experiments that these transformations can be done without penalty on predictive performance. The resulting deep associative memories are excellent candidates for artificial intelligence that is easier to theorize and manipulate.
LGNov 25, 2020
Ranking Deep Learning Generalization using Label Variation in Latent Geometry GraphsCarlos Lassance, Louis Béthune, Myriam Bontonou et al.
Measuring the generalization performance of a Deep Neural Network (DNN) without relying on a validation set is a difficult task. In this work, we propose exploiting Latent Geometry Graphs (LGGs) to represent the latent spaces of trained DNN architectures. Such graphs are obtained by connecting samples that yield similar latent representations at a given layer of the considered DNN. We then obtain a generalization score by looking at how strongly connected are samples of distinct classes in LGGs. This score allowed us to rank 3rd on the NeurIPS 2020 Predicting Generalization in Deep Learning (PGDL) competition.
LGNov 14, 2020
Representing Deep Neural Networks Latent Space Geometries with GraphsCarlos Lassance, Vincent Gripon, Antonio Ortega
Deep Learning (DL) has attracted a lot of attention for its ability to reach state-of-the-art performance in many machine learning tasks. The core principle of DL methods consists in training composite architectures in an end-to-end fashion, where inputs are associated with outputs trained to optimize an objective function. Because of their compositional nature, DL architectures naturally exhibit several intermediate representations of the inputs, which belong to so-called latent spaces. When treated individually, these intermediate representations are most of the time unconstrained during the learning process, as it is unclear which properties should be favored. However, when processing a batch of inputs concurrently, the corresponding set of intermediate representations exhibit relations (what we call a geometry) on which desired properties can be sought. In this work, we show that it is possible to introduce constraints on these latent geometries to address various problems. In more details, we propose to represent geometries by constructing similarity graphs from the intermediate representations obtained when processing a batch of inputs. By constraining these Latent Geometry Graphs (LGGs), we address the three following problems: i) Reproducing the behavior of a teacher architecture is achieved by mimicking its geometry, ii) Designing efficient embeddings for classification is achieved by targeting specific geometries, and iii) Robustness to deviations on inputs is achieved via enforcing smooth variation of geometry between consecutive latent spaces. Using standard vision benchmarks, we demonstrate the ability of the proposed geometry-based methods in solving the considered problems.
LGJul 16, 2020
Graph topology inference benchmarks for machine learningCarlos Lassance, Vincent Gripon, Gonzalo Mateos
Graphs are nowadays ubiquitous in the fields of signal processing and machine learning. As a tool used to express relationships between objects, graphs can be deployed to various ends: I) clustering of vertices, II) semi-supervised classification of vertices, III) supervised classification of graph signals, and IV) denoising of graph signals. However, in many practical cases graphs are not explicitly available and must therefore be inferred from data. Validation is a challenging endeavor that naturally depends on the downstream task for which the graph is learnt. Accordingly, it has often been difficult to compare the efficacy of different algorithms. In this work, we introduce several ease-to-use and publicly released benchmarks specifically designed to reveal the relative merits and limitations of graph inference methods. We also contrast some of the most prominent techniques in the literature.
LGNov 8, 2019
Deep geometric knowledge distillation with graphsCarlos Lassance, Myriam Bontonou, Ghouthi Boukli Hacene et al.
In most cases deep learning architectures are trained disregarding the amount of operations and energy consumption. However, some applications, like embedded systems, can be resource-constrained during inference. A popular approach to reduce the size of a deep learning architecture consists in distilling knowledge from a bigger network (teacher) to a smaller one (student). Directly training the student to mimic the teacher representation can be effective, but it requires that both share the same latent space dimensions. In this work, we focus instead on relative knowledge distillation (RKD), which considers the geometry of the respective latent spaces, allowing for dimension-agnostic transfer of knowledge. Specifically we introduce a graph-based RKD method, in which graphs are used to capture the geometry of latent spaces. Using classical computer vision benchmarks, we demonstrate the ability of the proposed method to efficiently distillate knowledge from the teacher to the student, leading to better accuracy for the same budget as compared to existing RKD alternatives.
LGNov 7, 2019
Improved Visual Localization via Graph SmoothingCarlos Lassance, Yasir Latif, Ravi Garg et al.
Vision based localization is the problem of inferring the pose of the camera given a single image. One solution to this problem is to learn a deep neural network to infer the pose of a query image after learning on a dataset of images with known poses. Another more commonly used approach rely on image retrieval where the query image is compared against the database of images and its pose is inferred with the help of the retrieved images. The latter approach assumes that images taken from the same places consists of the same landmarks and, thus would have similar feature representations. These representation can be learned using full supervision to be robust to different variations in capture conditions like time of the day and weather. In this work, we introduce a framework to enhance the performance of these retrieval based localization methods by taking into account the additional information including GPS coordinates and temporal neighbourhood of the images provided by the acquisition process in addition to the descriptor similarity of pairs of images in the reference or query database which is used traditionally for localization. Our method constructs a graph based on this additional information and use it for robust retrieval by smoothing the feature representation of reference and/or query images. We show that the proposed method is able to significantly improve the localization accuracy on two large scale datasets over the baselines.
LGSep 11, 2019
Structural Robustness for Deep Learning ArchitecturesCarlos Lassance, Vincent Gripon, Jian Tang et al.
Deep Networks have been shown to provide state-of-the-art performance in many machine learning challenges. Unfortunately, they are susceptible to various types of noise, including adversarial attacks and corrupted inputs. In this work we introduce a formal definition of robustness which can be viewed as a localized Lipschitz constant of the network function, quantified in the domain of the data to be classified. We compare this notion of robustness to existing ones, and study its connections with methods in the literature. We evaluate this metric by performing experiments on various competitive vision datasets.
SPAug 19, 2019
Comparing linear structure-based and data-driven latent spatial representations for sequence predictionMyriam Bontonou, Carlos Lassance, Vincent Gripon et al.
Predicting the future of Graph-supported Time Series (GTS) is a key challenge in many domains, such as climate monitoring, finance or neuroimaging. Yet it is a highly difficult problem as it requires to account jointly for time and graph (spatial) dependencies. To simplify this process, it is common to use a two-step procedure in which spatial and time dependencies are dealt with separately. In this paper, we are interested in comparing various linear spatial representations, namely structure-based ones and data-driven ones, in terms of how they help predict the future of GTS. To that end, we perform experiments with various datasets including spontaneous brain activity and raw videos.
NEMay 29, 2019
Attention Based Pruning for Shift NetworksGhouthi Boukli Hacene, Carlos Lassance, Vincent Gripon et al.
In many application domains such as computer vision, Convolutional Layers (CLs) are key to the accuracy of deep learning methods. However, it is often required to assemble a large number of CLs, each containing thousands of parameters, in order to reach state-of-the-art accuracy, thus resulting in complex and demanding systems that are poorly fitted to resource-limited devices. Recently, methods have been proposed to replace the generic convolution operator by the combination of a shift operation and a simpler 1x1 convolution. The resulting block, called Shift Layer (SL), is an efficient alternative to CLs in the sense it allows to reach similar accuracies on various tasks with faster computations and fewer parameters. In this contribution, we introduce Shift Attention Layers (SALs), which extend SLs by using an attention mechanism that learns which shifts are the best at the same time the network function is trained. We demonstrate SALs are able to outperform vanilla SLs (and CLs) on various object recognition benchmarks while significantly reducing the number of float operations and parameters for the inference.
LGMay 1, 2019
A Unified Deep Learning Formalism For Processing Graph SignalsMyriam Bontonou, Carlos Lassance, Jean-Charles Vialatte et al.
Convolutional Neural Networks are very efficient at processing signals defined on a discrete Euclidean space (such as images). However, as they can not be used on signals defined on an arbitrary graph, other models have emerged, aiming to extend its properties. We propose to review some of the major deep learning models designed to exploit the underlying graph structure of signals. We express them in a unified formalism, giving them a new and comparative reading.
LGMay 1, 2019
Introducing Graph Smoothness Loss for Training Deep Learning ArchitecturesMyriam Bontonou, Carlos Lassance, Ghouthi Boukli Hacene et al.
We introduce a novel loss function for training deep learning architectures to perform classification. It consists in minimizing the smoothness of label signals on similarity graphs built at the output of the architecture. Equivalently, it can be seen as maximizing the distances between the network function images of training inputs from distinct classes. As such, only distances between pairs of examples in distinct classes are taken into account in the process, and the training does not prevent inputs from the same class to be mapped to distant locations in the output domain. We show that this loss leads to similar performance in classification as architectures trained using the classical cross-entropy, while offering interesting degrees of freedom and properties. We also demonstrate the interest of the proposed loss to increase robustness of trained architectures to deviations of the inputs.