NA Feb 8, 2019
Support and Approximation Properties of Hermite Splines
Julien Fageot, Shayan Aziznejad, Michael Unser et al.
In this paper, we formally investigate two mathematical aspects of Hermite splines which translate to features that are relevant to their practical applications. We first demonstrate that Hermite splines are maximally localized in the sense that their support sizes are minimal among pairs of functions with identical reproduction properties. Then, we precisely quantify the approximation power of Hermite splines for reconstructing functions and their derivatives, and show that they are asymptotically identical to cubic B-splines for these tasks. Hermite splines therefore combine optimal localization and excellent approximation power, while retaining interpolation properties and closed-form expression, in contrast to existing similar approaches. These findings shed new light on the convenience of Hermite splines for use in computer graphics and geometrical design.
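As an illustration of the localization and interpolation properties mentioned above, here is a minimal sketch of cubic Hermite interpolation on the integer grid. It assumes the standard pair of cubic Hermite generators supported on [-1, 1]; the names `phi1`, `phi2`, and `hermite_interpolate` are hypothetical and are not taken from the paper.

```python
import math

def phi1(x):
    """Generator tied to function samples: phi1(0) = 1, phi1'(0) = 0, support [-1, 1]."""
    a = abs(x)
    return (2 * a + 1) * (a - 1) ** 2 if a <= 1 else 0.0

def phi2(x):
    """Generator tied to derivative samples: phi2(0) = 0, phi2'(0) = 1, support [-1, 1]."""
    a = abs(x)
    return x * (a - 1) ** 2 if a <= 1 else 0.0

def hermite_interpolate(f, df, x):
    """Rebuild f at x from samples of f and f' at the two neighbouring integers.

    Because both generators vanish outside [-1, 1], only the knots
    floor(x) and floor(x) + 1 contribute to the sum.
    """
    k0 = math.floor(x)
    return sum(f(k) * phi1(x - k) + df(k) * phi2(x - k) for k in (k0, k0 + 1))

# Cubic polynomials are reproduced up to rounding error.
cube = lambda t: t ** 3
dcube = lambda t: 3 * t ** 2
value = hermite_interpolate(cube, dcube, 0.5)
```

The short support is what keeps the sum down to two knots per evaluation point, whatever the length of the sampled signal.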
ST Jun 10, 2025
Generalizing while preserving monotonicity in comparison-based preference learning models
Julien Fageot, Peva Blanchard, Gilles Bareilles et al.
If you tell a learning model that you prefer an alternative $a$ over another alternative $b$, then you probably expect the model to be monotone, that is, the valuation of $a$ increases, and that of $b$ decreases. Yet, perhaps surprisingly, many widely deployed comparison-based preference learning models, including large language models, fail to have this guarantee. Until now, the only comparison-based preference learning algorithms that were proved to be monotone are the Generalized Bradley-Terry models. Yet, these models are unable to generalize to uncompared data. In this paper, we advance the understanding of the set of models with generalization ability that are monotone. Namely, we propose a new class of Linear Generalized Bradley-Terry models with Diffusion Priors, and identify sufficient conditions on alternatives' embeddings that guarantee monotonicity. Our experiments show that this monotonicity is far from being a general guarantee, and that our new class of generalizing models improves accuracy, especially when the dataset is limited.
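The notion of monotonicity described above can be checked numerically on a toy case. The sketch below fits a plain Bradley-Terry model with a quadratic (Gaussian-prior) regularizer by gradient descent, then compares the scores before and after adding one comparison. This is not the Linear Generalized Bradley-Terry model with Diffusion Priors proposed in the paper; the function name `fit_scores` and all hyperparameters are assumptions made for illustration.

```python
import math

def fit_scores(n_items, comparisons, reg=1.0, lr=0.1, steps=2000):
    """Minimise sum log(1 + exp(-(s_w - s_l))) + reg/2 * ||s||^2 by gradient descent.

    `comparisons` is a list of (winner, loser) index pairs.
    """
    theta = [0.0] * n_items
    for _ in range(steps):
        grad = [reg * t for t in theta]  # gradient of the quadratic prior
        for winner, loser in comparisons:
            # Probability the model currently assigns to the opposite outcome.
            p_wrong = 1.0 / (1.0 + math.exp(theta[winner] - theta[loser]))
            grad[winner] -= p_wrong
            grad[loser] += p_wrong
        theta = [t - lr * g for t, g in zip(theta, grad)]
    return theta

base = [(0, 1), (1, 2)]             # item 0 beats 1, item 1 beats 2
before = fit_scores(3, base)
after = fit_scores(3, base + [(2, 0)])  # new preference: item 2 over item 0

# Monotonicity for this update: the preferred item goes up, the other goes down.
winner_up = after[2] > before[2]
loser_down = after[0] < before[0]
```

In this toy setting each item has its own free score, so nothing generalizes to uncompared items; the paper's question is what happens once scores are tied together through embeddings.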
ST Jun 10, 2025
On Monotonicity in AI Alignment
Gilles Bareilles, Julien Fageot, Lê-Nguyên Hoang et al.
Comparison-based preference learning has become central to the alignment of AI models with human preferences. However, these methods may behave counterintuitively. After empirically observing that, when accounting for a preference for response $y$ over $z$, the model may actually decrease the probability (and reward) of generating $y$ (an observation also made by others), this paper investigates the root causes of (non) monotonicity, for a general comparison-based preference learning framework that subsumes Direct Preference Optimization (DPO), Generalized Preference Optimization (GPO) and Generalized Bradley-Terry (GBT). Under mild assumptions, we prove that such methods still satisfy what we call local pairwise monotonicity. We also provide a bouquet of formalizations of monotonicity, and identify sufficient conditions for their guarantee, thereby providing a toolbox to evaluate how prone learning models are to monotonicity violations. These results clarify the limitations of current methods and provide guidance for developing more trustworthy preference learning algorithms.
ML Feb 3
Byzantine Machine Learning: MultiKrum and an optimal notion of robustness
Gilles Bareilles, Wassim Bouaziz, Julien Fageot et al.
Aggregation rules are the cornerstone of distributed (or federated) learning in the presence of adversaries, under the so-called Byzantine threat model. They are also interesting mathematical objects from the point of view of robust mean estimation. The Krum aggregation rule has been extensively studied, and endowed with formal robustness and convergence guarantees. Yet, MultiKrum, a natural extension of Krum, is often preferred in practice for its superior empirical performance, even though no theoretical guarantees were available until now. In this work, we provide the first proof that MultiKrum is a robust aggregation rule, and bound its robustness coefficient. To do so, we introduce $\kappa^\star$, the optimal *robustness coefficient* of an aggregation rule, which quantifies the accuracy of mean estimation in the presence of adversaries in a tighter manner compared with previously adopted notions of robustness. We then construct an upper and a lower bound on MultiKrum's robustness coefficient. As a by-product, we also improve on the best-known bounds on Krum's robustness coefficient. We show that MultiKrum's bounds are never worse than Krum's, and better in realistic regimes. We illustrate this analysis by an experimental investigation on the quality of the lower bound.
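For readers unfamiliar with the two rules, the following sketch follows the commonly cited definition: each vector is scored by the sum of squared distances to its n - f - 2 nearest neighbours, Krum returns the best-scored vector, and MultiKrum averages the m best-scored ones. The exact variant analysed in the paper may differ (for example in the number of neighbours), and the names `krum_scores` and `multi_krum` are hypothetical.

```python
def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def krum_scores(vectors, f):
    """Score each vector by its summed squared distance to its n - f - 2 closest peers."""
    n = len(vectors)
    k = n - f - 2
    if k < 1:
        raise ValueError("need n >= f + 3 vectors")
    scores = []
    for i, v in enumerate(vectors):
        dists = sorted(sq_dist(v, w) for j, w in enumerate(vectors) if j != i)
        scores.append(sum(dists[:k]))
    return scores

def multi_krum(vectors, f, m):
    """Average the m vectors with the lowest Krum scores (m = 1 gives Krum)."""
    scores = krum_scores(vectors, f)
    chosen = sorted(range(len(vectors)), key=scores.__getitem__)[:m]
    dim = len(vectors[0])
    return [sum(vectors[i][d] for i in chosen) / m for d in range(dim)]

# Four honest gradients near (1, 1) and one adversarial outlier.
honest = [[1.0, 1.0], [1.1, 0.9], [0.9, 1.1], [1.0, 1.2]]
vectors = honest + [[100.0, -100.0]]
aggregate = multi_krum(vectors, f=1, m=3)
single = multi_krum(vectors, f=1, m=1)
```

Averaging several well-scored vectors instead of keeping only one is the intuition usually given for MultiKrum's better empirical behaviour, since it reduces the variance of the output when most inputs are honest.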
CV Apr 28, 2020
3D Solid Spherical Bispectrum CNNs for Biomedical Texture Analysis
Valentin Oreiller, Vincent Andrearczyk, Julien Fageot et al.
Locally Rotation Invariant (LRI) operators have shown great potential in biomedical texture analysis where patterns appear at random positions and orientations. LRI operators can be obtained by computing the responses to the discrete rotation of local descriptors, such as Local Binary Patterns (LBP) or the Scale Invariant Feature Transform (SIFT). Other strategies achieve this invariance using Laplacian of Gaussian or steerable wavelets for instance, preventing the introduction of sampling errors during the discretization of the rotations. In this work, we obtain LRI operators via the local projection of the image on the spherical harmonics basis, followed by the computation of the bispectrum, which shares and extends the invariance properties of the spectrum. We investigate the benefits of using the bispectrum over the spectrum in the design of an LRI layer embedded in a shallow Convolutional Neural Network (CNN) for 3D image analysis. The performance of each design is evaluated on two datasets and compared against a standard 3D CNN. The first dataset is made of 3D volumes composed of synthetically generated rotated patterns, while the second contains malignant and benign pulmonary nodules in Computed Tomography (CT) images. The results indicate that bispectrum CNNs allow for a significantly better characterization of 3D textures than both the spectral and standard CNNs. In addition, they can learn efficiently with fewer training examples and trainable parameters when compared to a standard convolutional layer.
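The invariance of the spectrum and bispectrum can be illustrated on a much simpler analog than the 3D spherical-harmonic setting of the paper: a signal sampled on a circle, where rotations are cyclic shifts and the harmonics are the discrete Fourier modes. The sketch below is only that one-dimensional analog; all function names are hypothetical.

```python
import cmath

def dft(signal):
    """Discrete Fourier coefficients of a signal sampled on the circle."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]

def power_spectrum(signal):
    """|F_k|^2: invariant to rotation, but discards all relative phases."""
    return [abs(c) ** 2 for c in dft(signal)]

def bispectrum(signal):
    """B[a][b] = F_a * F_b * conj(F_{a+b}): invariant, and keeps relative phases."""
    F = dft(signal)
    n = len(F)
    return [[F[a] * F[b] * F[(a + b) % n].conjugate() for b in range(n)] for a in range(n)]

def rotate(signal, shift):
    """Rotate the circular signal by `shift` samples."""
    return signal[shift:] + signal[:shift]

signal = [1.0, 3.0, -2.0, 0.5, 4.0, -1.0]
rotated = rotate(signal, 2)

spectrum_gap = max(abs(p - q) for p, q in zip(power_spectrum(signal), power_spectrum(rotated)))
bispectrum_gap = max(
    abs(p - q)
    for row_p, row_q in zip(bispectrum(signal), bispectrum(rotated))
    for p, q in zip(row_p, row_q)
)
```

Both gaps are at the level of rounding error: a rotation multiplies each coefficient by a phase factor, which cancels in the modulus and in the triple product. The bispectrum entries remain complex, so they retain phase relations between harmonics that the power spectrum loses.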
CV Mar 19, 2020
Local Rotation Invariance in 3D CNNs
Vincent Andrearczyk, Julien Fageot, Valentin Oreiller et al.
Locally Rotation Invariant (LRI) image analysis was shown to be fundamental in many applications and in particular in medical imaging where local structures of tissues occur at arbitrary rotations. LRI constituted the cornerstone of several breakthroughs in texture analysis, including Local Binary Patterns (LBP), Maximum Response 8 (MR8) and steerable filterbanks. Whereas globally rotation invariant Convolutional Neural Networks (CNN) were recently proposed, LRI has been little investigated in the context of deep learning. LRI designs allow learning filters accounting for all orientations, which enables a drastic reduction of trainable parameters and training data when compared to standard 3D CNNs. In this paper, we propose and compare several methods to obtain LRI CNNs with directional sensitivity. Two methods use orientation channels (responses to rotated kernels), either by explicitly rotating the kernels or using steerable filters. These orientation channels constitute a locally rotation equivariant representation of the data. Local pooling across orientations yields LRI image analysis. Steerable filters are used to achieve a fine and efficient sampling of 3D rotations as well as a reduction of trainable parameters and operations, thanks to a parametric representation involving solid Spherical Harmonics (SH), which are products of SH with associated learned radial profiles. Finally, we investigate a third strategy to obtain LRI based on rotational invariants calculated from responses to a learned set of solid SHs. The proposed methods are evaluated and compared to standard CNNs on 3D datasets including synthetic textured volumes composed of rotated patterns, and pulmonary nodule classification in CT. The results show the importance of LRI image analysis while resulting in a drastic reduction of trainable parameters, outperforming standard 3D CNNs trained with data augmentation.
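The first strategy above (orientation channels from explicitly rotated kernels, followed by pooling across orientations) can be sketched in a 2D toy setting with the four 90-degree rotations. This is a simplified analog under stated assumptions, not the paper's 3D implementation; the function names and the integer kernel and patch are made up for the example.

```python
def rot90(m):
    """Rotate a square matrix by 90 degrees clockwise."""
    return [list(row) for row in zip(*m[::-1])]

def response(kernel, patch):
    """Correlation of a kernel with a patch of the same size."""
    return sum(k * p for krow, prow in zip(kernel, patch) for k, p in zip(krow, prow))

def orientation_channels(kernel, patch):
    """Responses of the patch to the kernel rotated by 0, 90, 180 and 270 degrees."""
    channels = []
    k = kernel
    for _ in range(4):
        channels.append(response(k, patch))
        k = rot90(k)
    return channels

def lri_response(kernel, patch):
    """Max-pool across orientation channels to obtain a rotation-invariant value."""
    return max(orientation_channels(kernel, patch))

kernel = [[1, 2, 0], [0, 1, 0], [-1, 0, 0]]
patch = [[3, 1, 4], [1, 5, 9], [2, 6, 5]]

channels = orientation_channels(kernel, patch)
channels_rotated = orientation_channels(kernel, rot90(patch))
```

Rotating the patch only permutes the orientation channels cyclically (equivariance), so any pooling that ignores channel order, such as the maximum, yields the same value for the original and rotated patches (invariance). Only one kernel is stored, which is where the saving in trainable parameters comes from.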
FA Apr 24, 2019
Native Banach spaces for splines and variational inverse problems
Michael Unser, Julien Fageot
We propose a systematic construction of native Banach spaces for general spline-admissible operators ${\rm L}$. In short, the native space for ${\rm L}$ and the (dual) norm $\|\cdot\|_{\mathcal{X}'}$ is the largest space of functions $f: \mathbb{R}^d \to \mathbb{R}$ such that $\|{\rm L} f\|_{\mathcal{X}'}<\infty$, subject to the constraint that the growth-restricted null space of ${\rm L}$ be finite-dimensional. This space, denoted by $\mathcal{X}'_{\rm L}$, is specified as the dual of the pre-native space $\mathcal{X}_{\rm L}$, which is itself obtained through a suitable completion process. The main difference with prior constructions (e.g., reproducing kernel Hilbert spaces) is that our approach involves test functions rather than sums of atoms (e.g., kernels), which makes it applicable to a much broader class of norms, including total variation. Under specific admissibility and compatibility hypotheses, we lay out the direct-sum topology of $\mathcal{X}_{\rm L}$ and $\mathcal{X}'_{\rm L}$, and identify the whole family of equivalent norms. Our construction ensures that the native space and its pre-dual are endowed with a fundamental Schwartz-Banach property. In practical terms, this means that $\mathcal{X}'_{\rm L}$ is rich enough to reproduce any function with an arbitrary degree of precision.