MLOct 10, 2022
Scale Equivariant U-NetMateus Sangalli, Samy Blusseau, Santiago Velasco-Forero et al.
In neural networks, the property of being equivariant to transformations improves generalization when the corresponding symmetry is present in the data. In particular, scale-equivariant networks are suited to computer vision tasks where the same classes of objects appear at different scales, like in most semantic segmentation tasks. Recently, convolutional layers equivariant to a semigroup of scalings and translations have been proposed. However, the equivariance of subsampling and upsampling has never been explicitly studied even though they are necessary building blocks in some segmentation architectures. The U-Net is a representative example of such architectures, which includes the basic elements used for state-of-the-art semantic segmentation. Therefore, this paper introduces the Scale Equivariant U-Net (SEU-Net), a U-Net that is made approximately equivariant to a semigroup of scales and translations through careful application of subsampling and upsampling layers and the use of aforementioned scale-equivariant layers. Moreover, a scale-dropout is proposed in order to improve generalization to different scales in approximately scale-equivariant architectures. The proposed SEU-Net is trained for semantic segmentation of the Oxford Pet IIIT and the DIC-C2DH-HeLa dataset for cell segmentation. The generalization metric to unseen scales is dramatically improved in comparison to the U-Net, even when the U-Net is trained with scale jittering, and to a scale-equivariant architecture that does not perform upsampling operators inside the equivariant pipeline. The scale-dropout induces better generalization on the scale-equivariant models in the Pet experiment, but not on the cell segmentation experiment.
CVNov 7, 2022
Moving Frame Net: SE(3)-Equivariant Network for VolumesMateus Sangalli, Samy Blusseau, Santiago Velasco-Forero et al.
Equivariance of neural networks to transformations helps to improve their performance and reduce generalization error in computer vision tasks, as they apply to datasets presenting symmetries (e.g. scalings, rotations, translations). The method of moving frames is classical for deriving operators invariant to the action of a Lie group in a manifold.Recently, a rotation and translation equivariant neural network for image data was proposed based on the moving frames approach. In this paper we significantly improve that approach by reducing the computation of moving frames to only one, at the input stage, instead of repeated computations at each layer. The equivariance of the resulting architecture is proved theoretically and we build a rotation and translation equivariant neural network to process volumes, i.e. signals on the 3D space. Our trained model overperforms the benchmarks in the medical volume classification of most of the tested datasets from MedMNIST3D.
SPMay 12, 2022
Near out-of-distribution detection for low-resolution radar micro-Doppler signaturesMartin Bauw, Santiago Velasco-Forero, Jesus Angulo et al.
Near out-of-distribution detection (OODD) aims at discriminating semantically similar data points without the supervision required for classification. This paper puts forward an OODD use case for radar targets detection extensible to other kinds of sensors and detection scenarios. We emphasize the relevance of OODD and its specific supervision requirements for the detection of a multimodal, diverse targets class among other similar radar targets and clutter in real-life critical systems. We propose a comparison of deep and non-deep OODD methods on simulated low-resolution pulse radar micro-Doppler signatures, considering both a spectral and a covariance matrix input representation. The covariance representation aims at estimating whether dedicated second-order processing is appropriate to discriminate signatures. The potential contributions of labeled anomalies in training, self-supervised learning, contrastive learning insights and innovative training losses are discussed, and the impact of training set contamination caused by mislabelling is investigated.
LGJul 13, 2022
MorphoActivation: Generalizing ReLU activation function by mathematical morphologySantiago Velasco-Forero, Jesús Angulo
This paper analyses both nonlinear activation functions and spatial max-pooling for Deep Convolutional Neural Networks (DCNNs) by means of the algebraic basis of mathematical morphology. Additionally, a general family of activation functions is proposed by considering both max-pooling and nonlinear operators in the context of morphological representations. Experimental section validates the goodness of our approach on classical benchmarks for supervised learning by DCNN.
CVJan 23, 2023
Adapting the Hypersphere Loss Function from Anomaly Detection to Anomaly SegmentationJoao P. C. Bertoldo, Santiago Velasco-Forero, Jesus Angulo et al.
We propose an incremental improvement to Fully Convolutional Data Description (FCDD), an adaptation of the one-class classification approach from anomaly detection to image anomaly segmentation (a.k.a. anomaly localization). We analyze its original loss function and propose a substitute that better resembles its predecessor, the Hypersphere Classifier (HSC). Both are compared on the MVTec Anomaly Detection Dataset (MVTec-AD) -- training images are flawless objects/textures and the goal is to segment unseen defects -- showing that consistent improvement is achieved by better designing the pixel-wise supervision.
CVFeb 2Code
LmPT: Conditional Point Transformer for Anatomical Landmark Detection on 3D Point CloudsMatteo Bastico, Pierre Onghena, David Ryckelynck et al.
Accurate identification of anatomical landmarks is crucial for various medical applications. Traditional manual landmarking is time-consuming and prone to inter-observer variability, while rule-based methods are often tailored to specific geometries or limited sets of landmarks. In recent years, anatomical surfaces have been effectively represented as point clouds, which are lightweight structures composed of spatial coordinates. Following this strategy and to overcome the limitations of existing landmarking techniques, we propose Landmark Point Transformer (LmPT), a method for automatic anatomical landmark detection on point clouds that can leverage homologous bones from different species for translational research. The LmPT model incorporates a conditioning mechanism that enables adaptability to different input types to conduct cross-species learning. We focus the evaluation of our approach on femoral landmarking using both human and newly annotated dog femurs, demonstrating its generalization and effectiveness across species. The code and dog femur dataset will be publicly available at: https://github.com/Pierreoo/LandmarkPointTransformer.
CVNov 22, 2021Code
Paris-CARLA-3D: A Real and Synthetic Outdoor Point Cloud Dataset for Challenging Tasks in 3D MappingJean-Emmanuel Deschaud, David Duque, Jean Pierre Richa et al.
Paris-CARLA-3D is a dataset of several dense colored point clouds of outdoor environments built by a mobile LiDAR and camera system. The data are composed of two sets with synthetic data from the open source CARLA simulator (700 million points) and real data acquired in the city of Paris (60 million points), hence the name Paris-CARLA-3D. One of the advantages of this dataset is to have simulated the same LiDAR and camera platform in the open source CARLA simulator as the one used to produce the real data. In addition, manual annotation of the classes using the semantic tags of CARLA was performed on the real data, allowing the testing of transfer methods from the synthetic to the real data. The objective of this dataset is to provide a challenging dataset to evaluate and improve methods on difficult vision tasks for the 3D mapping of outdoor environments: semantic segmentation, instance segmentation, and scene completion. For each task, we describe the evaluation protocol as well as the experiments carried out to establish a baseline.
CVJan 22, 2025
MorphoSkel3D: Morphological Skeletonization of 3D Point Clouds for Informed Sampling in Object Classification and RetrievalPierre Onghena, Santiago Velasco-Forero, Beatriz Marcotegui
Point clouds are a set of data points in space to represent the 3D geometry of objects. A fundamental step in the processing is to identify a subset of points to represent the shape. While traditional sampling methods often ignore to incorporate geometrical information, recent developments in learning-based sampling models have achieved significant levels of performance. With the integration of geometrical priors, the ability to learn and preserve the underlying structure can be enhanced when sampling. To shed light into the shape, a qualitative skeleton serves as an effective descriptor to guide sampling for both local and global geometries. In this paper, we introduce MorphoSkel3D as a new technique based on morphology to facilitate an efficient skeletonization of shapes. With its low computational cost, MorphoSkel3D is a unique, rule-based algorithm to benchmark its quality and performance on two large datasets, ModelNet and ShapeNet, under different sampling ratios. The results show that training with MorphoSkel3D leads to an informed and more accurate sampling in the practical application of object classification and point cloud retrieval.
CVSep 8, 2025
Approximating Condorcet Ordering for Vector-valued Mathematical MorphologyMarcos Eduardo Valle, Santiago Velasco-Forero, Joao Batista Florindo et al.
Mathematical morphology provides a nonlinear framework for image and spatial data processing and analysis. Although there have been many successful applications of mathematical morphology to vector-valued images, such as color and hyperspectral images, there is still no consensus on the most suitable vector ordering for constructing morphological operators. This paper addresses this issue by examining a reduced ordering approximating the Condorcet ranking derived from a set of vector orderings. Inspired by voting problems, the Condorcet ordering ranks elements from most to least voted, with voters representing different orderings. In this paper, we develop a machine learning approach that learns a reduced ordering that approximates the Condorcet ordering. Preliminary computational experiments confirm the effectiveness of learning the reduced mapping to define vector-valued morphological operators for color images.
SPJun 8, 2021
Deep Random Projection Outlyingness for Unsupervised Anomaly DetectionMartin Bauw, Santiago Velasco-Forero, Jesus Angulo et al.
Random projection is a common technique for designing algorithms in a variety of areas, including information retrieval, compressive sensing and measuring of outlyingness. In this work, the original random projection outlyingness measure is modified and associated with a neural network to obtain an unsupervised anomaly detection method able to handle multimodal normality. Theoretical and experimental arguments are presented to justify the choice of the anomaly score estimator. The performance of the proposed neural network approach is comparable to a state-of-the-art anomaly detection method. Experiments conducted on the MNIST, Fashion-MNIST and CIFAR-10 datasets show the relevance of the proposed approach.
CVMay 27, 2020
Road Segmentation on low resolution Lidar point clouds for autonomous vehiclesLeonardo Gigli, B Ravi Kiran, Thomas Paul et al.
Point cloud datasets for perception tasks in the context of autonomous driving often rely on high resolution 64-layer Light Detection and Ranging (LIDAR) scanners. They are expensive to deploy on real-world autonomous driving sensor architectures which usually employ 16/32 layer LIDARs. We evaluate the effect of subsampling image based representations of dense point clouds on the accuracy of the road segmentation task. In our experiments the low resolution 16/32 layer LIDAR point clouds are simulated by subsampling the original 64 layer data, for subsequent transformation in to a feature map in the Bird-Eye-View (BEV) and SphericalView (SV) representations of the point cloud. We introduce the usage of the local normal vector with the LIDAR's spherical coordinates as an input channel to existing LoDNN architectures. We demonstrate that this local normal feature in conjunction with classical features not only improves performance for binary road segmentation on full resolution point clouds, but it also reduces the negative impact on the accuracy when subsampling dense point clouds as compared to the usage of classical features alone. We assess our method with several experiments on two datasets: KITTI Road-segmentation benchmark and the recently released Semantic KITTI dataset.
CVJun 27, 2019
Dealing with Topological Information within a Fully Convolutional Neural NetworkEtienne Decencière, Santiago Velasco-Forero, Fu Min et al.
A fully convolutional neural network has a receptive field of limited size and therefore cannot exploit global information, such as topological information. A solution is proposed in this paper to solve this problem, based on pre-processing with a geodesic operator. It is applied to the segmentation of histological images of pigmented reconstructed epidermis acquired via Whole Slide Imaging.
CVMar 20, 2019
Part-based approximations for morphological operators using asymmetric auto-encodersBastien Ponchon, Santiago Velasco-Forero, Samy Blusseau et al.
This paper addresses the issue of building a part-based representation of a dataset of images. More precisely, we look for a non-negative, sparse decomposition of the images on a reduced set of atoms, in order to unveil a morphological and interpretable structure of the data. Additionally, we want this decomposition to be computed online for any new sample that is not part of the initial dataset. Therefore, our solution relies on a sparse, non-negative auto-encoder where the encoder is deep (for accuracy) and the decoder shallow (for interpretability). This method compares favorably to the state-of-the-art online methods on two datasets (MNIST and Fashion MNIST), according to classical metrics and to a new one we introduce, based on the invariance of the representation to morphological dilation.
STMar 19, 2019
Max-plus Operators Applied to Filter Selection and Model Pruning in Neural NetworksYunxiang Zhang, Samy Blusseau, Santiago Velasco-Forero et al.
Following recent advances in morphological neural networks, we propose to study in more depth how Max-plus operators can be exploited to define morphological units and how they behave when incorporated in layers of conventional neural networks. Besides showing that they can be easily implemented with modern machine learning frameworks , we confirm and extend the observation that a Max-plus layer can be used to select important filters and reduce redundancy in its previous layer, without incurring performance loss. Experimental results demonstrate that the filter selection strategy enabled by a Max-plus is highly efficient and robust, through which we successfully performed model pruning on different neural network architectures. We also point out that there is a close connection between Maxout networks and our pruned Max-plus networks by comparing their respective characteristics. The code for reproducing our experiments is available online.
MLFeb 20, 2018
Segmentation hiérarchique faiblement superviséeAmin Fehri, Santiago Velasco-Forero, Fernand Meyer
Image segmentation is the process of partitioning an image into a set of meaningful regions according to some criteria. Hierarchical segmentation has emerged as a major trend in this regard as it favors the emergence of important regions at different scales. On the other hand, many methods allow us to have prior information on the position of structures of interest in the images. In this paper, we present a versatile hierarchical segmentation method that takes into account any prior spatial information and outputs a hierarchical segmentation that emphasizes the contours or regions of interest while preserving the important structures in the image. An application of this method to the weakly-supervised segmentation problem is presented.
CVMar 9, 2017
Prior-based Hierarchical Segmentation Highlighting Structures of InterestAmin Fehri, Santiago Velasco-Forero, Fernand Meyer
Image segmentation is the process of partitioning an image into a set of meaningful regions according to some criteria. Hierarchical segmentation has emerged as a major trend in this regard as it favors the emergence of important regions at different scales. On the other hand, many methods allow us to have prior information on the position of structures of interest in the images. In this paper, we present a versatile hierarchical segmentation method that takes into account any prior spatial information and outputs a hierarchical segmentation that emphasizes the contours or regions of interest while preserving the important structures in the image. Several applications are presented that illustrate the method versatility and efficiency.
CVSep 9, 2016
Automatic Selection of Stochastic Watershed HierarchiesAmin Fehri, Santiago Velasco-Forero, Fernand Meyer
The segmentation, seen as the association of a partition with an image, is a difficult task. It can be decomposed in two steps: at first, a family of contours associated with a series of nested partitions (or hierarchy) is created and organized, then pertinent contours are extracted. A coarser partition is obtained by merging adjacent regions of a finer partition. The strength of a contour is then measured by the level of the hierarchy for which its two adjacent regions merge. We present an automatic segmentation strategy using a wide range of stochastic watershed hierarchies. For a given set of homogeneous images, our approach selects automatically the best hierarchy and cut level to perform image simplification given an evaluation score. Experimental results illustrate the advantages of our approach on several real-life images datasets.