CV | Aug 30, 2023 | Code
MedShapeNet -- A Large-Scale Dataset of 3D Medical Shapes for Computer Vision
Jianning Li, Zongwei Zhou, Jiancheng Yang et al.
Prior to the deep learning era, shape was commonly used to describe objects. Nowadays, state-of-the-art (SOTA) algorithms in medical imaging are predominantly diverging from computer vision, where voxel grids, meshes, point clouds, and implicit surface models are used. This is seen from numerous shape-related publications in premier vision conferences as well as the growing popularity of ShapeNet (about 51,300 models) and Princeton ModelNet (127,915 models). For the medical domain, we present a large collection of anatomical shapes (e.g., bones, organs, vessels) and 3D models of surgical instruments, called MedShapeNet, created to facilitate the translation of data-driven vision algorithms to medical applications and to adapt SOTA vision algorithms to medical problems. As a unique feature, we directly model the majority of shapes on the imaging data of real patients. As of today, MedShapeNet includes 23 datasets with more than 100,000 shapes that are paired with annotations (ground truth). Our data is freely accessible via a web interface and a Python application programming interface (API) and can be used for discriminative, reconstructive, and variational benchmarks as well as various applications in virtual, augmented, or mixed reality, and 3D printing. As examples, we present use cases in the classification of brain tumors, facial and skull reconstructions, multi-class anatomy completion, education, and 3D printing. In the future, we will extend the data and improve the interfaces. The project pages are: https://medshapenet.ikim.nrw/ and https://github.com/Jianningli/medshapenet-feedback
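To make the relation between two of the shape representations named above concrete, here is a minimal sketch that turns a binary voxel grid into a point cloud by emitting the centre of every occupied voxel. It does not use the MedShapeNet API; the function name `voxels_to_points`, the `(z, y, x)` nesting order, and the `spacing` parameter are assumptions made for illustration only.

```python
def voxels_to_points(grid, spacing=(1.0, 1.0, 1.0)):
    """Return (x, y, z) centre coordinates of every occupied voxel.

    `grid` is a nested list indexed as grid[z][y][x] (an assumed layout);
    `spacing` is the assumed voxel size along x, y and z.
    """
    points = []
    for z, plane in enumerate(grid):
        for y, row in enumerate(plane):
            for x, occupied in enumerate(row):
                if occupied:
                    points.append((
                        (x + 0.5) * spacing[0],
                        (y + 0.5) * spacing[1],
                        (z + 0.5) * spacing[2],
                    ))
    return points


# Toy 2x2x2 grid with two occupied voxels.
toy_grid = [
    [[0, 1], [0, 0]],  # z = 0
    [[0, 0], [1, 0]],  # z = 1
]
toy_points = voxels_to_points(toy_grid)
```

A real pipeline would more likely extract a surface mesh first and sample points from it; voxel centres are used here only because they keep the sketch short.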
CV | Oct 12, 2022
Teeth3DS+: An Extended Benchmark for Intraoral 3D Scans Analysis
Achraf Ben-Hamadou, Nour Neifar, Ahmed Rekik et al.
Intraoral 3D scanning is now widely adopted in modern dentistry and plays a central role in supporting key tasks such as tooth segmentation, detection, labeling, and dental landmark identification. Accurate analysis of these scans is essential for orthodontic and restorative treatment planning, as it enables automated workflows and minimizes the need for manual intervention. However, the development of robust learning-based solutions remains challenging due to the limited availability of high-quality public datasets and standardized benchmarks. This article presents Teeth3DS+, an extended public benchmark dedicated to intraoral 3D scan analysis. Developed in the context of the MICCAI 3DTeethSeg and 3DTeethLand challenges, Teeth3DS+ supports multiple fundamental tasks, including tooth detection, segmentation, labeling, 3D modeling, and dental landmark identification. The dataset consists of rigorously curated intraoral scans acquired using state-of-the-art scanners and validated by experienced orthodontists and dental surgeons. In addition to the data, Teeth3DS+ provides standardized data splits and evaluation protocols to enable fair and reproducible comparison of methods, with the goal of fostering progress in learning-based analysis of 3D dental scans. Detailed instructions for accessing the dataset are available at https://crns-smartvision.github.io/teeth3ds
SP | Oct 22, 2022
Leveraging Statistical Shape Priors in GAN-based ECG Synthesis
Nour Neifar, Achraf Ben-Hamadou, Afef Mdhaffar et al.
Electrocardiogram (ECG) data collection during emergency situations is challenging, making ECG data generation an efficient solution for dealing with highly imbalanced ECG training datasets. In this paper, we propose a novel approach for ECG signal generation using Generative Adversarial Networks (GANs) and statistical ECG data modeling. Our approach leverages prior knowledge about ECG dynamics to synthesize realistic signals, addressing the complex dynamics of ECG signals. To validate our approach, we conducted experiments using ECG signals from the MIT-BIH arrhythmia database. Our results demonstrate that our approach, which models temporal and amplitude variations of ECG signals as 2-D shapes, generates more realistic signals compared to state-of-the-art GAN based generation baselines. Our proposed approach has significant implications for improving the quality of ECG training datasets, which can ultimately lead to better performance of ECG classification algorithms. This research contributes to the development of more efficient and accurate methods for ECG analysis, which can aid in the diagnosis and treatment of cardiac diseases.
CV | Jun 19, 2023 | Code
Graph Self-Supervised Learning for Endoscopic Image Matching
Manel Farhat, Achraf Ben-Hamadou
Accurate feature matching and correspondence in endoscopic images play a crucial role in various clinical applications, including patient follow-up and rapid anomaly localization through panoramic image generation. However, developing robust and accurate feature matching techniques faces challenges due to the lack of discriminative texture and significant variability between patients. To address these limitations, we propose a novel self-supervised approach that combines Convolutional Neural Networks for capturing local visual appearance and attention-based Graph Neural Networks for modeling spatial relationships between key-points. Our approach is trained in a fully self-supervised scheme without the need for labeled data. Our approach outperforms state-of-the-art handcrafted and deep learning-based methods, demonstrating exceptional performance in terms of precision rate (1) and matching score (99.3%). We also provide code and materials related to this work, which can be accessed at https://github.com/abenhamadou/graph-self-supervised-learning-for-endoscopic-image-matching.
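The two reported metrics can be illustrated with a small sketch. It assumes one common convention (precision = correct matches / returned matches, matching score = correct matches / number of key-points), which may differ from the exact definitions used in the paper, and it uses plain mutual nearest-neighbour matching as a stand-in for the learned matcher. All function names and the toy descriptors are hypothetical.

```python
import math


def nearest(desc, candidates):
    """Index of the candidate descriptor closest to `desc` (Euclidean)."""
    dists = [math.dist(desc, c) for c in candidates]
    return min(range(len(dists)), key=dists.__getitem__)


def mutual_nearest_matches(desc_a, desc_b):
    """Keep pair (i, j) only if i and j are each other's nearest neighbour."""
    matches = []
    for i, d in enumerate(desc_a):
        j = nearest(d, desc_b)
        if nearest(desc_b[j], desc_a) == i:
            matches.append((i, j))
    return matches


def precision_and_matching_score(matches, ground_truth, num_keypoints):
    """Assumed definitions: correct/returned and correct/total key-points."""
    correct = sum(1 for i, j in matches if ground_truth.get(i) == j)
    precision = correct / len(matches) if matches else 0.0
    matching_score = correct / num_keypoints if num_keypoints else 0.0
    return precision, matching_score


# Toy descriptors: key-point 2 has no true counterpart in image B.
desc_a = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]]
desc_b = [[0.1, 0.0], [1.1, 0.1], [9.0, 9.0]]
ground_truth = {0: 0, 1: 1}

matches = mutual_nearest_matches(desc_a, desc_b)
precision, matching_score = precision_and_matching_score(
    matches, ground_truth, num_keypoints=len(desc_a)
)
```

In this toy case the spurious pair for key-point 2 survives the mutual check, so it lowers both metrics; a distance threshold or ratio test would be the usual way to reject it.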
CV | Jun 2, 2023
DiffECG: A Versatile Probabilistic Diffusion Model for ECG Signals Synthesis
Nour Neifar, Achraf Ben-Hamadou, Afef Mdhaffar et al.
Within cardiovascular disease detection using deep learning applied to ECG signals, the complexities of handling physiological signals have sparked growing interest in leveraging deep generative models for effective data augmentation. In this paper, we introduce a novel versatile approach based on denoising diffusion probabilistic models for ECG synthesis, addressing three scenarios: (i) heartbeat generation, (ii) partial signal imputation, and (iii) full heartbeat forecasting. Our approach presents the first generalized conditional approach for ECG synthesis, and our experimental results demonstrate its effectiveness for various ECG-related tasks. Moreover, we show that our approach outperforms other state-of-the-art ECG generative models and can enhance the performance of state-of-the-art classifiers.
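As background for the denoising diffusion probabilistic models mentioned above, the following sketch shows the standard closed-form forward (noising) step, x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps, applied to a 1-D signal. It is not the DiffECG implementation: the linear beta schedule, the 1000 steps, and the sine wave standing in for a heartbeat are all assumptions, and the conditioning used for imputation and forecasting is not shown.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (a commonly used default, assumed here)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def forward_noise(x0, t, alpha_bars, rng):
    """Sample x_t from q(x_t | x_0) and return it with the noise that was used."""
    a = alpha_bars[t]
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(a) * x + math.sqrt(1.0 - a) * n for x, n in zip(x0, noise)]
    return xt, noise


# Toy "heartbeat": one sine period standing in for a real ECG beat.
x0 = [math.sin(2 * math.pi * i / 64) for i in range(64)]
alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
rng = random.Random(0)
x_early, _ = forward_noise(x0, 10, alpha_bars, rng)    # still close to x0
x_late, _ = forward_noise(x0, 999, alpha_bars, rng)    # almost pure noise
```

A denoising network would be trained to predict `noise` from the noised signal and the step index `t`; sampling then runs the learned reverse process starting from Gaussian noise.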
LG | Jul 12, 2023
Deep Generative Models for Physiological Signals: A Systematic Literature Review
Nour Neifar, Afef Mdhaffar, Achraf Ben-Hamadou et al.
In this paper, we present a systematic literature review on deep generative models for physiological signals, particularly electrocardiogram (ECG), electroencephalogram (EEG), photoplethysmogram (PPG) and electromyogram (EMG). Compared to the existing review papers, we present the first review that summarizes the recent state-of-the-art deep generative models. By analyzing the state-of-the-art research related to deep generative models along with their main applications and challenges, this review contributes to the overall understanding of these models applied to physiological signals. Additionally, by highlighting the employed evaluation protocol and the most used physiological databases, this review facilitates the assessment and benchmarking of deep generative models.
CV | Aug 24, 2022
Self-Supervised Endoscopic Image Key-Points Matching
Manel Farhat, Houda Chaabouni-Chouayakh, Achraf Ben-Hamadou
Feature matching and finding correspondences between endoscopic images is a key step in many clinical applications, such as patient follow-up and the generation of panoramic images from clinical sequences for fast anomaly localization. Nonetheless, due to the high texture variability present in endoscopic images, developing robust and accurate feature matching is a challenging task. Recently, deep learning techniques that deliver learned features extracted via convolutional neural networks (CNNs) have gained traction in a wide range of computer vision tasks. However, they all follow a supervised learning scheme in which a large amount of annotated data is required to reach good performance, and such data is often unavailable for medical databases. To overcome this limitation related to labeled data scarcity, the self-supervised learning paradigm has recently shown great success in a number of applications. This paper proposes a novel self-supervised approach for endoscopic image matching based on deep learning techniques. Our method outperforms standard hand-crafted local feature descriptors in terms of precision and recall. Furthermore, our self-supervised descriptor provides competitive performance in comparison to a selection of state-of-the-art deep learning-based supervised methods in terms of precision and matching score.
CV | Dec 9, 2025
Detecting Dental Landmarks from Intraoral 3D Scans: the 3DTeethLand challenge
Achraf Ben-Hamadou, Nour Neifar, Ahmed Rekik et al.
Teeth landmark detection is a critical task in modern clinical orthodontics. Precise identification of these landmarks enables advanced diagnostics, facilitates personalized treatment strategies, and supports more effective monitoring of treatment progress in clinical dentistry. However, significant challenges arise from the intricate geometry of individual teeth and the substantial variations observed across individuals. To address these complexities, the development of advanced techniques, especially through the application of deep learning, is essential for the precise and reliable detection of 3D tooth landmarks. In this context, the 3DTeethLand challenge was held in collaboration with the International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI) in 2024, calling for algorithms focused on teeth landmark detection from intraoral 3D scans. This challenge introduced the first publicly available dataset for 3D teeth landmark detection, offering a valuable resource for assessing state-of-the-art methods on this task and encouraging the community to provide methodological contributions towards the resolution of this problem, which has significant clinical implications.
CV | May 29, 2023 | Code
3DTeethSeg'22: 3D Teeth Scan Segmentation and Labeling Challenge
Achraf Ben-Hamadou, Oussama Smaoui, Ahmed Rekik et al.
Teeth localization, segmentation, and labeling from intra-oral 3D scans are essential tasks in modern dentistry to enhance dental diagnostics, treatment planning, and population-based studies on oral health. However, developing automated algorithms for teeth analysis presents significant challenges due to variations in dental anatomy, imaging protocols, and limited availability of publicly accessible data. To address these challenges, the 3DTeethSeg'22 challenge was organized in conjunction with the International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI) in 2022, with a call for algorithms tackling teeth localization, segmentation, and labeling from intraoral 3D scans. A dataset comprising a total of 1800 scans from 900 patients was prepared, and each tooth was individually annotated by a human-machine hybrid algorithm. A total of 6 algorithms were evaluated on this dataset. In this study, we present the evaluation results of the 3DTeethSeg'22 challenge. The 3DTeethSeg'22 challenge code can be accessed at: https://github.com/abenhamadou/3DTeethSeg22_challenge
CV | Feb 18, 2024
Cross-Attention Fusion of Visual and Geometric Features for Large Vocabulary Arabic Lipreading
Samar Daou, Achraf Ben-Hamadou, Ahmed Rekik et al.
Lipreading involves using visual data to recognize spoken words by analyzing the movements of the lips and surrounding area. It is a hot research topic with many potential applications, such as human-machine interaction and enhancing audio speech recognition. Recent deep-learning based works aim to integrate visual features extracted from the mouth region with landmark points on the lip contours. However, employing a simple combination method such as concatenation may not be the most effective approach to get the optimal feature vector. To address this challenge, firstly, we propose a cross-attention fusion-based approach for large lexicon Arabic vocabulary to predict spoken words in videos. Our method leverages the power of cross-attention networks to efficiently integrate visual and geometric features computed on the mouth region. Secondly, we introduce the first large-scale Lipreading in the Wild for Arabic (LRW-AR) dataset containing 20,000 videos for 100-word classes, uttered by 36 speakers. The experimental results obtained on LRW-AR and ArabicVisual databases showed the effectiveness and robustness of the proposed approach in recognizing Arabic words. Our work provides insights into the feasibility and effectiveness of applying lipreading techniques to the Arabic language, opening doors for further research in this field. Link to the project page: https://crns-smartvision.github.io/lrwar
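The contrast between plain concatenation and cross-attention fusion can be sketched with single-head scaled dot-product attention, softmax(Q K^T / sqrt(d)) V. This is a generic illustration, not the paper's architecture: which stream supplies the queries, the absence of learned projection matrices, and the final concatenation of the visual features with the attended geometric features are all assumptions.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Plain matrix product of two nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention; returns (attended values, weights)."""
    d = len(queries[0])
    scores = matmul(queries, transpose(keys))
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, values), weights


# Toy features: 2 visual tokens query 3 geometric (landmark) tokens.
visual = [[1.0, 0.0], [0.0, 1.0]]
geometric = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

attended, weights = cross_attention(visual, geometric, geometric)
# Fuse by concatenating each visual token with its attended geometric summary.
fused = [v + a for v, a in zip(visual, attended)]
```

Unlike simple concatenation, each visual token here receives a weighted summary of the geometric tokens, with weights that depend on how well the two align.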
CV | Dec 17, 2018
Discriminant Patch Representation for RGB-D Face Recognition Using Convolutional Neural Networks
Nesrine Grati, Achraf Ben-Hamadou, Mohamed Hammami
This paper focuses on designing data-driven models to learn a discriminant representation space for face recognition using RGB-D data. Unlike hand-crafted representations, learned models can extract and organize the discriminant information from the data, and can automatically adapt to build new computer vision applications faster. We proposed an effective way to train Convolutional Neural Networks to learn face patch discriminant features. The proposed solution was tested and validated on state-of-the-art RGB-D datasets and showed competitive and promising results relative to standard hand-crafted feature extractors.
CV | Jul 16, 2016
Construction of extended 3D field of views of the internal bladder wall surface: a proof of concept
Achraf Ben-Hamadou, Christian Daul, Charles Soussen
3D extended fields of view (FOVs) of the internal bladder wall facilitate lesion diagnosis, patient follow-up and treatment traceability. In this paper, we propose a 3D image mosaicing algorithm guided by 2D cystoscopic video-image registration for obtaining textured FOV mosaics. In this feasibility study, the registration makes use of data from a 3D cystoscope prototype providing, in addition to each small FOV image, some 3D points located on the surface. This proof of concept shows that textured surfaces can be constructed with minimally modified cystoscopes. The potential of the method is demonstrated on numerical and real phantoms reproducing various surface shapes. Pig and human bladder textures are superimposed on phantoms with known shape and dimensions. These data allow for quantitative assessment of the 3D mosaicing algorithm based on the registration of images simulating bladder textures.
CV | Apr 29, 2015
Comparative study of image registration techniques for bladder video-endoscopy
Achraf Ben-Hamadou, Charles Soussen, Walter Blondel et al.
Bladder cancer is widespread throughout the world. Many adequate diagnostic techniques exist. Video-endoscopy remains the standard clinical procedure for visual exploration of the internal bladder surface. However, video-endoscopy is limited in that each image covers an area of only about 1 cm², while lesions are typically spread over several images. The aim of this contribution is to assess the performance of two mosaicing algorithms for constructing panoramic maps (one single image) of bladder walls. The quantitative comparison is performed on a set of real endoscopic exam data and on simulated data from a bladder phantom.