CV · May 29 · Code
CoFiDA-M: Concept-Aware Feature Modulation for Cross-Domain Adaptation with Image-Only Inference
Nurjahan Sultana, Moi Hoon Yap, Xinqi Fan et al.
Models for AI-based skin cancer screening suffer a severe performance drop when shifting from expert dermoscopic (source) images to consumer-grade clinical (target) images, hindering real-world deployment. Existing domain adaptation methods often ignore crucial semantic invariants, such as clinical concepts. While new foundation models like MONET can provide this semantic information as dense, probabilistic scores, this metadata is unavailable at test time, creating a deployment paradox for practical image-only screening tools. We address this gap by proposing CoFiDA-M, a privileged information framework that learns from concepts at training time but deploys as an image-only model. Our method trains a teacher network that uses MONET concept probabilities to guide a FiLM modulator, transforming visual features into a semantically "edited" feature space. A lightweight, image-only student is then trained to reproduce this edited representation, not just the teacher's final predictions. This distillation "bakes" the clinical reasoning into the student's weights. On a challenging multi-dataset benchmark, our image-only student significantly outperforms state-of-the-art approaches, especially in melanoma recall. Our work provides a practical and generalizable framework for leveraging noisy, probabilistic metadata as privileged information, demonstrating strong cross-dataset robustness and potential for real-world deployment beyond dermatology. Implementation code is available at: https://github.com/mmu-dermatology-research/CoFiDA.git
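The FiLM step and feature-level distillation described in this abstract can be illustrated with a minimal pure-Python sketch. It assumes the modulator maps concept probabilities to a per-channel scale (gamma) and shift (beta) through a linear layer, and that the student is trained with a mean-squared-error loss on the edited features. The function names, toy weights, and choice of loss are illustrative assumptions, not the authors' implementation.

```python
from typing import List

def linear(weights: List[List[float]], bias: List[float], x: List[float]) -> List[float]:
    """Matrix-vector product plus bias (one weight row per output channel)."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]

def film_modulate(features, concepts, w_gamma, b_gamma, w_beta, b_beta):
    """Feature-wise linear modulation: gamma(c) * f + beta(c), applied per channel."""
    gamma = linear(w_gamma, b_gamma, concepts)
    beta = linear(w_beta, b_beta, concepts)
    return [g * f + b for g, f, b in zip(gamma, features, beta)]

def feature_distillation_loss(student_feats, teacher_feats):
    """Mean squared error between student features and the teacher's edited features."""
    return sum((s - t) ** 2 for s, t in zip(student_feats, teacher_feats)) / len(teacher_feats)

# Toy example: 2 feature channels, 3 concept probabilities (all values hypothetical).
features = [1.0, -2.0]
concepts = [0.5, 0.0, 1.0]
w_gamma = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
b_gamma = [1.0, 0.0]
w_beta = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.5]]
b_beta = [0.0, 0.0]

# Teacher side: concept-conditioned "edited" features.
edited = film_modulate(features, concepts, w_gamma, b_gamma, w_beta, b_beta)

# Student side: an image-only feature vector is pulled towards the edited features.
student_features = [1.0, -1.5]
loss = feature_distillation_loss(student_features, edited)
```

At inference time only the student branch would run, so no concept probabilities are needed.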
CV · Dec 16, 2022
Biomedical image analysis competitions: The state of current participation practice
Matthias Eisenmann, Annika Reinke, Vivienn Weru et al. · utoronto
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of the challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
CV · Apr 25, 2023 · Code
Quantifying the Effect of Image Similarity on Diabetic Foot Ulcer Classification
Imran Chowdhury Dipto, Bill Cassidy, Connah Kendrick et al.
This research conducts an investigation on the effect of visually similar images within a publicly available diabetic foot ulcer dataset when training deep learning classification networks. The presence of binary-identical duplicate images in datasets used to train deep learning algorithms is a well-known issue that can introduce unwanted bias which can degrade network performance. However, the effect of visually similar non-identical images is an under-researched topic, and has so far not been investigated in any diabetic foot ulcer studies. We use an open-source fuzzy algorithm to identify groups of increasingly similar images in the Diabetic Foot Ulcers Challenge 2021 (DFUC2021) training dataset. Based on each similarity threshold, we create new training sets that we use to train a range of deep learning multi-class classifiers. We then evaluate the performance of the best performing model on the DFUC2021 test set. Our findings show that the model trained on the training set with the 80% similarity threshold images removed achieved the best performance using the InceptionResNetV2 network. This model showed improvements in F1-score, precision, and recall of 0.023, 0.029, and 0.013, respectively. These results indicate that highly similar images can contribute towards the presence of performance degrading bias within the Diabetic Foot Ulcers Challenge 2021 dataset, and that the removal of images that are 80% similar from the training set can help to boost classification performance.
IV · Apr 26, 2022
AAU-net: An Adaptive Attention U-net for Breast Lesions Segmentation in Ultrasound Images
Gongping Chen, Yu Dai, Jianxun Zhang et al.
Various deep learning methods have been proposed to segment breast lesions from ultrasound images. However, similar intensity distributions, variable tumor morphology and blurred boundaries present challenges for breast lesion segmentation, especially for malignant tumors with irregular shapes. Considering the complexity of ultrasound images, we develop an adaptive attention U-net (AAU-net) to segment breast lesions automatically and stably from ultrasound images. Specifically, we introduce a hybrid adaptive attention module, which mainly consists of a channel self-attention block and a spatial self-attention block, to replace the traditional convolution operation. Compared with the conventional convolution operation, the design of the hybrid adaptive attention module can help us capture more features under different receptive fields. Different from existing attention mechanisms, the hybrid adaptive attention module can guide the network to adaptively select more robust representations in the channel and spatial dimensions to cope with more complex breast lesion segmentation. Extensive experiments with several state-of-the-art deep learning segmentation methods on three public breast ultrasound datasets show that our method has better performance on breast lesion segmentation. Furthermore, robustness analysis and external experiments demonstrate that our proposed AAU-net has better generalization performance on the segmentation of breast lesions. Moreover, the hybrid adaptive attention module can be flexibly applied to existing network frameworks.
CV · Mar 30, 2023
Why is the winner the best?
Matthias Eisenmann, Annika Reinke, Vivienn Weru et al.
International benchmarking competitions have become fundamental for the comparative performance assessment of image analysis methods. However, little attention has been given to investigating what can be learnt from these competitions. Do they really generate scientific progress? What are common and successful participation strategies? What makes a solution superior to a competing method? To address this gap in the literature, we performed a multi-center study with all 80 competitions that were conducted in the scope of IEEE ISBI 2021 and MICCAI 2021. Statistical analyses performed based on comprehensive descriptions of the submitted algorithms linked to their rank as well as the underlying participation strategies revealed common characteristics of winning solutions. These typically include the use of multi-task learning (63%) and/or multi-stage pipelines (61%), and a focus on augmentation (100%), image preprocessing (97%), data curation (79%), and postprocessing (66%). The "typical" lead of a winning team is a computer scientist with a doctoral degree, five years of experience in biomedical image analysis, and four years of experience in deep learning. Two core general development strategies stood out for highly-ranked teams: the reflection of the metrics in the method design and the focus on analyzing and handling failure cases. According to the organizers, 43% of the winning algorithms exceeded the state of the art but only 11% completely solved the respective domain problem. The insights of our study could help researchers (1) improve algorithm development strategies when approaching new problems, and (2) focus on open research questions revealed by this work.
IV · Apr 22, 2022
Translating Clinical Delineation of Diabetic Foot Ulcers into Machine Interpretable Segmentation
Connah Kendrick, Bill Cassidy, Joseph M. Pappachan et al.
Diabetic foot ulcer is a severe condition that requires close monitoring and management. For training machine learning methods to auto-delineate the ulcer, clinical staff must provide ground truth annotations. In this paper, we propose a new diabetic foot ulcers dataset, namely DFUC2022, the largest segmentation dataset where ulcer regions were manually delineated by clinicians. We assess whether the clinical delineations are machine interpretable by deep learning networks or whether contours refined by image processing should be used instead. By providing benchmark results using a selection of popular deep learning algorithms, we draw new insights into the limitations of DFU wound delineation and report on the associated issues. This paper provides some observations on baseline models to facilitate the DFUC2022 Challenge in conjunction with MICCAI 2022. The leaderboard will be ranked by Dice score, where the best FCN-based method achieved 0.5708 and DeepLabv3+ achieved the best score of 0.6277. This paper demonstrates that using contours refined by image processing as ground truth can provide better agreement with machine predicted results. DFUC2022 will be released on the 27th April 2022.
CV · Apr 24, 2023
Diabetic Foot Ulcer Grand Challenge 2022 Summary
Connah Kendrick, Bill Cassidy, Neil D. Reeves et al.
The Diabetic Foot Ulcer Challenge 2022 focused on the task of diabetic foot ulcer segmentation, based on the work completed in previous DFU challenges. The challenge provided 4000 full-view foot ulcer images together with corresponding delineations of the ulcer regions. This paper provides an overview of the challenge, a summary of the methods proposed by the challenge participants, the results obtained from each technique, and a comparison of the challenge results. The best-performing network was a modified HarDNet-MSEG, with a Dice score of 0.7287.
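For readers unfamiliar with the ranking metric, the Dice score for a single predicted mask can be sketched as below. This is a minimal illustration on nested lists of 0/1 values. Returning 1.0 when both masks are empty is an assumed convention, and how the challenge aggregates scores across images is not specified here.

```python
def dice_score(pred, truth):
    """Dice = 2 * |A intersect B| / (|A| + |B|) for two binary masks of equal shape."""
    intersection = sum(
        p * t
        for pred_row, truth_row in zip(pred, truth)
        for p, t in zip(pred_row, truth_row)
    )
    total = sum(map(sum, pred)) + sum(map(sum, truth))
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total

# Toy 2x3 masks (hypothetical values): 2 overlapping pixels, 3 foreground pixels each.
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]

dice = dice_score(pred, truth)  # 2 * 2 / (3 + 3)
```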
CV · Jun 23, 2023
Dermoscopic Dark Corner Artifacts Removal: Friend or Foe?
Samuel William Pewton, Bill Cassidy, Connah Kendrick et al.
One of the more significant obstacles in classification of skin cancer is the presence of artifacts. This paper investigates the effect of dark corner artifacts, which result from the use of dermoscopes, on the performance of a deep learning binary classification task. Previous research attempted to remove and inpaint dark corner artifacts, with the intention of creating an ideal condition for models. However, such research has been shown to be inconclusive due to the lack of available datasets labelled with dark corner artifacts and of detailed analysis and discussion. To address these issues, we label 10,250 skin lesion images from publicly available datasets and introduce a balanced dataset with an equal number of melanoma and non-melanoma cases. The training set comprises 6126 images without artifacts, and the testing set comprises 4124 images with dark corner artifacts. We conduct three experiments to provide new understanding on the effects of dark corner artifacts, including inpainted and synthetically generated examples, on a deep learning method. Our results suggest that superimposing synthetic dark corner artifacts onto the training set improved model performance, particularly in terms of the true negative rate. This indicates that the deep learning model learnt to ignore dark corner artifacts, rather than treating them as melanoma, when dark corner artifacts were introduced into the training set. Further, we propose a new approach to quantifying heatmaps indicating network focus using a root mean square measure of the brightness intensity in the different regions of the heatmaps. This paper provides a new guideline for skin lesion analysis with an emphasis on reproducibility.
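The root mean square measure mentioned in this abstract can be illustrated with a small sketch. It assumes a heatmap stored as a 2D list of brightness values and rectangular regions given by row and column ranges. The region layout and the function name `region_rms` are hypothetical, and the paper's exact region definitions may differ.

```python
import math

def region_rms(heatmap, rows, cols):
    """Root mean square of the brightness values inside a rectangular region."""
    values = [heatmap[r][c] for r in rows for c in cols]
    return math.sqrt(sum(v * v for v in values) / len(values))

# Toy 4x4 heatmap (hypothetical values).
heatmap = [
    [3.0, 4.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
]

# Compare network focus in a corner region against another region.
top_left = region_rms(heatmap, range(0, 2), range(0, 2))
bottom_right = region_rms(heatmap, range(2, 4), range(2, 4))
```

A higher value for a corner region than for the lesion region would suggest that the network is attending to the artifact rather than the lesion.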
CV · Feb 18
SemCovNet: Towards Fair and Semantic Coverage-Aware Learning for Underrepresented Visual Concepts
Sakib Ahammed, Xia Cui, Xinqi Fan et al.
Modern vision models increasingly rely on rich semantic representations that extend beyond class labels to include descriptive concepts and contextual attributes. However, existing datasets exhibit Semantic Coverage Imbalance (SCI), a previously overlooked bias arising from the long-tailed semantic representations. Unlike class imbalance, SCI occurs at the semantic level, affecting how models learn and reason about rare yet meaningful semantics. To mitigate SCI, we propose Semantic Coverage-Aware Network (SemCovNet), a novel model that explicitly learns to correct semantic coverage disparities. SemCovNet integrates a Semantic Descriptor Map (SDM) for learning semantic representations, a Descriptor Attention Modulation (DAM) module that dynamically weights visual and concept features, and a Descriptor-Visual Alignment (DVA) loss that aligns visual features with descriptor semantics. We quantify semantic fairness using a Coverage Disparity Index (CDI), which measures the alignment between coverage and error. Extensive experiments across multiple datasets demonstrate that SemCovNet enhances model reliability and substantially reduces CDI, achieving fairer and more equitable performance. This work establishes SCI as a measurable and correctable bias, providing a foundation for advancing semantic fairness and interpretable vision learning.
IV · Mar 7, 2025 · Code
Gaussian Random Fields as an Abstract Representation of Patient Metadata for Multimodal Medical Image Segmentation
Bill Cassidy, Christian McBride, Connah Kendrick et al.
The growing rate of chronic wound occurrence, especially in patients with diabetes, has become a concerning trend in recent years. Chronic wounds are difficult and costly to treat, and have become a serious burden on health care systems worldwide. Chronic wounds can have devastating consequences for the patient, with infection often leading to reduced quality of life and increased mortality risk. Innovative deep learning methods for the detection and monitoring of such wounds have the potential to reduce the impact to both patient and clinician. We present a novel multimodal segmentation method which allows for the introduction of patient metadata into the training workflow whereby the patient data are expressed as Gaussian random fields. Our results indicate that the proposed method improved performance when utilising multiple models, each trained on different metadata categories. Using the Diabetic Foot Ulcer Challenge 2022 test set, when compared to the baseline results (intersection over union = 0.4670, Dice similarity coefficient = 0.5908) we demonstrate improvements of +0.0220 and +0.0229 for intersection over union and Dice similarity coefficient respectively. This paper presents the first study to focus on integrating patient data into a chronic wound segmentation workflow. Our results show significant performance gains when training individual models using specific metadata categories, followed by average merging of prediction masks using distance transforms. All source code for this study is available at: https://github.com/mmu-dermatology-research/multimodal-grf
CV · Oct 18, 2021 · Code
FacialGAN: Style Transfer and Attribute Manipulation on Synthetic Faces
Ricard Durall, Jireh Jam, Dominik Strassel et al.
Facial image manipulation is a generation task where the output face is shifted towards an intended target direction in terms of facial attributes and styles. Recent works have achieved great success in various editing techniques such as style transfer and attribute translation. However, current approaches focus either on pure style transfer, or on the translation of predefined sets of attributes with restricted interactivity. To address this issue, we propose FacialGAN, a novel framework enabling simultaneous rich style transfers and interactive facial attributes manipulation. While preserving the identity of a source image, we transfer the diverse styles of a target image to the source image. We then incorporate the geometry information of a segmentation mask to provide a fine-grained manipulation of facial attributes. Finally, a multi-objective learning strategy is introduced to optimize the loss of each specific task. Experiments on the CelebA-HQ dataset, with CelebAMask-HQ as semantic mask labels, show our model's capacity in producing visually compelling results in style transfer, attribute manipulation, diversity and face verification. For reproducibility, we provide an interactive open-source tool to perform facial manipulations, and the PyTorch implementation of the model.
IV · Nov 16, 2020 · Code
Deep learning in magnetic resonance prostate segmentation: A review and a new perspective
David Gillespie, Connah Kendrick, Ian Boon et al.
Prostate radiotherapy is a well-established curative oncology modality, which in future will use Magnetic Resonance Imaging (MRI)-based radiotherapy for daily adaptive radiotherapy target definition. However, accurately delineating the prostate from MRI data is a time-consuming process. Deep learning has been identified as a potential new technology for the delivery of precision radiotherapy in prostate cancer, where accurate prostate segmentation helps in cancer detection and therapy. However, the trained models can be limited in their application to clinical settings due to different acquisition protocols and the limited number of publicly available datasets, which are relatively small. Therefore, to explore the field of prostate segmentation and to discover a generalisable solution, we review the state-of-the-art deep learning algorithms in MR prostate segmentation; provide insights to the field by discussing their limitations and strengths; and propose an optimised 2D U-Net for MR prostate segmentation. We evaluate the performance on four publicly available datasets using the Dice Similarity Coefficient (DSC) as the performance metric. Our experiments include within-dataset evaluation and cross-dataset evaluation. The best result is achieved by composite evaluation (DSC of 0.9427 on Decathlon test set) and the poorest result is achieved by cross-dataset evaluation (DSC of 0.5892, Prostate X training set, Promise 12 testing set). We outline the challenges and provide recommendations for future work. Our research provides a new perspective to MR prostate segmentation and more importantly, we provide standardised experiment settings for researchers to evaluate their algorithms. Our code is available at https://github.com/AIEMMU/MRI_Prostate.
CV · Dec 18, 2019 · Code
Spotting Macro- and Micro-expression Intervals in Long Video Sequences
Ying He, Su-Jing Wang, Jingting Li et al.
This paper presents baseline results for the Third Facial Micro-Expression Grand Challenge (MEGC 2020). Both macro- and micro-expression intervals in CAS(ME)$^2$ and SAMM Long Videos are spotted by employing the method of Main Directional Maximal Difference Analysis (MDMD). The MDMD method uses the magnitude maximal difference in the main direction of optical flow features to spot facial movements. The single-frame prediction results of the original MDMD method are post-processed into reasonable video intervals. The metric F1-scores of baseline results are evaluated: for CAS(ME)$^2$, the F1-scores are 0.1196 and 0.0082 for macro- and micro-expressions respectively, and the overall F1-score is 0.0376; for SAMM Long Videos, the F1-scores are 0.0629 and 0.0364 for macro- and micro-expressions respectively, and the overall F1-score is 0.0445. The baseline project codes are publicly available at https://github.com/HeyingGithub/Baseline-project-for-MEGC2020_spotting.
CV · Jun 18, 2025
MEGC2025: Micro-Expression Grand Challenge on Spot Then Recognize and Visual Question Answering
Xinqi Fan, Jingting Li, John See et al.
Facial micro-expressions (MEs) are involuntary movements of the face that occur spontaneously when a person experiences an emotion but attempts to suppress or repress the facial expression, typically found in a high-stakes environment. In recent years, substantial advancements have been made in the areas of ME recognition, spotting, and generation. However, conventional approaches that treat spotting and recognition as separate tasks are suboptimal, particularly for analyzing long-duration videos in realistic settings. Concurrently, the emergence of multimodal large language models (MLLMs) and large vision-language models (LVLMs) offers promising new avenues for enhancing ME analysis through their powerful multimodal reasoning capabilities. The ME grand challenge (MEGC) 2025 introduces two tasks that reflect these evolving research directions: (1) ME spot-then-recognize (ME-STR), which integrates ME spotting and subsequent recognition in a unified sequential pipeline; and (2) ME visual question answering (ME-VQA), which explores ME understanding through visual question answering, leveraging MLLMs or LVLMs to address diverse question types related to MEs. All participating algorithms are required to run on this test set and submit their results on a leaderboard. More details are available at https://megc2025.github.io.
CV · May 1, 2023
Venn Diagram Multi-label Class Interpretation of Diabetic Foot Ulcer with Color and Sharpness Enhancement
Md Mahamudul Hasan, Moi Hoon Yap, Md Kamrul Hasan
DFU is a severe complication of diabetes that can lead to amputation of the lower limb if not treated properly. Inspired by the 2021 Diabetic Foot Ulcer Grand Challenge, researchers designed automated multi-class classification of DFU, including infection, ischaemia, both of these conditions, and none of these conditions. However, it remains a challenge as classification accuracy is still not satisfactory. This paper proposes a Venn diagram interpretation of a multi-label CNN-based method, utilizing different image enhancement strategies, to improve multi-class DFU classification. We propose to reduce the four classes to two, since "both" class wounds can be interpreted as the simultaneous occurrence of infection and ischaemia, and "none" class wounds as the absence of infection and ischaemia. We introduce a novel Venn diagram representation block in the classifier to interpret all four classes from these two classes. To make our model more resilient, we propose enhancing the perceptual quality of DFU images, particularly blurry or inconsistently lit DFU images, by performing color and sharpness enhancements on them. We also employ a fine-tuned optimization technique, adaptive sharpness-aware minimization, to improve the generalization performance of the CNN model. The proposed method is evaluated on the test dataset of DFUC2021, containing 5,734 images, and the results are compared with the top-3 winning entries of DFUC2021. Our proposed approach outperforms these existing approaches and achieves Macro-Average F1, Recall and Precision scores of 0.6592, 0.6593, and 0.6652, respectively. Additionally, we perform ablation studies and image quality measurements to further interpret our proposed method. This proposed method will benefit patients with DFUs since it tackles the inconsistencies in captured images and can be employed for a more robust remote DFU wound classification.
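The reduction from four classes to two labels can be sketched as a simple decoding rule. This is a minimal illustration that assumes the multi-label classifier outputs independent probabilities for infection and ischaemia and that a hard threshold of 0.5 is applied. The threshold, the function name `venn_class`, and the decoding rule are assumptions; the paper's Venn diagram representation block may combine the two outputs differently.

```python
def venn_class(p_infection: float, p_ischaemia: float, threshold: float = 0.5) -> str:
    """Decode two label probabilities into one of the four DFUC2021 classes."""
    infection = p_infection >= threshold
    ischaemia = p_ischaemia >= threshold
    if infection and ischaemia:
        return "both"        # overlap of the two sets
    if infection:
        return "infection"   # infection only
    if ischaemia:
        return "ischaemia"   # ischaemia only
    return "none"            # outside both sets

# Hypothetical probabilities for four wounds.
predictions = [venn_class(0.9, 0.8), venn_class(0.7, 0.1),
               venn_class(0.2, 0.6), venn_class(0.1, 0.3)]
```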
CV · Jan 2, 2022
V-LinkNet: Learning Contextual Inpainting Across Latent Space of Generative Adversarial Network
Jireh Jam, Connah Kendrick, Vincent Drouard et al.
Image inpainting is a key image processing technique for predicting missing regions and generating realistic images. Despite the advancement of existing generative inpainting models with feature extraction, propagation and reconstruction capabilities, there is a lack of high-quality feature extraction and transfer mechanisms in deeper layers to tackle persistent aberrations in the generated inpainted regions. Our method, V-LinkNet, develops high-level feature transference to the deep-level textural context of inpainted regions. We propose a novel technique that combines the learning of the encoders through a recursive residual transition layer (RSTL). The RSTL layer easily adapts dual encoders by increasing the unique semantic information through direct communication. By combining the dual-encoder structure with a contextualised feature representation loss function, our system gains the ability to inpaint with high-level features. To reduce biases from random mask-image pairing, we introduce a standard protocol with paired mask-image on the testing set of CelebA-HQ, Paris Street View and Places2 datasets. Our results show V-LinkNet performed better on CelebA-HQ and Paris Street View using this standard protocol. We will share the standard protocol and our codes with the research community upon acceptance of this paper.
IV · Jan 1, 2022
Development of Diabetic Foot Ulcer Datasets: An Overview
Moi Hoon Yap, Connah Kendrick, Neil D. Reeves et al.
This paper provides the conceptual foundation and procedures used in the development of diabetic foot ulcer datasets over the past decade, with a timeline to demonstrate progress. We conduct a survey on data capturing methods for foot photographs, and give an overview of research in developing private and public datasets, the related computer vision tasks (detection, segmentation and classification), the diabetic foot ulcer challenges and the future direction of the development of the datasets. We report the distribution of dataset users by country and year. Our aim is to share the technical challenges that we encountered together with good practices in dataset development, and provide motivation for other researchers to participate in data sharing in this domain.
IV · Nov 19, 2021
Diabetic Foot Ulcer Grand Challenge 2021: Evaluation and Summary
Bill Cassidy, Connah Kendrick, Neil D. Reeves et al.
Diabetic foot ulcer classification systems use the presence of wound infection (bacteria present within the wound) and ischaemia (restricted blood supply) as vital clinical indicators for treatment and prediction of wound healing. Studies investigating the use of automated computerised methods of classifying infection and ischaemia within diabetic foot wounds are limited due to a paucity of publicly available datasets and severe data imbalance in those few that exist. The Diabetic Foot Ulcer Challenge 2021 provided participants with a more substantial dataset comprising a total of 15,683 diabetic foot ulcer patches, with 5,955 used for training, 5,734 used for testing and an additional 3,994 unlabelled patches to promote the development of semi-supervised and weakly-supervised deep learning techniques. This paper provides an evaluation of the methods used in the Diabetic Foot Ulcer Challenge 2021, and summarises the results obtained from each network. The best performing network was an ensemble of the results of the top 3 models, with a macro-average F1-score of 0.6307.
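The macro-average F1-score reported here weights every class equally, which matters when classes are imbalanced. A minimal sketch, assuming the common definition as the unweighted mean of per-class F1 scores (the challenge's exact evaluation script is not reproduced here, and the labels below are hypothetical):

```python
def macro_f1(y_true, y_pred, classes):
    """Unweighted mean of per-class F1 = 2*TP / (2*TP + FP + FN)."""
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # A class with no true or predicted samples contributes 0.0 (assumed convention).
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

classes = ["none", "infection", "ischaemia", "both"]
y_true = ["none", "infection", "ischaemia", "both", "none", "infection"]
y_pred = ["none", "infection", "none", "both", "infection", "infection"]

# Per-class F1: none 0.5, infection 0.8, ischaemia 0.0, both 1.0
score = macro_f1(y_true, y_pred, classes)
```

In this toy example the single missed ischaemia case pulls the macro average down to 0.575 even though four of six predictions are correct.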
LG · May 17, 2021
A Cloud-based Deep Learning Framework for Remote Detection of Diabetic Foot Ulcers
Bill Cassidy, Neil D. Reeves, Joseph M. Pappachan et al.
This research proposes a mobile and cloud-based framework for the automatic detection of diabetic foot ulcers and conducts an investigation of its performance. The system uses a cross-platform mobile framework which enables the deployment of mobile apps to multiple platforms using a single TypeScript code base. A deep convolutional neural network was deployed to a cloud-based platform where the mobile app could send photographs of patients' feet for inference to detect the presence of diabetic foot ulcers. The functionality and usability of the system were tested in two clinical settings: Salford Royal NHS Foundation Trust and Lancashire Teaching Hospitals NHS Foundation Trust. The benefits of the system, such as the potential use of the app by patients to identify and monitor their condition, are discussed.
CV · May 13, 2021
3D-CNN for Facial Micro- and Macro-expression Spotting on Long Video Sequences using Temporal Oriented Reference Frame
Chuin Hong Yap, Moi Hoon Yap, Adrian K. Davison et al.
Facial expression spotting is the preliminary step for micro- and macro-expression analysis. The task of reliably spotting such expressions in video sequences is currently unsolved. The current best systems depend upon optical flow methods to extract regional motion features, before categorisation of that motion into a specific class of facial movement. Optical flow is susceptible to drift error, which introduces a serious problem for motions with long-term dependencies, such as high frame-rate macro-expression. We propose a purely deep learning solution which, rather than tracking frame differential motion, compares each frame with two temporally local reference frames via a convolutional model. Reference frames are sampled according to calculated micro- and macro-expression duration. As a baseline for MEGC2021 using the leave-one-subject-out evaluation method, we show that our solution achieves an F1-score of 0.105 on a high frame-rate (200 fps) SAMM long videos dataset (SAMM-LV) and is competitive on a low frame-rate (30 fps) dataset (CAS(ME)$^2$). On the unseen MEGC2022 challenge dataset, the baseline results are 0.1176 on the SAMM Challenge dataset, 0.1739 on CAS(ME)$^3$ and an overall performance of 0.1531 across both datasets.
CV · May 7, 2021
Foreground-guided Facial Inpainting with Fidelity Preservation
Jireh Jam, Connah Kendrick, Vincent Drouard et al.
Facial image inpainting, with high-fidelity preservation for image realism, is a very challenging task. This is due to the subtle texture in key facial features (components) that are not easily transferable. Many image inpainting techniques have been proposed with outstanding capabilities and high quantitative performances recorded. However, with facial inpainting, the features are more conspicuous and the visual quality of the blended inpainted regions is more important qualitatively. Based on these facts, we design a foreground-guided facial inpainting framework that can extract and generate facial features using convolutional neural network layers. It introduces the use of foreground segmentation masks to preserve the fidelity. Specifically, we propose a new loss function with semantic capability reasoning of facial expressions, natural and unnatural features (make-up). We conduct our experiments using the CelebA-HQ dataset, segmentation masks from CelebAMask-HQ (for foreground guidance) and Quick Draw Mask (for missing regions). Our proposed method achieved comparable quantitative results when compared to the state of the art but qualitatively, it demonstrated high-fidelity preservation of facial components.
CV · Apr 7, 2021
Analysis Towards Classification of Infection and Ischaemia of Diabetic Foot Ulcers
Moi Hoon Yap, Bill Cassidy, Joseph M. Pappachan et al.
This paper introduces the Diabetic Foot Ulcers dataset (DFUC2021) for analysis of pathology, focusing on infection and ischaemia. We describe the data preparation of DFUC2021 for ground truth annotation, data curation and data analysis. The final release of DFUC2021 consists of 15,683 DFU patches, with 5,955 training, 5,734 for testing and 3,994 unlabeled DFU patches. The ground truth labels are four classes, i.e. control, infection, ischaemia and both conditions. We curate the dataset using image hashing techniques and analyse the separability using UMAP projection. We benchmark the performance of five key backbones of deep learning, i.e. VGG16, ResNet101, InceptionV3, DenseNet121 and EfficientNet on DFUC2021. We report the optimised results of these key backbones with different strategies. Based on our observations, we conclude that EfficientNetB0 with data augmentation and transfer learning provided the best results for multi-class (4-class) classification with macro-average Precision, Recall and F1-score of 0.57, 0.62 and 0.55, respectively. In ischaemia and infection recognition, when trained on one-versus-all, EfficientNetB0 achieved comparable results with the state of the art. Finally, we interpret the results with statistical analysis and Grad-CAM visualisation.
CVOct 7, 2020
Deep Learning in Diabetic Foot Ulcers Detection: A Comprehensive EvaluationMoi Hoon Yap, Ryo Hachiuma, Azadeh Alavi et al.
There has been a substantial amount of research involving computer methods and technology for the detection and recognition of diabetic foot ulcers (DFUs), but there is a lack of systematic comparisons of state-of-the-art deep learning object detection frameworks applied to this problem. DFUC2020 provided participants with a comprehensive dataset consisting of 2,000 images for training and 2,000 images for testing. This paper summarises the results of DFUC2020 by comparing the deep learning-based algorithms proposed by the winning teams: Faster R-CNN, three variants of Faster R-CNN and an ensemble method; YOLOv3; YOLOv5; EfficientDet; and a new Cascade Attention Network. For each deep learning method, we provide a detailed description of model architecture, parameter settings for training and additional stages including pre-processing, data augmentation and post-processing. We provide a comprehensive evaluation for each method. All the methods required a data augmentation stage to increase the number of images available for training and a post-processing stage to remove false positives. The best performance was obtained from Deformable Convolution, a variant of Faster R-CNN, with a mean average precision (mAP) of 0.6940 and an F1-Score of 0.7434. Finally, we demonstrate that the ensemble method based on different deep learning methods can enhance the F1-Score but not the mAP.
CVAug 11, 2020
R-MNet: A Perceptual Adversarial Network for Image InpaintingJireh Jam, Connah Kendrick, Vincent Drouard et al.
Facial image inpainting is a problem that is widely studied, and in recent years the introduction of Generative Adversarial Networks has led to improvements in the field. Unfortunately, some issues persist, in particular when blending the missing pixels with the visible ones. We address the problem by proposing a Wasserstein GAN combined with a new reverse mask operator, namely Reverse Masking Network (R-MNet), a perceptual adversarial network for image inpainting. The reverse mask operator transfers the reverse masked image to the end of the encoder-decoder network leaving only valid pixels to be inpainted. Additionally, we propose a new loss function computed in feature space to target only valid pixels combined with adversarial training. These then capture data distributions and generate images similar to those in the training data with achieved realism (realistic and coherent) on the output images. We evaluate our method on publicly available datasets, and compare with state-of-the-art methods. We show that our method is able to generalize to high-resolution inpainting tasks, and further show more realistic outputs that are plausible to the human visual system when compared with the state-of-the-art methods.
CVApr 24, 2020
DFUC2020: Analysis Towards Diabetic Foot Ulcer DetectionBill Cassidy, Neil D. Reeves, Pappachan Joseph et al.
Every 20 seconds, a limb is amputated somewhere in the world due to diabetes. This is a global health problem that requires a global solution. The MICCAI challenge discussed in this paper, which concerns the automated detection of diabetic foot ulcers using machine learning techniques, will accelerate the development of innovative healthcare technology to address this unmet medical need. In an effort to improve patient care and reduce the strain on healthcare systems, recent research has focused on the creation of cloud-based detection algorithms. These can be consumed as a service by a mobile app that patients (or a carer, partner or family member) could use themselves at home to monitor their condition and to detect the appearance of a diabetic foot ulcer (DFU). Collaborative work between Manchester Metropolitan University, Lancashire Teaching Hospital and the Manchester University NHS Foundation Trust has created a repository of 4,000 DFU images for the purpose of supporting research toward more advanced methods of DFU detection. Based on a joint effort involving the lead scientists of the UK, US, India and New Zealand, this challenge will solicit original work, and promote interactions between researchers and interdisciplinary collaborations. This paper presents a dataset description and analysis, assessment methods, benchmark algorithms and initial evaluation results. It facilitates the challenge by providing useful insights into state-of-the-art and ongoing research. This grand challenge takes on even greater urgency in a peri and post-pandemic period, where stresses on resource utilization will increase the need for technology that allows people to remain active, healthy and intact in their home.
IVMar 6, 2020
Anysize GAN: A solution to the image-warping problemConnah Kendrick, David Gillespie, Moi Hoon Yap
We propose a new type of Generative Adversarial Network (GAN) to resolve a common issue with Deep Learning. We develop a novel architecture that can be applied to existing latent vector based GAN structures that allows them to generate on-the-fly images of any size. Existing GANs for image generation require uniform images of matching dimensions. However, publicly available datasets, such as ImageNet, contain images of thousands of different sizes. Resizing images causes deformations and changes the image data, whereas our network does not require this preprocessing step. We make significant changes to the standard data loading techniques to enable any size image to be loaded for training. We also modify the network in two ways, by adding multiple inputs and a novel dynamic resizing layer. Finally, we make adjustments to the discriminator to work on multiple resolutions. These changes can allow multiple resolution datasets to be trained on without any resizing, if memory allows. We validate our results on the ISIC 2019 skin lesion dataset. We demonstrate our method can successfully generate realistic images at different sizes without issue, preserving and understanding spatial relationships, while maintaining feature relationships. We will release the source codes upon paper acceptance.
CVJan 11, 2020
Symmetric Skip Connection Wasserstein GAN for High-Resolution Facial Image InpaintingJireh Jam, Connah Kendrick, Vincent Drouard et al.
The state-of-the-art facial image inpainting methods achieved promising results but face realism preservation remains a challenge. This is due to limitations such as failures in preserving edges and blurry artefacts. To overcome these limitations, we propose a Symmetric Skip Connection Wasserstein Generative Adversarial Network (S-WGAN) for high-resolution facial image inpainting. The architecture is an encoder-decoder with convolutional blocks, linked by skip connections. The encoder is a feature extractor that captures data abstractions of an input image to learn an end-to-end mapping from an input (binary masked image) to the ground-truth. The decoder uses learned abstractions to reconstruct the image. With skip connections, S-WGAN transfers image details to the decoder. Additionally, we propose a Wasserstein-Perceptual loss function to preserve colour and maintain realism on a reconstructed image. We evaluate our method and the state-of-the-art methods on the CelebA-HQ dataset. Our results show S-WGAN produces sharper and more realistic images when visually compared with other methods. The quantitative measures show our proposed S-WGAN achieves the best Structural Similarity Index Measure (SSIM) of 0.94.
CVNov 4, 2019
SAMM Long Videos: A Spontaneous Facial Micro- and Macro-Expressions DatasetChuin Hong Yap, Connah Kendrick, Moi Hoon Yap
With the growth of popularity of facial micro-expressions in recent years, the demand for long videos with micro- and macro-expressions remains high. Extended from SAMM, a micro-expressions dataset released in 2016, this paper presents SAMM Long Videos dataset for spontaneous micro- and macro-expressions recognition and spotting. SAMM Long Videos dataset consists of 147 long videos with 343 macro-expressions and 159 micro-expressions. The dataset is FACS-coded with detailed Action Units (AUs). We compare our dataset with Chinese Academy of Sciences Macro-Expressions and Micro-Expressions (CAS(ME)2) dataset, which is the only available fully annotated dataset with micro- and macro-expressions. Furthermore, we preprocess the long videos using OpenFace, which includes face alignment and detection of facial AUs. We conduct facial expression spotting using this dataset and compare it with the baseline of MEGC III. Our spotting method outperformed the baseline result with F1-score of 0.3299.
IVAug 14, 2019
Recognition of Ischaemia and Infection in Diabetic Foot Ulcers: Dataset and TechniquesManu Goyal, Neil Reeves, Satyan Rajbhandari et al.
Recognition and analysis of Diabetic Foot Ulcers (DFU) using computerized methods is an emerging research area with the evolution of image-based machine learning algorithms. Existing research using visual computerized methods mainly focuses on recognition, detection, and segmentation of the visual appearance of the DFU as well as tissue classification. According to DFU medical classification systems, the presence of infection (bacteria in the wound) and ischaemia (inadequate blood supply) has important clinical implications for DFU assessment, which are used to predict the risk of amputation. In this work, we propose a new dataset and computer vision techniques to identify the presence of infection and ischaemia in DFU. This is the first time a DFU dataset with ground truth labels of ischaemia and infection cases is introduced for research purposes. For the handcrafted machine learning approach, we propose a new feature descriptor, namely the Superpixel Color Descriptor. Then we use the Ensemble Convolutional Neural Network (CNN) model for more effective recognition of ischaemia and infection. We propose to use a natural data-augmentation method, which identifies the region of interest on foot images and focuses on finding the salient features existing in this area. Finally, we evaluate the performance of our proposed techniques on binary classification, i.e. ischaemia versus non-ischaemia and infection versus non-infection. Overall, our method performed better in the classification of ischaemia than infection. We found that our proposed Ensemble CNN deep learning algorithms performed better for both classification tasks as compared to handcrafted machine learning algorithms, with 90% accuracy in ischaemia classification and 73% in infection classification.
IVFeb 2, 2019
Automatic Lesion Boundary Segmentation in Dermoscopic Images with Ensemble Deep Learning MethodsManu Goyal, Amanda Oakley, Priyanka Bansal et al.
Early detection of skin cancer, particularly melanoma, is crucial to enable advanced treatment. Due to the rapid growth in the numbers of skin cancers, there is a growing need for computerized analysis of skin lesions. The state-of-the-art publicly available datasets for skin lesions are often accompanied by a very limited amount of segmentation ground truth labeling, as it is laborious and expensive. Lesion boundary segmentation is vital to locate the lesion accurately in dermoscopic images and for the diagnosis of different skin lesion types. In this work, we propose the use of fully automated deep learning ensemble methods for accurate lesion boundary segmentation in dermoscopic images. We trained the Mask-RCNN and DeepLabv3+ methods on the ISIC-2017 segmentation training set and evaluated the performance of the ensemble networks on the ISIC-2017 testing set. Our results showed that the best proposed ensemble method segmented the skin lesions with a Jaccard index of 79.58% for the ISIC-2017 testing set. The proposed ensemble method outperformed FrCN, FCN, U-Net, and SegNet in Jaccard Index by 2.48%, 7.42%, 17.95%, and 9.96% respectively. Furthermore, the proposed ensemble method achieved an accuracy of 95.6% for some representative clinically benign cases, 90.78% for the melanoma cases, and 91.29% for the seborrheic keratosis cases on the ISIC-2017 testing set, exhibiting better performance than FrCN, FCN, U-Net, and SegNet.
CVDec 26, 2018
Spotting Micro-Expressions on Long Videos SequencesJingting Li, Catherine Soladie, Renaud Séguier et al.
This paper presents baseline results for the first Micro-Expression Spotting Challenge 2019 by evaluating local temporal pattern (LTP) on SAMM and CAS(ME)2. The proposed LTP patterns are extracted by applying PCA in a temporal window on several facial local regions. The micro-expression sequences are then spotted by a local classification of LTP and a global fusion. The performance is evaluated by Leave-One-Subject-Out cross-validation. Furthermore, we define the criteria for determining true positives in one video by overlap rate and set the F1-score as the metric for spotting performance on the whole database. The baseline F1-scores for SAMM and CAS(ME)2 are 0.0316 and 0.0179, respectively.
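As a rough illustration of an overlap-rate criterion for true positives, the sketch below matches spotted intervals to ground-truth intervals by intersection over union and then computes an F1-score. The abstract does not state the exact rule, so the IoU form, the 0.5 threshold and the names `overlap_rate` and `spotting_f1` are assumptions made for this example, not the challenge's official protocol.

```python
def overlap_rate(a, b):
    """Intersection over union of two inclusive frame intervals (onset, offset)."""
    intersection = min(a[1], b[1]) - max(a[0], b[0]) + 1
    if intersection <= 0:
        return 0.0
    union = (a[1] - a[0] + 1) + (b[1] - b[0] + 1) - intersection
    return intersection / union


def spotting_f1(ground_truth, spotted, threshold=0.5):
    """F1-score where a spotted interval is a true positive if it overlaps
    an unmatched ground-truth interval by at least `threshold` (assumed 0.5)."""
    matched = set()
    true_positives = 0
    for s in spotted:
        for i, g in enumerate(ground_truth):
            if i not in matched and overlap_rate(s, g) >= threshold:
                matched.add(i)  # each ground-truth interval is matched once
                true_positives += 1
                break
    false_positives = len(spotted) - true_positives
    false_negatives = len(ground_truth) - true_positives
    denominator = 2 * true_positives + false_positives + false_negatives
    return 2 * true_positives / denominator if denominator else 0.0


# Hypothetical intervals: one correct spot, one false alarm, one miss.
score = spotting_f1([(10, 20), (50, 60)], [(12, 22), (100, 110)])
```

To score a whole database rather than one video, the true-positive, false-positive and false-negative counts would be pooled across videos before the final division.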
CVJul 27, 2018
Deep Learning Methods and Applications for Region of Interest Detection in Dermoscopic ImagesManu Goyal, Moi Hoon Yap, Saeed Hassanpour
Rapid growth in the development of medical imaging analysis technology has been propelled by the great interest in improving computer-aided diagnosis and detection (CAD) systems for three popular image visualization tasks: classification, segmentation, and Region of Interest (ROI) detection. However, a limited number of datasets with ground truth annotations are available for developing segmentation and ROI detection of lesions, as expert annotations are laborious and expensive. Detecting the ROI is vital to locate lesions accurately. In this paper, we propose the use of two deep object detection meta-architectures (Faster R-CNN Inception-V2 and SSD Inception-V2) to develop robust ROI detection of skin lesions in dermoscopic datasets (2017 ISIC Challenge, PH2, and HAM10000), and compare the performance with a state-of-the-art segmentation algorithm (DeeplabV3+). To further demonstrate the potential of our work, we built a smartphone application for real-time automated detection of skin lesions based on this methodology. In addition, we developed an automated natural data-augmentation method from ROI detection to produce augmented copies of dermoscopic images, as a pre-processing step in the segmentation of skin lesions to further improve the performance of the current state-of-the-art deep learning algorithm. Our proposed ROI detection has the potential to more appropriately streamline dermatology referrals and reduce unnecessary biopsies in the diagnosis of skin cancer.
CVJul 24, 2018
Multi-Class Lesion Diagnosis with Pixel-wise Classification NetworkManu Goyal, Jiahua Ng, Moi Hoon Yap
Lesion diagnosis of skin lesions is a very challenging task due to high inter-class similarities and intra-class variations in terms of color, size, site and appearance among different skin lesions. With the emergence of computer vision, especially deep learning algorithms, lesion diagnosis is made possible using these algorithms trained on dermoscopic images. Usually, deep classification networks are used for the lesion diagnosis to determine different types of skin lesions. In this work, we used a pixel-wise classification network to provide lesion diagnosis rather than a classification network. We propose to use DeeplabV3+ for multi-class lesion diagnosis in dermoscopic images of Task 3 of ISIC Challenge 2018. We used various post-processing methods with DeeplabV3+ to determine the lesion diagnosis in this challenge and submitted the test results.
CVMay 7, 2018
A Review on Facial Micro-Expressions Analysis: Datasets, Features and MetricsWalied Merghani, Adrian K. Davison, Moi Hoon Yap
Facial micro-expressions are very brief, spontaneous facial expressions that appear on the face of humans when they either deliberately or unconsciously conceal an emotion. Micro-expressions have a shorter duration than macro-expressions, which makes them more challenging for humans and machines. Over the past ten years, automatic micro-expressions recognition has attracted increasing attention from researchers in psychology, computer science, security, neuroscience and other related disciplines. The aim of this paper is to provide insights into automatic micro-expressions analysis and recommendations for future research. Many datasets have been released over the last decade that facilitated the rapid growth in this field. However, comparison across different datasets is difficult due to the inconsistency in experiment protocol, features used and evaluation methods. To address these issues, we review the datasets, features and the performance metrics deployed in the literature. Relevant challenges such as the spatial temporal settings during data collection, emotional classes versus objective classes in data labelling, face regions in data analysis, standardisation of metrics and the requirements for real-world implementation are discussed. We conclude by proposing some promising future directions for advancing micro-expressions research.
CVJan 1, 2018
Semantic Segmentation of Human Thigh Quadriceps Muscle in Magnetic Resonance ImagesEzak Ahmad, Manu Goyal, Jamie S. McPhee et al.
This paper presents an end-to-end solution for MRI thigh quadriceps segmentation. This is the first attempt to use deep learning methods for the MRI thigh segmentation task. We use the state-of-the-art Fully Convolutional Networks with a transfer learning approach for the semantic segmentation of regions of interest in MRI thigh scans. To further improve the performance of the segmentation, we propose a post-processing technique using basic image processing methods. With our proposed method, we have established a new benchmark for MRI thigh quadriceps segmentation with a mean Jaccard Similarity Index of 0.9502 and a processing time of 0.117 seconds per image.
CVNov 28, 2017
Multi-class Semantic Segmentation of Skin Lesions via Fully Convolutional NetworksManu Goyal, Moi Hoon Yap, Saeed Hassanpour
Melanoma is clinically difficult to distinguish from common benign skin lesions, particularly melanocytic naevus and seborrhoeic keratosis. The dermoscopic appearance of these lesions has huge intra-class variations and high inter-class visual similarities. Most current research is focusing on single-class segmentation irrespective of classes of skin lesions. In this work, we evaluate the performance of deep learning on multi-class segmentation of the ISIC-2017 challenge dataset, which consists of 2,750 dermoscopic images. We propose an end-to-end solution using fully convolutional networks (FCNs) for multi-class semantic segmentation to automatically segment the melanoma, seborrhoeic keratosis and naevus. To improve the performance of FCNs, transfer learning and a hybrid loss function are used. We evaluate the performance of the deep learning segmentation methods for multi-class segmentation and lesion diagnosis (with post-processing method) on the testing set of the ISIC-2017 challenge dataset. The results showed that the two-tier level transfer learning FCN-8s achieved the overall best result with a Dice score of 78.5% in the naevus category, 65.3% in melanoma, and 55.7% in seborrhoeic keratosis in multi-class segmentation and an Accuracy of 84.62% for recognition of melanoma in lesion diagnosis.
CVNov 28, 2017
DFUNet: Convolutional Neural Networks for Diabetic Foot Ulcer ClassificationManu Goyal, Neil D. Reeves, Adrian K. Davison et al.
Globally, in 2016, one out of eleven adults suffered from Diabetes Mellitus. Diabetic Foot Ulcers (DFU) are a major complication of this disease, which if not managed properly can lead to amputation. Current clinical approaches to DFU treatment rely on patient and clinician vigilance, which has significant limitations such as the high cost involved in the diagnosis, treatment and lengthy care of the DFU. We collected an extensive dataset of foot images, which contain DFU from different patients. In this paper, we have proposed the use of traditional computer vision features for detecting foot ulcers among diabetic patients, which represent a cost-effective, remote and convenient healthcare solution. Furthermore, we used Convolutional Neural Networks (CNNs) for the first time in DFU classification. We have proposed a novel convolutional neural network architecture, DFUNet, with better feature extraction to identify the feature differences between healthy skin and the DFU. Using 10-fold cross-validation, DFUNet achieved an AUC score of 0.962. This outperformed both the machine learning and deep learning classifiers we have tested. Here we present the development of a novel and highly sensitive DFUNet for objectively detecting the presence of DFUs. This novel approach has the potential to deliver a paradigm shift in diabetic foot care.
CVAug 24, 2017
Objective Classes for Micro-Facial Expression RecognitionAdrian K. Davison, Walied Merghani, Moi Hoon Yap
Micro-expressions are brief spontaneous facial expressions that appear on a face when a person conceals an emotion, making them different to normal facial expressions in subtlety and duration. Currently, emotion classes within the CASME II dataset are based on Action Units and self-reports, creating conflicts during machine learning training. We will show that classifying expressions using Action Units, instead of predicted emotion, removes the potential bias of human reporting. The proposed classes are tested using LBP-TOP, HOOF and HOG 3D feature descriptors. The experiments are evaluated on two benchmark FACS coded datasets: CASME II and SAMM. The best result achieves 86.35% accuracy when classifying the proposed 5 classes on CASME II using HOG 3D, outperforming the result of the state-of-the-art 5-class emotional-based classification in CASME II. Results indicate that classification based on Action Units provides an objective method to improve micro-expression recognition.
CVAug 6, 2017
Fully Convolutional Networks for Diabetic Foot Ulcer SegmentationManu Goyal, Neil D. Reeves, Satyan Rajbhandari et al.
Diabetic Foot Ulcer (DFU) is a major complication of Diabetes, which if not managed properly can lead to amputation. DFU can appear anywhere on the foot and can vary in size, colour, and contrast depending on various pathologies. Current clinical approaches to DFU treatment rely on patient and clinician vigilance, which has significant limitations such as the high cost involved in the diagnosis, treatment and lengthy care of the DFU. We introduce a dataset of 705 foot images. We provide the ground truth of the ulcer region and the surrounding skin, which is an important indicator for clinicians to assess the progress of the ulcer. Then, we propose a two-tier transfer learning from bigger datasets to train the Fully Convolutional Networks (FCNs) to automatically segment the ulcer and surrounding skin. Using 5-fold cross-validation, the proposed two-tier transfer learning FCN Models achieve a Dice Similarity Coefficient of 0.794 (±0.104) for ulcer region, 0.851 (±0.148) for surrounding skin region, and 0.899 (±0.072) for the combination of both regions. This demonstrates the potential of FCNs in DFU segmentation, which can be further improved with a larger dataset.
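For readers unfamiliar with the Dice Similarity Coefficient used above, here is a minimal sketch of the standard definition, 2|A∩B| / (|A| + |B|), on binary masks flattened to 0/1 sequences. The helper names and the toy masks are hypothetical, and returning 1.0 for two empty masks is a convention chosen for this example. The second helper shows the standard conversion to the Jaccard index, J = D / (2 - D), the other overlap measure reported elsewhere in this list.

```python
def dice_coefficient(pred, target):
    """Dice = 2|A∩B| / (|A| + |B|) for two equally sized flat binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Two empty masks agree perfectly; 1.0 is a convention assumed here.
    return 2 * intersection / total if total else 1.0


def jaccard_from_dice(dice):
    """Convert a Dice score to the Jaccard index: J = D / (2 - D)."""
    return dice / (2 - dice)


# Hypothetical 4-pixel masks: one shared foreground pixel out of two each.
dice = dice_coefficient([1, 1, 0, 0], [1, 0, 1, 0])
jaccard = jaccard_from_dice(dice)
```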
CVAug 6, 2017
Automated Assessment of Facial Wrinkling: a case study on the effect of smokingOmaima FathElrahman Osman, Remah Mutasim Ibrahim Elbashir, Imad Eldain Abbass et al.
Facial wrinkling is one of the most prominent biological changes accompanying the natural aging process. However, there are some external factors contributing to premature wrinkle development, such as sun exposure and smoking. Clinical studies have shown that heavy smoking causes premature wrinkle development. However, there is no computerised system that can automatically assess the facial wrinkles on the whole face. This study investigates the effect of smoking on facial wrinkling using a social habit face dataset and an automated computer vision algorithm. The wrinkle pattern, represented in the intensity range of 0-255, was first extracted using a modified Hybrid Hessian Filter. The face was divided into ten predefined regions, and the wrinkles in each region were extracted. Then statistical analysis was performed to analyse which regions are mainly affected by smoking. The result showed that the density of wrinkles for smokers in two regions around the mouth was significantly higher than for non-smokers, at a p-value of 0.05. Other regions are inconclusive due to the lack of a large-scale dataset. Finally, the wrinkles were visually compared between smoker and non-smoker faces by generating a generic 3D face model.
CVJul 27, 2017
A Comparative Study of the Clinical use of Motion Analysis from Kinect Skeleton DataSean Maudsley-Barton, Jamie McPhee, Anthony Bukowski et al.
The analysis of human motion as a clinical tool can bring many benefits such as the early detection of disease and the monitoring of recovery, in turn helping people to lead independent lives. However, it is currently underused. Developments in depth cameras, such as Kinect, have opened up the use of motion analysis in settings such as GP surgeries, care homes and private homes. To provide an insight into the use of Kinect in the healthcare domain, we present a review of the current state of the art. We then propose a method that can represent human motions from time-series data of arbitrary length as a single vector. Finally, we demonstrate the utility of this method by extracting a set of clinically significant features and using them to detect the age-related changes in the motions of a set of 54 individuals with a high degree of certainty (F1-score between 0.9 and 1.0), indicating its potential application in the detection of a range of age-related motion impairments.
CVDec 15, 2016
Objective Micro-Facial Movement Detection Using FACS-Based Regions and Baseline EvaluationAdrian K. Davison, Cliff Lansley, Choon Ching Ng et al.
Micro-facial expressions are regarded as an important human behavioural event that can highlight emotional deception. Spotting these movements is difficult for humans and machines; however, research into using computer vision to detect subtle facial expressions is growing in popularity. This paper proposes an individualised baseline micro-movement detection method using a 3D Histogram of Oriented Gradients (3D HOG) temporal difference method. We define a face template consisting of 26 regions based on the Facial Action Coding System (FACS). We extract the temporal features of each region using 3D HOG. Then, we use Chi-square distance to find subtle facial motion in the local regions. Finally, an automatic peak detector is used to detect micro-movements above the newly proposed adaptive baseline threshold. The performance is validated on two FACS coded datasets: SAMM and CASME II. This objective method focuses on the movement of the 26 face regions. When comparing with the ground truth, the best result was an AUC of 0.7512 and 0.7261 on SAMM and CASME II, respectively. The results show that 3D HOG outperformed the state-of-the-art feature representations for micro-movement detection: Local Binary Patterns in Three Orthogonal Planes and Histograms of Oriented Optical Flow.
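The last two steps of this pipeline (Chi-square distance between feature histograms, then peak detection above a threshold) can be sketched as follows. This is only an illustration under stated assumptions: it uses one common form of the Chi-square distance, sum((p - q)^2 / (p + q)), a plain local-maximum peak finder, and a caller-supplied threshold standing in for the paper's adaptive baseline, whose formula is not given in the abstract. The function names and sample values are hypothetical.

```python
def chi_square_distance(p, q, eps=1e-10):
    """One common form of the Chi-square distance between two histograms.

    `eps` avoids division by zero when a bin is empty in both histograms.
    """
    return sum((a - b) ** 2 / (a + b + eps) for a, b in zip(p, q))


def find_peaks_above(signal, threshold):
    """Return indices of local maxima in `signal` that exceed `threshold`.

    `threshold` is supplied by the caller; it is a stand-in for the
    adaptive baseline, which this sketch does not attempt to reproduce.
    """
    peaks = []
    for i in range(1, len(signal) - 1):
        if (signal[i] > threshold
                and signal[i] > signal[i - 1]
                and signal[i] >= signal[i + 1]):
            peaks.append(i)
    return peaks


# Hypothetical per-frame distances for one face region.
distances = [0.0, 1.0, 0.0, 3.0, 0.0, 0.5, 0.0]
candidate_frames = find_peaks_above(distances, threshold=0.8)
```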