Yunan Wu

h-index 12

9 papers

143 citations

Novelty 44%

AI Score 37

Ranked #89,814 of 194,257 authors (top 46%); #30,204 in CV (top 51%)

9 Papers

4.8 | CV | Jul 26, 2022

Can Deep Learning Assist Automatic Identification of Layered Pigments From XRF Data?

Bingjie Xu, Yunan Wu et al.

X-ray fluorescence spectroscopy (XRF) plays an important role for elemental analysis in a wide range of scientific fields, especially in cultural heritage. XRF imaging, which uses a raster scan to acquire spectra across artworks, provides the opportunity for spatial analysis of pigment distributions based on their elemental composition. However, conventional XRF-based pigment identification relies on time-consuming elemental mapping by expert interpretations of measured spectra. To reduce the reliance on manual work, recent studies have applied machine learning techniques to cluster similar XRF spectra in data analysis and to identify the most likely pigments. Nevertheless, it is still challenging for automatic pigment identification strategies to directly tackle the complex structure of real paintings, e.g. pigment mixtures and layered pigments. In addition, pixel-wise pigment identification based on XRF imaging remains an obstacle due to the high noise level compared with averaged spectra. Therefore, we developed a deep-learning-based end-to-end pigment identification framework to fully automate the pigment identification process. In particular, it offers high sensitivity to the underlying pigments and to the pigments with a low concentration, therefore enabling satisfying results in mapping the pigments based on single-pixel XRF spectrum. As case studies, we applied our framework to lab-prepared mock-up paintings and two 19th-century paintings: Paul Gauguin's Poèmes Barbares (1896) that contains layered pigments with an underlying painting, and Paul Cezanne's The Bathers (1899-1904). The pigment identification results demonstrated that our model achieved comparable results to the analysis by elemental mapping, suggesting the generalizability and stability of our model.

7.3 | IV | Jul 18, 2023 | Code

Smooth Attention for Deep Multiple Instance Learning: Application to CT Intracranial Hemorrhage Detection

Yunan Wu, Francisco M. Castro-Macías, Pablo Morales-Álvarez et al.

Multiple Instance Learning (MIL) has been widely applied to medical imaging diagnosis, where bag labels are known and instance labels inside bags are unknown. Traditional MIL assumes that instances in each bag are independent samples from a given distribution. However, instances are often spatially or sequentially ordered, and one would expect similar diagnostic importance for neighboring instances. To address this, in this study, we propose a smooth attention deep MIL (SA-DMIL) model. Smoothness is achieved by the introduction of first and second order constraints on the latent function encoding the attention paid to each instance in a bag. The method is applied to the detection of intracranial hemorrhage (ICH) on head CT scans. The results show that this novel SA-DMIL: (a) achieves better performance than the non-smooth attention MIL at both scan (bag) and slice (instance) levels; (b) learns spatial dependencies between slices; and (c) outperforms current state-of-the-art MIL methods on the same ICH test set.
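The idea of smoothing attention across ordered instances can be illustrated with a minimal NumPy sketch. It is not the SA-DMIL implementation: the function name `smooth_attention_pool`, the parameters `V` and `w`, and the choice to penalize the pre-softmax attention values with squared finite differences are all assumptions made for illustration.

```python
import numpy as np

def smooth_attention_pool(H, V, w):
    """Attention-based MIL pooling with smoothness penalties (illustrative only).

    H: (n_instances, d) instance features in their natural order,
       e.g. consecutive CT slices of one scan (the bag).
    V: (d, k) and w: (k,) are hypothetical attention parameters.
    """
    f = np.tanh(H @ V) @ w                  # latent attention value per instance
    a = np.exp(f - f.max())
    a = a / a.sum()                         # softmax over the bag
    z = a @ H                               # bag-level embedding
    first_order = np.sum(np.diff(f, n=1) ** 2)   # penalizes jumps between neighbors
    second_order = np.sum(np.diff(f, n=2) ** 2)  # penalizes changes in slope
    return z, a, first_order, second_order

rng = np.random.default_rng(0)
H = rng.normal(size=(6, 4))   # 6 slices, 4 features each (toy data)
V = rng.normal(size=(4, 3))
w = rng.normal(size=3)
z, a, s1, s2 = smooth_attention_pool(H, V, w)
```

In a training loop, the two penalty terms would typically be added to the bag-level classification loss with tunable weights, so that neighboring slices are encouraged to receive similar attention.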

3.0 | IV | Jan 20, 2023

DeepCOVID-Fuse: A Multi-modality Deep Learning Model Fusing Chest X-Radiographs and Clinical Variables to Predict COVID-19 Risk Levels

Yunan Wu, Amil Dravid, Ramsey Michael Wehbe et al.

Purpose: To present DeepCOVID-Fuse, a deep learning fusion model to predict risk levels in patients with confirmed coronavirus disease 2019 (COVID-19) and to evaluate the performance of pre-trained fusion models on full or partial combinations of chest radiographs (CXRs) and clinical variables. Materials and Methods: The initial CXRs, clinical variables and outcomes (i.e., mortality, intubation, hospital length of stay, ICU admission) were collected from February 2020 to April 2020 with reverse-transcription polymerase chain reaction (RT-PCR) test results as the reference standard. The risk level was determined by the outcome. The fusion model was trained on 1657 patients (Age: 58.30 +/- 17.74; Female: 807) and validated on 428 patients (56.41 +/- 17.03; 190) from Northwestern Memorial HealthCare system and was tested on 439 patients (56.51 +/- 17.78; 205) from a single holdout hospital. Performance of pre-trained fusion models on full or partial modalities was compared on the test set using the DeLong test for the area under the receiver operating characteristic curve (AUC) and the McNemar test for accuracy, precision, recall and F1. Results: The accuracy of DeepCOVID-Fuse trained on CXRs and clinical variables is 0.658, with an AUC of 0.842, which significantly outperformed (p < 0.05) models trained only on CXRs with an accuracy of 0.621 and AUC of 0.807 and only on clinical variables with an accuracy of 0.440 and AUC of 0.502. The pre-trained fusion model with only CXRs as input increases accuracy to 0.632 and AUC to 0.813 and with only clinical variables as input increases accuracy to 0.539 and AUC to 0.733. Conclusion: The fusion model learns better feature representations across different modalities during training and achieves good outcome predictions even when only some of the modalities are used in testing.

2.0 | CV | Apr 23, 2024 | Code

Cross-Temporal Spectrogram Autoencoder (CTSAE): Unsupervised Dimensionality Reduction for Clustering Gravitational Wave Glitches

Yi Li, Yunan Wu, Aggelos K. Katsaggelos

The advancement of the Laser Interferometer Gravitational-Wave Observatory (LIGO) has significantly enhanced the feasibility and reliability of gravitational wave detection. However, LIGO's high sensitivity makes it susceptible to transient noises known as glitches, which necessitate effective differentiation from real gravitational wave signals. Traditional approaches predominantly employ fully supervised or semi-supervised algorithms for the task of glitch classification and clustering. In the future task of identifying and classifying glitches across main and auxiliary channels, it is impractical to build a dataset with manually labeled ground-truth. In addition, the patterns of glitches can vary with time, generating new glitches without manual labels. In response to this challenge, we introduce the Cross-Temporal Spectrogram Autoencoder (CTSAE), a pioneering unsupervised method for the dimensionality reduction and clustering of gravitational wave glitches. CTSAE integrates a novel four-branch autoencoder with a hybrid of Convolutional Neural Networks (CNN) and Vision Transformers (ViT). To further extract features across multi-branches, we introduce a novel multi-branch fusion method using the CLS (Class) token. Our model, trained and evaluated on the GravitySpy O3 dataset on the main channel, demonstrates superior performance in clustering tasks when compared to state-of-the-art semi-supervised learning methods. To the best of our knowledge, CTSAE represents the first unsupervised approach tailored specifically for clustering LIGO data, marking a significant step forward in the field of gravitational wave research. The code of this paper is available at https://github.com/Zod-L/CTSAE

4.6 | LG | Mar 21, 2024 | Code

Hyperbolic Secant representation of the logistic function: Application to probabilistic Multiple Instance Learning for CT intracranial hemorrhage detection

F. M. Castro-Macías, P. Morales-Álvarez, Y. Wu et al.

Multiple Instance Learning (MIL) is a weakly supervised paradigm that has been successfully applied to many different scientific areas and is particularly well suited to medical imaging. Probabilistic MIL methods, and more specifically Gaussian Processes (GPs), have achieved excellent results due to their high expressiveness and uncertainty quantification capabilities. One of the most successful GP-based MIL methods, VGPMIL, resorts to a variational bound to handle the intractability of the logistic function. Here, we formulate VGPMIL using Pólya-Gamma random variables. This approach yields the same variational posterior approximations as the original VGPMIL, which is a consequence of the two representations that the Hyperbolic Secant distribution admits. This leads us to propose a general GP-based MIL method that takes different forms by simply leveraging distributions other than the Hyperbolic Secant one. Using the Gamma distribution we arrive at a new approach that obtains competitive or superior predictive performance and efficiency. This is validated in a comprehensive experimental study including one synthetic MIL dataset, two well-known MIL benchmarks, and a real-world medical problem. We expect that this work provides useful ideas beyond MIL that can foster further research in the field.

6.2 | CV | Jul 20, 2025

Probabilistic smooth attention for deep multiple instance learning in medical imaging

Francisco M. Castro-Macías, Pablo Morales-Álvarez, Yunan Wu et al.

The Multiple Instance Learning (MIL) paradigm is attracting plenty of attention in medical imaging classification, where labeled data is scarce. MIL methods cast medical images as bags of instances (e.g. patches in whole slide images, or slices in CT scans), and only bag labels are required for training. Deep MIL approaches have obtained promising results by aggregating instance-level representations via an attention mechanism to compute the bag-level prediction. These methods typically capture both local interactions among adjacent instances and global, long-range dependencies through various mechanisms. However, they treat attention values deterministically, potentially overlooking uncertainty in the contribution of individual instances. In this work we propose a novel probabilistic framework that estimates a probability distribution over the attention values, and accounts for both global and local interactions. In a comprehensive evaluation involving eleven state-of-the-art baselines and three medical datasets, we show that our approach achieves top predictive performance in different metrics. Moreover, the probabilistic treatment of the attention provides uncertainty maps that are interpretable in terms of illness localization.

1.4 | CV | Jan 22, 2022 | Code

Investigating the Potential of Auxiliary-Classifier GANs for Image Classification in Low Data Regimes

Amil Dravid, Florian Schiffers, Yunan Wu et al.

Generative Adversarial Networks (GANs) have shown promise in augmenting datasets and boosting convolutional neural networks' (CNN) performance on image classification tasks. However, they introduce more hyperparameters to tune and require additional time and computational power to train alongside the CNN. In this work, we examine the potential for Auxiliary-Classifier GANs (AC-GANs) as a 'one-stop-shop' architecture for image classification, particularly in low data regimes. Additionally, we explore modifications to the typical AC-GAN framework, changing the generator's latent space sampling scheme and employing a Wasserstein loss with gradient penalty to stabilize the simultaneous training of image synthesis and classification. Through experiments on images of varying resolutions and complexity, we demonstrate that AC-GANs show promise in image classification, achieving competitive performance with standard CNNs. These methods can be employed as an 'all-in-one' framework with particular utility in the absence of large amounts of training data.

8.7 | IV | Aug 14, 2020

Interpretation of Brain Morphology in Association to Alzheimer's Disease Dementia Classification Using Graph Convolutional Networks on Triangulated Meshes

Emanuel A. Azcona, Pierre Besson, Yunan Wu et al.

We propose a mesh-based technique to aid in the classification of Alzheimer's disease dementia (ADD) using mesh representations of the cortex and subcortical structures. Deep learning methods for classification tasks that utilize structural neuroimaging often require extensive learning parameters to optimize. Frequently, these approaches for automated medical diagnosis also lack visual interpretability for areas in the brain involved in making a diagnosis. This work: (a) analyzes brain shape using surface information of the cortex and subcortical structures, (b) proposes a residual learning framework for state-of-the-art graph convolutional networks which offer a significant reduction in learnable parameters, and (c) offers visual interpretability of the network via class-specific gradient information that localizes important regions of interest in our inputs. With our proposed method leveraging the use of cortical and subcortical surface information, we outperform other machine learning methods with a 96.35% testing accuracy for the ADD vs. healthy control problem. We confirm the validity of our model by observing its performance in a 25-trial Monte Carlo cross-validation. The generated visualization maps in our study show correspondences with current knowledge regarding the structural localization of pathological changes in the brain associated with dementia of the Alzheimer's type.
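Monte Carlo cross-validation, as mentioned in the abstract above, repeatedly draws a fresh random train/test split and records the test score each time. The sketch below shows the general procedure only. The 80/20 split, the `monte_carlo_cv` and `nearest_centroid` helpers, and the toy data are assumptions for illustration and do not reproduce the paper's model or data.

```python
import numpy as np

def monte_carlo_cv(X, y, fit_predict, n_trials=25, test_frac=0.2, seed=0):
    """Repeat random train/test splits and return one accuracy per trial."""
    rng = np.random.default_rng(seed)
    n = len(y)
    n_test = max(1, int(round(test_frac * n)))
    accs = []
    for _ in range(n_trials):
        perm = rng.permutation(n)                 # fresh random split each trial
        test, train = perm[:n_test], perm[n_test:]
        pred = fit_predict(X[train], y[train], X[test])
        accs.append(np.mean(pred == y[test]))
    return np.array(accs)

def nearest_centroid(X_train, y_train, X_test):
    """Stand-in classifier (hypothetical): assign each sample to the closest class mean."""
    classes = np.unique(y_train)
    centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in classes])
    d = ((X_test[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[d.argmin(axis=1)]

# Toy two-class data, purely synthetic.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2, 1, size=(50, 3)), rng.normal(2, 1, size=(50, 3))])
y = np.array([0] * 50 + [1] * 50)
accs = monte_carlo_cv(X, y, nearest_centroid)
print(f"mean accuracy over {len(accs)} trials: {accs.mean():.3f} +/- {accs.std():.3f}")
```

Reporting the mean and spread over the trials gives a sense of how sensitive the result is to the particular split.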

6.3 | CV | Oct 16, 2018

A Comparison of 1-D and 2-D Deep Convolutional Neural Networks in ECG Classification

Yunan Wu, Feng Yang, Ying Liu et al.

Effective detection of arrhythmia is an important task in the remote monitoring of electrocardiogram (ECG). Traditional ECG recognition depends on clinicians' experience and judgment, so the results are prone to human error caused by fatigue. To solve this problem, an image-based ECG signal classification method is presented to classify ECG signals into normal and abnormal beats by using two-dimensional convolutional neural networks (2D-CNNs). First, we compare the accuracy and robustness of the one-dimensional ECG signal input method and the two-dimensional image input method in an AlexNet network. Then, in order to alleviate the overfitting problem in the two-dimensional network, we initialize an AlexNet-like network with weights trained on ImageNet and fine-tune the model on the training ECG images to further improve the accuracy and robustness of ECG classification. The performance evaluated on the MIT-BIH arrhythmia database demonstrates that the proposed method can achieve an accuracy of 98% and maintain high accuracy within an SNR range from 20 dB to 35 dB. The experiment shows that the 2D-CNNs initialized with AlexNet weights perform better than the one-dimensional signal method without a large-scale dataset.
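Robustness over an SNR range is commonly probed by corrupting clean signals with noise scaled to a target signal-to-noise ratio. The sketch below assumes additive white Gaussian noise and a synthetic sine wave standing in for an ECG trace. The abstract does not state the noise model used, so both choices, and the helper names, are illustrative assumptions.

```python
import numpy as np

def add_noise_at_snr(signal, snr_db, rng):
    """Add white Gaussian noise so that the expected SNR equals snr_db."""
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))   # SNR_dB = 10*log10(Ps/Pn)
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise

def measured_snr_db(clean, noisy):
    """Empirical SNR of a noisy signal relative to its clean version."""
    noise = noisy - clean
    return 10 * np.log10(np.mean(clean ** 2) / np.mean(noise ** 2))

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 20000)
clean = np.sin(2 * np.pi * 5 * t)          # toy stand-in for an ECG trace
noisy = add_noise_at_snr(clean, 20, rng)   # 20 dB is the low end of the quoted range
```

Sweeping `snr_db` from 20 to 35 and re-evaluating a classifier on the noisy inputs would give an accuracy-versus-SNR curve of the kind the abstract refers to.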