Ljubiša Stanković

h-index: 53

15 papers

9,501 citations

Novelty: 39%

AI Score: 25

Ranked #164,031 of 194,257 authors (top 84%); #52,796 in CV (top 89%)

15 Papers

2.0 | CV | Aug 23, 2024 | Code

Perturbation on Feature Coalition: Towards Interpretable Deep Neural Networks

Xuran Hu, Mingzhe Zhu, Zhenpeng Feng et al.

The inherent "black box" nature of deep neural networks (DNNs) compromises their transparency and reliability. Recently, explainable AI (XAI) has garnered increasing attention from researchers, and several perturbation-based interpretation methods have emerged. However, these methods often fail to adequately consider feature dependencies. To solve this problem, we introduce a perturbation-based interpretation guided by feature coalitions, which leverages deep information from the network to extract correlated features. We then propose a carefully designed consistency loss to guide network interpretation. Both quantitative and qualitative experiments are conducted to validate the effectiveness of the proposed method. Code is available at github.com/Teriri1999/Perturebation-on-Feature-Coalition.

1.5 | CV | Feb 3, 2023

Cluster-CAM: Cluster-Weighted Visual Interpretation of CNNs' Decision in Image Classification

Zhenpeng Feng, Hongbing Ji, Milos Dakovic et al.

Despite the tremendous success of convolutional neural networks (CNNs) in computer vision, the mechanism of CNNs still lacks a clear interpretation. Class activation mapping (CAM), a well-known visualization technique for interpreting a CNN's decisions, has drawn increasing attention. Gradient-based CAMs are efficient, but their performance is heavily affected by vanishing and exploding gradients. In contrast, gradient-free CAMs avoid computing gradients and produce more understandable results. However, existing gradient-free CAMs are quite time-consuming because hundreds of forward inferences per image are required. In this paper, we propose Cluster-CAM, an effective and efficient gradient-free CNN interpretation algorithm. Cluster-CAM significantly reduces the number of forward propagations by splitting the feature maps into clusters in an unsupervised manner. Furthermore, we propose an artful strategy to forge a cognition-base map and cognition-scissors from the clustered feature maps. The final salience heatmap is computed by merging these cognition maps. Qualitative results show that Cluster-CAM produces heatmaps whose highlighted regions match human cognition more precisely than those of existing CAMs. The quantitative evaluation further demonstrates the superiority of Cluster-CAM in both effectiveness and efficiency.

2.6 | CV | Sep 15, 2022

VS-CAM: Vertex Semantic Class Activation Mapping to Interpret Vision Graph Neural Network

Zhenpeng Feng, Xiyang Cui, Hongbing Ji et al.

Graph convolutional neural networks (GCNs) have drawn increasing attention and attained good performance in various computer vision tasks; however, a clear interpretation of their inner mechanism is still lacking. For standard convolutional neural networks (CNNs), class activation mapping (CAM) methods are commonly used to visualize the connection between a CNN's decision and image regions by generating a heatmap. Nonetheless, such heatmaps usually exhibit semantic chaos when these CAMs are applied to GCNs directly. In this paper, we propose a novel visualization method specifically applicable to GCNs, Vertex Semantic Class Activation Mapping (VS-CAM). VS-CAM includes two independent pipelines that produce a set of semantic-probe maps and a semantic-base map, respectively. The semantic-probe maps are used to detect semantic information in the semantic-base map, and the results are aggregated into a semantic-aware heatmap. Qualitative results show that VS-CAM obtains heatmaps whose highlighted regions match the objects much more precisely than those of CNN-based CAMs. The quantitative evaluation further demonstrates the superiority of VS-CAM.

1.4 | CV | Oct 16, 2022

Demystifying CNNs for Images by Matched Filters

Shengxi Li, Xinyi Zhao, Ljubisa Stankovic et al.

The success of convolutional neural networks (CNNs) has been revolutionising the way we approach and use intelligent machines in the Big Data era. Despite this success, CNNs have been consistently put under scrutiny owing to their black-box nature, the ad hoc manner of their construction, and the lack of theoretical support and physical meaning of their operation. This has been prohibitive to both the quantitative and qualitative understanding of CNNs, and to their application in more sensitive areas such as AI for health. We set out to address these issues, and in this way demystify the operation of CNNs, by employing the perspective of matched filtering. We first illuminate that the convolution operation, the very core of CNNs, represents a matched filter which aims to identify the presence of features in input data. This then serves as a vehicle to interpret the convolution-activation-pooling chain in CNNs under the theoretical umbrella of matched filtering, a common operation in signal processing. We further provide extensive examples and experiments to illustrate this connection, whereby the learning in CNNs is shown to also perform matched filtering, which further sheds light on the physical meaning of learnt parameters and layers. It is our hope that this material will provide new insights into the understanding, construction and analysis of CNNs, as well as pave the way for developing new methods and architectures of CNNs.

2.0 | LG | Jan 12, 2023

Fair and skill-diverse student group formation via constrained k-way graph partitioning

Alexander Jenkins, Imad Jaimoukha, Ljubisa Stankovic et al.

Forming the right combination of students in a group promises to enable a powerful and effective environment for learning and collaboration. However, defining a group of students is a complex task which has to satisfy multiple constraints. This work introduces an unsupervised algorithm for fair and skill-diverse student group formation. This is achieved by taking account of student course marks and sensitive attributes provided by the education office. The skill sets of students are determined using unsupervised dimensionality reduction of course mark data via the Laplacian eigenmap. The problem is formulated as a constrained graph partitioning problem, whereby the diversity of skill sets in each group is maximised, group sizes are upper and lower bounded according to available resources, and the 'balance' of a sensitive attribute is lower bounded to enforce fairness in group formation. This optimisation problem is solved using integer programming and its effectiveness is demonstrated on a dataset of student course marks from Imperial College London.

4.7 | IT | Jun 19

SAR Despeckling via Region-Aware Sparse Representation and Statistical Noise Approximation

Xuran Hu, Mingzhe Zhu, Djordje Stanković et al.

Synthetic Aperture Radar (SAR) images are widely used in remote sensing due to their all-weather, all-day imaging capabilities. However, SAR images are highly susceptible to noise, particularly speckle noise caused by the coherent imaging process, which severely degrades image quality. This has driven increasing research interest in SAR despeckling. Sparse representation-based methods have been extensively applied in natural image processing, yet SAR despeckling requires handling non-Gaussian noise and ensuring sparsity in the transform domain. In this work, we propose a simple, intuitive, and efficient SAR despeckling approach grounded in compressive sensing theory. By applying the Log-Yeo-Johnson transformation, we convert gamma-distributed noise into an approximately Gaussian distribution, consistent with the noise assumption underlying sparse representation. The method incorporates noise and sparsity priors, leveraging a non-local sparse representation through two auxiliary matrices: one capturing varying noise characteristics across regions and the other encoding adaptive sparsity information. Extensive experiments validate the effectiveness of our method.

2.0 | CV | Aug 2, 2024

Multi-task SAR Image Processing via GAN-based Unsupervised Manipulation

Xuran Hu, Mingzhe Zhu, Ziqiang Xu et al.

Generative Adversarial Networks (GANs) have shown tremendous potential in synthesizing a large number of realistic SAR images by learning patterns in the data distribution. Some GANs can achieve image editing by introducing latent codes, demonstrating significant promise in SAR image processing. Compared to traditional SAR image processing methods, editing based on GAN latent space control is entirely unsupervised, allowing image processing to be conducted without any labeled data. Additionally, the information extracted from the data is more interpretable. This paper proposes a novel SAR image processing framework called GAN-based Unsupervised Editing (GUE), aiming to address the following two issues: (1) disentangling semantic directions in the GAN latent space and finding meaningful directions; (2) establishing a comprehensive SAR image processing framework while achieving multiple image processing functions. In the implementation of GUE, we decompose the entangled semantic directions in the GAN latent space by training a carefully designed network. Moreover, we can accomplish multiple SAR image processing tasks (including despeckling, localization, auxiliary identification, and rotation editing) in a single training process without any form of supervision. Extensive experiments validate the effectiveness of the proposed method.

8.7 | CV | Jan 6, 2024

SAR Despeckling via Regional Denoising Diffusion Probabilistic Model

Xuran Hu, Ziqiang Xu, Zhihan Chen et al.

Speckle noise poses a significant challenge in maintaining the quality of synthetic aperture radar (SAR) images, so SAR despeckling techniques have drawn increasing attention. Despite the tremendous advancements of deep learning in fixed-scale SAR image despeckling, these methods still struggle to deal with large-scale SAR images. To address this problem, this paper introduces a novel despeckling approach based on generative models, termed Region Denoising Diffusion Probabilistic Model (R-DDPM). R-DDPM enables versatile despeckling of SAR images across various scales, accomplished within a single training session. Moreover, artifacts in the fused SAR images can be effectively avoided by using region-guided inverse sampling. Experiments with the proposed R-DDPM on Sentinel-1 data demonstrate superior performance over existing methods.

5.8 | AI | Jan 6, 2024

Manifold-based Shapley for SAR Recognization Network Explanation

Xuran Hu, Mingzhe Zhu, Yuanjing Liu et al.

Explainable artificial intelligence (XAI) holds immense significance in enhancing the transparency and credibility of deep neural networks, particularly in risky and high-cost scenarios such as synthetic aperture radar (SAR). The Shapley value is a game-theoretic explanation technique with robust mathematical foundations. However, it assumes that the model's features are independent, which renders Shapley explanations invalid for high-dimensional models. This study introduces a manifold-based Shapley method that projects high-dimensional features into low-dimensional manifold features and subsequently obtains Fusion-Shap, which aims at (1) addressing the issue of erroneous explanations encountered by traditional Shap; (2) resolving the challenge of interpretability that traditional Shap faces in complex scenarios.
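As a point of reference for the abstract above, the following minimal sketch computes classical (exact) Shapley values by averaging marginal contributions over all feature orderings. It is not the manifold-based Fusion-Shap method of the paper; the function names and the toy value function `toy_value` are hypothetical and chosen only for illustration.

```python
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p when it joins the current coalition.
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: t / len(orderings) for p, t in totals.items()}


def toy_value(coalition):
    """Hypothetical model output: 'a' and 'b' only matter together, 'c' alone."""
    score = 0.0
    if {"a", "b"} <= coalition:
        score += 10.0
    if "c" in coalition:
        score += 4.0
    return score


phi = shapley_values(["a", "b", "c"], toy_value)
print(phi)
```

In this toy game the joint contribution of the dependent features `a` and `b` is split evenly between them, and the attributions sum to the value of the full coalition. The cost grows factorially with the number of features, so the enumeration is only practical for very small feature sets.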

2.6 | LG | Jan 30, 2024

Widely Linear Matched Filter: A Lynchpin towards the Interpretability of Complex-valued CNNs

Qingchen Wang, Zhe Li, Zdenka Babic et al.

A recent study on the interpretability of real-valued convolutional neural networks (CNNs) (Stankovic and Mandic, 2023) has revealed a direct and physically meaningful link with the task of finding features in data through matched filters. However, applying this paradigm to illuminate the interpretability of complex-valued CNNs meets a formidable obstacle: the extension of matched filtering to a general class of noncircular complex-valued data, referred to here as the widely linear matched filter (WLMF), has been only implicit in the literature. To this end, to establish the interpretability of the operation of complex-valued CNNs, we introduce a general WLMF paradigm, provide its solution and undertake an analysis of its performance. For rigor, our WLMF solution is derived without imposing any assumption on the probability density of noise. The theoretical advantages of the WLMF over its standard strictly linear counterpart (SLMF) are provided in terms of their output signal-to-noise ratios (SNRs), with the WLMF consistently exhibiting enhanced SNR. Moreover, the lower bound on the SNR gain of the WLMF is derived, together with the condition for attaining this bound. This serves to revisit the convolution-activation-pooling chain in complex-valued CNNs through the lens of matched filtering, which reveals the potential of WLMFs to provide physical interpretability and enhance explainability of general complex-valued CNNs. Simulations demonstrate the agreement between the theoretical and numerical results.

1.4 | CV | May 26, 2022

Analytical Interpretation of Latent Codes in InfoGAN with SAR Images

Zhenpeng Feng, Milos Dakovic, Hongbing Ji et al.

Generative Adversarial Networks (GANs) can synthesize abundant photo-realistic synthetic aperture radar (SAR) images. Some recent GANs (e.g., InfoGAN) are even able to edit specific properties of the synthesized images by introducing latent codes. This is crucial for SAR image synthesis, since targets in real SAR images have different properties owing to the imaging mechanism. Despite the success of InfoGAN in manipulating properties, a clear explanation of how these latent codes affect synthesized properties is still lacking; thus, editing specific properties usually relies on empirical trials, which are unreliable and time-consuming. In this paper, we show that latent codes are disentangled to affect the properties of SAR images in a non-linear manner. By introducing property estimators for latent codes, we are able to provide a completely analytical nonlinear model to decompose the entangled causality between latent codes and different properties. The qualitative and quantitative experimental results further reveal that the properties can be calculated from the latent codes and, inversely, that suitable latent codes can be estimated for desired properties. In this case, properties can be manipulated by latent codes as expected.

4.3 | IT | Aug 26, 2021

Convolutional Neural Networks Demystified: A Matched Filtering Perspective Based Tutorial

Ljubisa Stankovic, Danilo Mandic

Deep Neural Networks (DNN) and especially Convolutional Neural Networks (CNN) are a de facto standard for the analysis of large volumes of signals and images. Yet their development and the study of their underlying principles have largely proceeded in an ad hoc and black-box fashion. To help demystify CNNs, we revisit their operation from first principles and a matched filtering perspective. We establish that the convolution operation within CNNs, their very backbone, represents a matched filter which examines the input signal/image for the presence of pre-defined features. This perspective is shown to be physically meaningful, and serves as a basis for a step-by-step tutorial on the operation of CNNs, including pooling, zero padding, and various ways of dimensionality reduction. Starting from first principles, both the feed-forward pass and the learning stage (via back-propagation) are illuminated in detail, through both a worked-out numerical example and the corresponding visualizations. It is our hope that this tutorial will help shed new light on, and bring physical intuition to, the understanding and further development of deep neural networks.
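The matched-filter reading of a convolutional layer described above can be illustrated with a minimal one-dimensional sketch. The template, the signal, and the bias value are hypothetical; the code only shows that sliding a template over a signal (cross-correlation, which is what CNN "convolution" layers typically compute) peaks where the signal contains that template, and that a bias followed by ReLU and max-pooling acts as a detector. It is not taken from the tutorial itself.

```python
def cross_correlate(signal, template):
    """Inner product of the template with every window of the signal (valid mode)."""
    n, m = len(signal), len(template)
    return [
        sum(signal[i + j] * template[j] for j in range(m))
        for i in range(n - m + 1)
    ]


def relu(values, bias=0.0):
    """Add a bias (acting as a detection threshold) and clip negatives to zero."""
    return [max(0.0, v + bias) for v in values]


# Hypothetical feature (filter weights) and a signal containing it at index 3.
template = [1.0, -1.0, 1.0]
signal = [0.0, 0.2, -0.1, 1.0, -1.0, 1.0, 0.1, 0.0, -0.2]

response = cross_correlate(signal, template)   # matched-filter output
activated = relu(response, bias=-1.5)          # thresholded response
detected_at = max(range(len(activated)), key=activated.__getitem__)
pooled = max(activated)                        # global max-pooling

print(response)
print(detected_at, pooled)
```

Here the response is largest at the shift where the window equals the template, and the negative bias suppresses the weaker partial matches, so only that position survives the ReLU.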

4.4 | LG | Aug 23, 2021

Understanding the Basis of Graph Convolutional Neural Networks via an Intuitive Matched Filtering Approach

Ljubisa Stankovic, Danilo Mandic

Graph Convolutional Neural Networks (GCNN) are becoming a preferred model for data processing on irregular domains, yet their analysis and principles of operation are rarely examined due to the black box nature of NNs. To this end, we revisit the operation of GCNNs and show that their convolution layers effectively perform matched filtering of input data with the chosen patterns (features). This allows us to provide a unifying account of GCNNs through a matched filter perspective, whereby the nonlinear ReLU and max-pooling layers are also discussed within the matched filtering framework. This is followed by a step-by-step guide on information propagation and learning in GCNNs. It is also shown that standard CNNs and fully connected NNs can be obtained as a special case of GCNNs. A carefully chosen numerical example guides the reader through the various steps of GCNN operation and learning both visually and numerically.

1.2 | IT | Mar 11, 2021

Improved Coherence Index-Based Bound in Compressive Sensing

Ljubisa Stankovic, Milos Brajovic, Danilo Mandic et al.

Within the Compressive Sensing (CS) paradigm, sparse signals can be reconstructed based on a reduced set of measurements. The reliability of the solution is determined by the uniqueness condition. Being mathematically tractable and feasible to compute, the coherence index is one of very few CS metrics of considerable practical importance. In this paper, we propose an improvement of the coherence-based uniqueness relation for matching pursuit algorithms. Starting from a simple and intuitive derivation of the standard uniqueness condition based on the coherence index, we derive a less conservative coherence index-based lower bound for signal sparsity. The results are generalized to the uniqueness condition of the $l_0$-norm minimization for a signal represented in two orthonormal bases.
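For orientation, the sketch below computes the coherence index $\mu$ of a measurement matrix (the largest absolute normalized inner product between two distinct columns) and the standard coherence-based uniqueness condition $K < \frac{1}{2}(1 + 1/\mu)$ on the sparsity $K$. This is the well-known standard bound that the abstract takes as its starting point, not the improved bound derived in the paper; the matrix `A` and the function names are hypothetical.

```python
import math


def coherence_index(A):
    """Largest absolute normalized inner product between distinct columns of A."""
    rows, cols = len(A), len(A[0])
    columns = [[A[r][c] for r in range(rows)] for c in range(cols)]
    norms = [math.sqrt(sum(x * x for x in col)) for col in columns]
    mu = 0.0
    for i in range(cols):
        for j in range(i + 1, cols):
            inner = sum(a * b for a, b in zip(columns[i], columns[j]))
            mu = max(mu, abs(inner) / (norms[i] * norms[j]))
    return mu


def standard_sparsity_bound(mu):
    """Standard uniqueness condition: sparsity K must satisfy K < (1 + 1/mu) / 2."""
    return 0.5 * (1.0 + 1.0 / mu)


# Hypothetical 3x4 measurement matrix with nonzero columns.
A = [
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
]
mu = coherence_index(A)
bound = standard_sparsity_bound(mu)
print(mu, bound)
```

For this matrix $\mu = 1/\sqrt{3}$, so the standard condition guarantees uniqueness only for $K = 1$; a less conservative bound of the kind the paper proposes would enlarge the set of sparsities covered by such a guarantee.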

6.6 | IT | Jan 2, 2020

Graph Signal Processing -- Part III: Machine Learning on Graphs, from Graph Topology to Applications

Ljubisa Stankovic, Danilo Mandic, Milos Dakovic et al.

Many modern data analytics applications on graphs operate on domains where graph topology is not known a priori, and hence its determination becomes part of the problem definition, rather than serving as prior knowledge which aids the problem solution. Part III of this monograph starts by addressing ways to learn graph topology, from the case where the physics of the problem already suggests a possible topology, through to the most general cases where the graph topology is learned from the data. A particular emphasis is on graph topology definition based on the correlation and precision matrices of the observed data, combined with additional prior knowledge and structural conditions, such as the smoothness or sparsity of graph connections. For learning sparse graphs (with a small number of edges), the least absolute shrinkage and selection operator, known as LASSO, is employed, along with its graph-specific variant, graphical LASSO. For completeness, both variants of LASSO are derived in an intuitive way, and explained. An in-depth elaboration of the graph topology learning paradigm is provided through several examples on physically well-defined graphs, such as electric circuits, linear heat transfer, social and computer networks, and spring-mass systems. As many graph neural networks (GNN) and graph convolutional networks (GCN) are emerging, we have also reviewed the main trends in GNNs and GCNs, from the perspective of graph signal filtering. Tensor representation of lattice-structured graphs is next considered, and it is shown that tensors (multidimensional data arrays) are a special class of graph signals, whereby the graph vertices reside on a high-dimensional regular lattice structure. This part of the monograph concludes with two emerging applications in financial data processing and underground transportation networks modeling.
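The sparsity-promoting role of LASSO mentioned above can be sketched with plain coordinate descent for the objective $\frac{1}{2}\|y - X\beta\|_2^2 + \lambda\|\beta\|_1$, where each coefficient is updated by soft-thresholding. This is a generic textbook formulation, not the derivation given in the monograph and not graphical LASSO; the function names, the data, and the assumption that every column of `X` is nonzero are illustrative only.

```python
def soft_threshold(z, lam):
    """Shrink z towards zero by lam; values within [-lam, lam] become exactly zero."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, sweeps=100):
    """Minimize 0.5 * ||y - X b||^2 + lam * ||b||_1 (columns of X assumed nonzero)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            rho = 0.0       # correlation of column j with the partial residual
            norm_sq = 0.0   # squared norm of column j
            for i in range(n):
                pred_without_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                norm_sq += X[i][j] ** 2
            beta[j] = soft_threshold(rho, lam) / norm_sq
    return beta


# Hypothetical orthonormal design: the solution is the soft-thresholded y.
X = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
y = [3.0, -0.5, 1.2]
beta = lasso_coordinate_descent(X, y, lam=1.0)
print(beta)
```

With an orthonormal design the coefficients are simply the soft-thresholded observations, so the small entry is set exactly to zero. In graph topology learning the same mechanism is what removes weak candidate edges.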