Mihai Datcu

CV
h-index17
23papers
4,854citations
Novelty35%
AI Score30

23 Papers

CVSep 29, 2022
Guided Unsupervised Learning by Subaperture Decomposition for Ocean SAR Image Retrieval

Nicolae-Cătălin Ristea, Andrei Anghel, Mihai Datcu et al.

Spaceborne synthetic aperture radar (SAR) can provide accurate images of the ocean surface roughness day-or-night in nearly all weather conditions, being an unique asset for many geophysical applications. Considering the huge amount of data daily acquired by satellites, automated techniques for physical features extraction are needed. Even if supervised deep learning methods attain state-of-the-art results, they require great amount of labeled data, which are difficult and excessively expensive to acquire for ocean SAR imagery. To this end, we use the subaperture decomposition (SD) algorithm to enhance the unsupervised learning retrieval on the ocean surface, empowering ocean researchers to search into large ocean databases. We empirically prove that SD improve the retrieval precision with over 20% for an unsupervised transformer auto-encoder network. Moreover, we show that SD brings important performance boost when Doppler centroid images are used as input data, leading the way to new unsupervised physics guided retrieval algorithms.

CVApr 9, 2022
Guided deep learning by subaperture decomposition: ocean patterns from SAR imagery

Nicolae-Catalin Ristea, Andrei Anghel, Mihai Datcu et al.

Spaceborne synthetic aperture radar can provide meters scale images of the ocean surface roughness day or night in nearly all weather conditions. This makes it a unique asset for many geophysical applications. Sentinel 1 SAR wave mode vignettes have made possible to capture many important oceanic and atmospheric phenomena since 2014. However, considering the amount of data provided, expanding applications requires a strategy to automatically process and extract geophysical parameters. In this study, we propose to apply subaperture decomposition as a preprocessing stage for SAR deep learning models. Our data centring approach surpassed the baseline by 0.7, obtaining state of the art on the TenGeoPSARwv data set. In addition, we empirically showed that subaperture decomposition could bring additional information over the original vignette, by rising the number of clusters for an unsupervised segmentation method. Overall, we encourage the development of data centring approaches, showing that, data preprocessing could bring significant performance improvements over existing deep learning models.

CVJun 13, 2023
Sea Ice Segmentation From SAR Data by Convolutional Transformer Networks

Nicolae-Catalin Ristea, Andrei Anghel, Mihai Datcu

Sea ice is a crucial component of the Earth's climate system and is highly sensitive to changes in temperature and atmospheric conditions. Accurate and timely measurement of sea ice parameters is important for understanding and predicting the impacts of climate change. Nevertheless, the amount of satellite data acquired over ice areas is huge, making the subjective measurements ineffective. Therefore, automated algorithms must be used in order to fully exploit the continuous data feeds coming from satellites. In this paper, we present a novel approach for sea ice segmentation based on SAR satellite imagery using hybrid convolutional transformer (ConvTr) networks. We show that our approach outperforms classical convolutional networks, while being considerably more efficient than pure transformer models. ConvTr obtained a mean intersection over union (mIoU) of 63.68% on the AI4Arctic data set, assuming an inference time of 120ms for a 400 x 400 squared km product.

QUANT-PHApr 10, 2022
Coreset of Hyperspectral Images on Small Quantum Computer

Soronzonbold Otgonbaatar, Mihai Datcu, Begüm Demir

Machine Learning (ML) techniques are employed to analyze and process big Remote Sensing (RS) data, and one well-known ML technique is a Support Vector Machine (SVM). An SVM is a quadratic programming (QP) problem, and a D-Wave quantum annealer (D-Wave QA) promises to solve this QP problem more efficiently than a conventional computer. However, the D-Wave QA cannot solve directly the SVM due to its very few input qubits. Hence, we use a coreset ("core of a dataset") of given EO data for training an SVM on this small D-Wave QA. The coreset is a small, representative weighted subset of an original dataset, and any training models generate competitive classes by using the coreset in contrast to by using its original dataset. We measured the closeness between an original dataset and its coreset by employing a Kullback-Leibler (KL) divergence measure. Moreover, we trained the SVM on the coreset data by using both a D-Wave QA and a conventional method. We conclude that the coreset characterizes the original dataset with very small KL divergence measure. In addition, we present our KL divergence results for demonstrating the closeness between our original data and its coreset. As practical RS data, we use Hyperspectral Image (HSI) of Indian Pine, USA.

IVJan 9, 2023
Explainable, Physics Aware, Trustworthy AI Paradigm Shift for Synthetic Aperture Radar

Mihai Datcu, Zhongling Huang, Andrei Anghel et al.

The recognition or understanding of the scenes observed with a SAR system requires a broader range of cues, beyond the spatial context. These encompass but are not limited to: imaging geometry, imaging mode, properties of the Fourier spectrum of the images or the behavior of the polarimetric signatures. In this paper, we propose a change of paradigm for explainability in data science for the case of Synthetic Aperture Radar (SAR) data to ground the explainable AI for SAR. It aims to use explainable data transformations based on well-established models to generate inputs for AI methods, to provide knowledgeable feedback for training process, and to learn or improve high-complexity unknown or un-formalized models from the data. At first, we introduce a representation of the SAR system with physical layers: i) instrument and platform, ii) imaging formation, iii) scattering signatures and objects, that can be integrated with an AI model for hybrid modeling. Successively, some illustrative examples are presented to demonstrate how to achieve hybrid modeling for SAR image understanding. The perspective of trustworthy model and supplementary explanations are discussed later. Finally, we draw the conclusion and we deem the proposed concept has applicability to the entire class of coherent imaging sensors and other computational imaging systems.

CVOct 28, 2022
Deep Learning-Based Anomaly Detection in Synthetic Aperture Radar Imaging

Max Muzeau, Chengfang Ren, Sébastien Angelliaume et al.

In this paper, we proposed to investigate unsupervised anomaly detection in Synthetic Aperture Radar (SAR) images. Our approach considers anomalies as abnormal patterns that deviate from their surroundings but without any prior knowledge of their characteristics. In the literature, most model-based algorithms face three main issues. First, the speckle noise corrupts the image and potentially leads to numerous false detections. Second, statistical approaches may exhibit deficiencies in modeling spatial correlation in SAR images. Finally, neural networks based on supervised learning approaches are not recommended due to the lack of annotated SAR data, notably for the class of abnormal patterns. Our proposed method aims to address these issues through a self-supervised algorithm. The speckle is first removed through the deep learning SAR2SAR algorithm. Then, an adversarial autoencoder is trained to reconstruct an anomaly-free SAR image. Finally, a change detection processing step is applied between the input and the output to detect anomalies. Experiments are performed to show the advantages of our method compared to the conventional Reed-Xiaoli algorithm, highlighting the importance of an efficient despeckling pre-processing step.

CVNov 5, 2024Code
Generative Artificial Intelligence Meets Synthetic Aperture Radar: A Survey

Zhongling Huang, Xidan Zhang, Zuqian Tang et al.

SAR images possess unique attributes that present challenges for both human observers and vision AI models to interpret, owing to their electromagnetic characteristics. The interpretation of SAR images encounters various hurdles, with one of the primary obstacles being the data itself, which includes issues related to both the quantity and quality of the data. The challenges can be addressed using generative AI technologies. Generative AI, often known as GenAI, is a very advanced and powerful technology in the field of artificial intelligence that has gained significant attention. The advancement has created possibilities for the creation of texts, photorealistic pictures, videos, and material in various modalities. This paper aims to comprehensively investigate the intersection of GenAI and SAR. First, we illustrate the common data generation-based applications in SAR field and compare them with computer vision tasks, analyzing the similarity, difference, and general challenges of them. Then, an overview of the latest GenAI models is systematically reviewed, including various basic models and their variations targeting the general challenges. Additionally, the corresponding applications in SAR domain are also included. Specifically, we propose to summarize the physical model based simulation approaches for SAR, and analyze the hybrid modeling methods that combine the GenAI and interpretable models. The evaluation methods that have been or could be applied to SAR, are also explored. Finally, the potential challenges and future prospects are discussed. To our best knowledge, this survey is the first exhaustive examination of the interdiscipline of SAR and GenAI, encompassing a wide range of topics, including deep neural networks, physical models, computer vision, and SAR images. The resources of this survey are open-source at \url{https://github.com/XAI4SAR/GenAIxSAR}.

CVMay 15, 2023
Artificial intelligence to advance Earth observation: : A review of models, recent trends, and pathways forward

Devis Tuia, Konrad Schindler, Begüm Demir et al.

Earth observation (EO) is a prime instrument for monitoring land and ocean processes, studying the dynamics at work, and taking the pulse of our planet. This article gives a bird's eye view of the essential scientific tools and approaches informing and supporting the transition from raw EO data to usable EO-based information. The promises, as well as the current challenges of these developments, are highlighted under dedicated sections. Specifically, we cover the impact of (i) Computer vision; (ii) Machine learning; (iii) Advanced processing and computing; (iv) Knowledge-based AI; (v) Explainable AI and causal inference; (vi) Physics-aware models; (vii) User-centric approaches; and (viii) the much-needed discussion of ethical and societal issues related to the massive use of ML technologies in EO.

IVOct 27, 2021
Physically Explainable CNN for SAR Image Classification

Zhongling Huang, Xiwen Yao, Ying Liu et al.

Integrating the special electromagnetic characteristics of Synthetic Aperture Radar (SAR) in deep neural networks is essential in order to enhance the explainability and physics awareness of deep learning. In this paper, we first propose a novel physically explainable convolutional neural network for SAR image classification, namely physics guided and injected learning (PGIL). It comprises three parts: (1) explainable models (XM) to provide prior physics knowledge, (2) physics guided network (PGN) to encode the knowledge into physics-aware features, and (3) physics injected network (PIN) to adaptively introduce the physics-aware features into classification pipeline for label prediction. A hybrid Image-Physics SAR dataset format is proposed for evaluation, with both Sentinel-1 and Gaofen-3 SAR data being experimented. The results show that the proposed PGIL substantially improve the classification performance in case of limited labeled data compared with the counterpart data-driven CNN and other pre-training methods. Additionally, the physics explanations are discussed to indicate the interpretability and the physical consistency preserved in the predictions. We deem the proposed method would promote the development of physically explainable deep learning in SAR image interpretation field.

CVAug 30, 2021
LUAI Challenge 2021 on Learning to Understand Aerial Images

Gui-Song Xia, Jian Ding, Ming Qian et al.

This report summarizes the results of Learning to Understand Aerial Images (LUAI) 2021 challenge held on ICCV 2021, which focuses on object detection and semantic segmentation in aerial images. Using DOTA-v2.0 and GID-15 datasets, this challenge proposes three tasks for oriented object detection, horizontal object detection, and semantic segmentation of common categories in aerial images. This challenge received a total of 146 registrations on the three tasks. Through the challenge, we hope to draw attention from a wide range of communities and call for more efforts on the problems of learning to understand aerial images.

CVApr 20, 2021
CrossATNet - A Novel Cross-Attention Based Framework for Sketch-Based Image Retrieval

Ushasi Chaudhuri, Biplab Banerjee, Avik Bhattacharya et al.

We propose a novel framework for cross-modal zero-shot learning (ZSL) in the context of sketch-based image retrieval (SBIR). Conventionally, the SBIR schema mainly considers simultaneous mappings among the two image views and the semantic side information. Therefore, it is desirable to consider fine-grained classes mainly in the sketch domain using highly discriminative and semantically rich feature space. However, the existing deep generative modeling-based SBIR approaches majorly focus on bridging the gaps between the seen and unseen classes by generating pseudo-unseen-class samples. Besides, violating the ZSL protocol by not utilizing any unseen-class information during training, such techniques do not pay explicit attention to modeling the discriminative nature of the shared space. Also, we note that learning a unified feature space for both the multi-view visual data is a tedious task considering the significant domain difference between sketches and color images. In this respect, as a remedy, we introduce a novel framework for zero-shot SBIR. While we define a cross-modal triplet loss to ensure the discriminative nature of the shared space, an innovative cross-modal attention learning strategy is also proposed to guide feature extraction from the image domain exploiting information from the respective sketch counterpart. In order to preserve the semantic consistency of the shared space, we consider a graph CNN-based module that propagates the semantic class topology to the shared space. To ensure an improved response time during inference, we further explore the possibility of representing the shared space in terms of hash codes. Experimental results obtained on the benchmark TU-Berlin and the Sketchy datasets confirm the superiority of CrossATNet in yielding state-of-the-art results.

CVFeb 24, 2021
Object Detection in Aerial Images: A Large-Scale Benchmark and Challenges

Jian Ding, Nan Xue, Gui-Song Xia et al.

In the past decade, object detection has achieved significant progress in natural images but not in aerial images, due to the massive variations in the scale and orientation of objects caused by the bird's-eye view of aerial images. More importantly, the lack of large-scale benchmarks has become a major obstacle to the development of object detection in aerial images (ODAI). In this paper,we present a large-scale Dataset of Object deTection in Aerial images (DOTA) and comprehensive baselines for ODAI. The proposed DOTA dataset contains 1,793,658 object instances of 18 categories of oriented-bounding-box annotations collected from 11,268 aerial images. Based on this large-scale and well-annotated dataset, we build baselines covering 10 state-of-the-art algorithms with over 70 configurations, where the speed and accuracy performances of each model have been evaluated. Furthermore, we provide a code library for ODAI and build a website for evaluating different algorithms. Previous challenges run on DOTA have attracted more than 1300 teams worldwide. We believe that the expanded large-scale DOTA dataset, the extensive baselines, the code library and the challenges can facilitate the designs of robust algorithms and reproducible research on the problem of object detection in aerial images.

CVAug 12, 2020
A Zero-Shot Sketch-based Inter-Modal Object Retrieval Scheme for Remote Sensing Images

Ushasi Chaudhuri, Biplab Banerjee, Avik Bhattacharya et al.

Conventional existing retrieval methods in remote sensing (RS) are often based on a uni-modal data retrieval framework. In this work, we propose a novel inter-modal triplet-based zero-shot retrieval scheme utilizing a sketch-based representation of RS data. The proposed scheme performs efficiently even when the sketch representations are marginally prototypical of the image. We conducted experiments on a new bi-modal image-sketch dataset called Earth on Canvas (EoC) conceived during this study. We perform a thorough bench-marking of this dataset and demonstrate that the proposed network outperforms other state-of-the-art methods for zero-shot sketch-based retrieval framework in remote sensing.

SPJan 6, 2020
Classification of Large-Scale High-Resolution SAR Images with Deep Transfer Learning

Zhongling Huang, Corneliu Octavian Dumitru, Zongxu Pan et al.

The classification of large-scale high-resolution SAR land cover images acquired by satellites is a challenging task, facing several difficulties such as semantic annotation with expertise, changing data characteristics due to varying imaging parameters or regional target area differences, and complex scattering mechanisms being different from optical imaging. Given a large-scale SAR land cover dataset collected from TerraSAR-X images with a hierarchical three-level annotation of 150 categories and comprising more than 100,000 patches, three main challenges in automatically interpreting SAR images of highly imbalanced classes, geographic diversity, and label noise are addressed. In this letter, a deep transfer learning method is proposed based on a similarly annotated optical land cover dataset (NWPU-RESISC45). Besides, a top-2 smooth loss function with cost-sensitive parameters was introduced to tackle the label noise and imbalanced classes' problems. The proposed method shows high efficiency in transferring information from a similarly annotated remote sensing dataset, a robust performance on highly imbalanced classes, and is alleviating the over-fitting problem caused by label noise. What's more, the learned deep model has a good generalization for other SAR-specific tasks, such as MSTAR target recognition with a state-of-the-art classification accuracy of 99.46%.

IVApr 9, 2019
CMIR-NET : A Deep Learning Based Model For Cross-Modal Retrieval In Remote Sensing

Ushasi Chaudhuri, Biplab Banerjee, Avik Bhattacharya et al.

We address the problem of cross-modal information retrieval in the domain of remote sensing. In particular, we are interested in two application scenarios: i) cross-modal retrieval between panchromatic (PAN) and multi-spectral imagery, and ii) multi-label image retrieval between very high resolution (VHR) images and speech based label annotations. Notice that these multi-modal retrieval scenarios are more challenging than the traditional uni-modal retrieval approaches given the inherent differences in distributions between the modalities. However, with the growing availability of multi-source remote sensing data and the scarcity of enough semantic annotations, the task of multi-modal retrieval has recently become extremely important. In this regard, we propose a novel deep neural network based architecture which is considered to learn a discriminative shared feature space for all the input modalities, suitable for semantically coherent information retrieval. Extensive experiments are carried out on the benchmark large-scale PAN - multi-spectral DSRSID dataset and the multi-label UC-Merced dataset. Together with the Merced dataset, we generate a corpus of speech signals corresponding to the labels. Superior performance with respect to the current state-of-the-art is observed in all the cases.

IVJul 20, 2018
Dialectical GAN for SAR Image Translation: From Sentinel-1 to TerraSAR-X

Dongyang Ao, Corneliu Octavian Dumitru, Gottfried Schwarz et al.

Contrary to optical images, Synthetic Aperture Radar (SAR) images are in different electromagnetic spectrum where the human visual system is not accustomed to. Thus, with more and more SAR applications, the demand for enhanced high-quality SAR images has increased considerably. However, high-quality SAR images entail high costs due to the limitations of current SAR devices and their image processing resources. To improve the quality of SAR images and to reduce the costs of their generation, we propose a Dialectical Generative Adversarial Network (Dialectical GAN) to generate high-quality SAR images. This method is based on the analysis of hierarchical SAR information and the "dialectical" structure of GAN frameworks. As a demonstration, a typical example will be shown where a low-resolution SAR image (e.g., a Sentinel-1 image) with large ground coverage is translated into a high-resolution SAR image (e.g., a TerraSAR-X image). Three traditional algorithms are compared, and a new algorithm is proposed based on a network framework by combining conditional WGAN-GP (Wasserstein Generative Adversarial Network - Gradient Penalty) loss functions and Spatial Gram matrices under the rule of dialectics. Experimental results show that the SAR image translation works very well when we compare the results of our proposed method with the selected traditional methods.

CVNov 28, 2017
DOTA: A Large-scale Dataset for Object Detection in Aerial Images

Gui-Song Xia, Xiang Bai, Jian Ding et al.

Object detection is an important and challenging problem in computer vision. Although the past decade has witnessed major advances in object detection in natural scenes, such successes have been slow to aerial imagery, not only because of the huge variation in the scale, orientation and shape of the object instances on the earth's surface, but also due to the scarcity of well-annotated datasets of objects in aerial scenes. To advance object detection research in Earth Vision, also known as Earth Observation and Remote Sensing, we introduce a large-scale Dataset for Object deTection in Aerial images (DOTA). To this end, we collect $2806$ aerial images from different sensors and platforms. Each image is of the size about 4000-by-4000 pixels and contains objects exhibiting a wide variety of scales, orientations, and shapes. These DOTA images are then annotated by experts in aerial image interpretation using $15$ common object categories. The fully annotated DOTA images contains $188,282$ instances, each of which is labeled by an arbitrary (8 d.o.f.) quadrilateral To build a baseline for object detection in Earth Vision, we evaluate state-of-the-art object detection algorithms on DOTA. Experiments demonstrate that DOTA well represents real Earth Vision applications and are quite challenging.

CVNov 6, 2017
Artificial Generation of Big Data for Improving Image Classification: A Generative Adversarial Network Approach on SAR Data

Dimitrios Marmanis, Wei Yao, Fathalrahman Adam et al.

Very High Spatial Resolution (VHSR) large-scale SAR image databases are still an unresolved issue in the Remote Sensing field. In this work, we propose such a dataset and use it to explore patch-based classification in urban and periurban areas, considering 7 distinct semantic classes. In this context, we investigate the accuracy of large CNN classification models and pre-trained networks for SAR imaging systems. Furthermore, we propose a Generative Adversarial Network (GAN) for SAR image generation and test, whether the synthetic data can actually improve classification accuracy.

CVJul 23, 2017
Exploiting Deep Features for Remote Sensing Image Retrieval: A Systematic Investigation

Xin-Yi Tong, Gui-Song Xia, Fan Hu et al.

Remote sensing (RS) image retrieval is of great significant for geological information mining. Over the past two decades, a large amount of research on this task has been carried out, which mainly focuses on the following three core issues: feature extraction, similarity metric and relevance feedback. Due to the complexity and multiformity of ground objects in high-resolution remote sensing (HRRS) images, there is still room for improvement in the current retrieval approaches. In this paper, we analyze the three core issues of RS image retrieval and provide a comprehensive review on existing methods. Furthermore, for the goal to advance the state-of-the-art in HRRS image retrieval, we focus on the feature extraction issue and delve how to use powerful deep representations to address this task. We conduct systematic investigation on evaluating correlative factors that may affect the performance of deep features. By optimizing each factor, we acquire remarkable retrieval results on publicly available HRRS datasets. Finally, we explain the experimental phenomenon in detail and draw conclusions according to our analysis. Our work can serve as a guiding role for the research of content-based RS image retrieval.

CVDec 5, 2016
Classification With an Edge: Improving Semantic Image Segmentation with Boundary Detection

Dimitrios Marmanis, Konrad Schindler, Jan Dirk Wegner et al.

We present an end-to-end trainable deep convolutional neural network (DCNN) for semantic segmentation with built-in awareness of semantically meaningful boundaries. Semantic segmentation is a fundamental remote sensing task, and most state-of-the-art methods rely on DCNNs as their workhorse. A major reason for their success is that deep networks learn to accumulate contextual information over very large windows (receptive fields). However, this success comes at a cost, since the associated loss of effecive spatial resolution washes out high-frequency details and leads to blurry object boundaries. Here, we propose to counter this effect by combining semantic segmentation with semantically informed edge detection, thus making class-boundaries explicit in the model, First, we construct a comparatively simple, memory-efficient model by adding boundary detection to the Segnet encoder-decoder architecture. Second, we also include boundary detection in FCN-type models and set up a high-end classifier ensemble. We show that boundary detection significantly improves semantic segmentation with CNNs. Our high-end ensemble achieves > 90% overall accuracy on the ISPRS Vaihingen benchmark.

CLFeb 14, 2014
Authorship Analysis based on Data Compression

Daniele Cerra, Mihai Datcu, Peter Reinartz

This paper proposes to perform authorship analysis using the Fast Compression Distance (FCD), a similarity measure based on compression with dictionaries directly extracted from the written texts. The FCD computes a similarity between two documents through an effective binary search on the intersection set between the two related dictionaries. In the reported experiments the proposed method is applied to documents which are heterogeneous in style, written in five different languages and coming from different historical periods. Results are comparable to the state of the art and outperform traditional compression-based methods.

IRJul 4, 2013
Further results on dissimilarity spaces for hyperspectral images RF-CBIR

Miguel Angel Veganzones, Mihai Datcu, Manuel Graña

Content-Based Image Retrieval (CBIR) systems are powerful search tools in image databases that have been little applied to hyperspectral images. Relevance feedback (RF) is an iterative process that uses machine learning techniques and user's feedback to improve the CBIR systems performance. We pursued to expand previous research in hyperspectral CBIR systems built on dissimilarity functions defined either on spectral and spatial features extracted by spectral unmixing techniques, or on dictionaries extracted by dictionary-based compressors. These dissimilarity functions were not suitable for direct application in common machine learning techniques. We propose to use a RF general approach based on dissimilarity spaces which is more appropriate for the application of machine learning algorithms to the hyperspectral RF-CBIR. We validate the proposed RF method for hyperspectral CBIR systems over a real hyperspectral dataset.

MLOct 2, 2012
A fast compression-based similarity measure with applications to content-based image retrieval

Daniele Cerra, Mihai Datcu

Compression-based similarity measures are effectively employed in applications on diverse data types with a basically parameter-free approach. Nevertheless, there are problems in applying these techniques to medium-to-large datasets which have been seldom addressed. This paper proposes a similarity measure based on compression with dictionaries, the Fast Compression Distance (FCD), which reduces the complexity of these methods, without degradations in performance. On its basis a content-based color image retrieval system is defined, which can be compared to state-of-the-art methods based on invariant color features. Through the FCD a better understanding of compression-based techniques is achieved, by performing experiments on datasets which are larger than the ones analyzed so far in literature.