IVJul 14, 2021
Multi-Attention Generative Adversarial Network for Remote Sensing Image Super-ResolutionMeng Xu, Zhihao Wang, Jiasong Zhu et al.
Image super-resolution (SR) methods can generate remote sensing images with high spatial resolution without increasing the cost, thereby providing a feasible way to acquire high-resolution remote sensing images, which are difficult to obtain due to the high cost of acquisition equipment and complex weather. Clearly, image super-resolution is a severe ill-posed problem. Fortunately, with the development of deep learning, the powerful fitting ability of deep neural networks has solved this problem to some extent. In this paper, we propose a network based on the generative adversarial network (GAN) to generate high resolution remote sensing images, named the multi-attention generative adversarial network (MA-GAN). We first designed a GAN-based framework for the image SR task. The core to accomplishing the SR task is the image generator with post-upsampling that we designed. The main body of the generator contains two blocks; one is the pyramidal convolution in the residual-dense block (PCRDB), and the other is the attention-based upsample (AUP) block. The attentioned pyramidal convolution (AttPConv) in the PCRDB block is a module that combines multi-scale convolution and channel attention to automatically learn and adjust the scaling of the residuals for better results. The AUP block is a module that combines pixel attention (PA) to perform arbitrary multiples of upsampling. These two blocks work together to help generate better quality images. For the loss function, we design a loss function based on pixel loss and introduce both adversarial loss and feature loss to guide the generator learning. We have compared our method with several state-of-the-art methods on a remote sensing scene image dataset, and the experimental results consistently demonstrate the effectiveness of the proposed MA-GAN.
CVNov 9, 2020
Deep Learning based Monocular Depth Prediction: Datasets, Methods and ApplicationsQing Li, Jiasong Zhu, Jun Liu et al.
Estimating depth from RGB images can facilitate many computer vision tasks, such as indoor localization, height estimation, and simultaneous localization and mapping (SLAM). Recently, monocular depth estimation has obtained great progress owing to the rapid development of deep learning techniques. They surpass traditional machine learning-based methods by a large margin in terms of accuracy and speed. Despite the rapid progress in this topic, there are lacking of a comprehensive review, which is needed to summarize the current progress and provide the future directions. In this survey, we first introduce the datasets for depth estimation, and then give a comprehensive introduction of the methods from three perspectives: supervised learning-based methods, unsupervised learning-based methods, and sparse samples guidance-based methods. In addition, downstream applications that benefit from the progress have also been illustrated. Finally, we point out the future directions and conclude the paper.
CVFeb 15, 2019
Enhancing Remote Sensing Image Retrieval with Triplet Deep Metric Learning NetworkRui Cao, Qian Zhang, Jiasong Zhu et al.
With the rapid growing of remotely sensed imagery data, there is a high demand for effective and efficient image retrieval tools to manage and exploit such data. In this letter, we present a novel content-based remote sensing image retrieval method based on Triplet deep metric learning convolutional neural network (CNN). By constructing a Triplet network with metric learning objective function, we extract the representative features of the images in a semantic space in which images from the same class are close to each other while those from different classes are far apart. In such a semantic space, simple metric measures such as Euclidean distance can be used directly to compare the similarity of images and effectively retrieve images of the same class. We also investigate a supervised and an unsupervised learning methods for reducing the dimensionality of the learned semantic features. We present comprehensive experimental results on two publicly available remote sensing image retrieval datasets and show that our method significantly outperforms state-of-the-art.
CVJan 4, 2019
Relative Geometry-Aware Siamese Neural Network for 6DOF Camera RelocalizationQing Li, Jiasong Zhu, Rui Cao et al.
6DOF camera relocalization is an important component of autonomous driving and navigation. Deep learning has recently emerged as a promising technique to tackle this problem. In this paper, we present a novel relative geometry-aware Siamese neural network to enhance the performance of deep learning-based methods through explicitly exploiting the relative geometry constraints between images. We perform multi-task learning and predict the absolute and relative poses simultaneously. We regularize the shared-weight twin networks in both the pose and feature domains to ensure that the estimated poses are globally as well as locally correct. We employ metric learning and design a novel adaptive metric distance loss to learn a feature that is capable of distinguishing poses of visually similar images from different locations. We evaluate the proposed method on public indoor and outdoor benchmarks and the experimental results demonstrate that our method can significantly improve localization performance. Furthermore, extensive ablation evaluations are conducted to demonstrate the effectiveness of different terms of the loss function.