Pengbo Zhao

CV
3papers
136citations
Novelty48%
AI Score24

3 Papers

CVMar 17, 2022
Co-visual pattern augmented generative transformer learning for automobile geo-localization

Jianwei Zhao, Qiang Zhai, Pengbo Zhao et al.

Geolocation is a fundamental component of route planning and navigation for unmanned vehicles, but GNSS-based geolocation fails under denial-of-service conditions. Cross-view geo-localization (CVGL), which aims to estimate the geographical location of the ground-level camera by matching against enormous geo-tagged aerial (\emph{e.g.}, satellite) images, has received lots of attention but remains extremely challenging due to the drastic appearance differences across aerial-ground views. In existing methods, global representations of different views are extracted primarily using Siamese-like architectures, but their interactive benefits are seldom taken into account. In this paper, we present a novel approach using cross-view knowledge generative techniques in combination with transformers, namely mutual generative transformer learning (MGTL), for CVGL. Specifically, by taking the initial representations produced by the backbone network, MGTL develops two separate generative sub-modules -- one for aerial-aware knowledge generation from ground-view semantics and vice versa -- and fully exploits the entirely mutual benefits through the attention mechanism. Moreover, to better capture the co-visual relationships between aerial and ground views, we introduce a cascaded attention masking algorithm to further boost accuracy. Extensive experiments on challenging public benchmarks, \emph{i.e.}, {CVACT} and {CVUSA}, demonstrate the effectiveness of the proposed method which sets new records compared with the existing state-of-the-art models.

CVOct 17, 2020
PolarDet: A Fast, More Precise Detector for Rotated Target in Aerial Images

Pengbo Zhao, Zhenshen Qu, Yingjia Bu et al.

Fast and precise object detection for high-resolution aerial images has been a challenging task over the years. Due to the sharp variations on object scale, rotation, and aspect ratio, most existing methods are inefficient and imprecise. In this paper, we represent the oriented objects by polar method in polar coordinate and propose PolarDet, a fast and accurate one-stage object detector based on that representation. Our detector introduces a sub-pixel center semantic structure to further improve classifying veracity. PolarDet achieves nearly all SOTA performance in aerial object detection tasks with faster inference speed. In detail, our approach obtains the SOTA results on DOTA, UCAS-AOD, HRSC with 76.64\% mAP, 97.01\% mAP, and 90.46\% mAP respectively. Most noticeably, our PolarDet gets the best performance and reaches the fastest speed(32fps) at the UCAS-AOD dataset.

CVMar 17, 2020
Deep Active Learning for Remote Sensing Object Detection

Zhenshen Qu, Jingda Du, Yong Cao et al.

Recently, CNN object detectors have achieved high accuracy on remote sensing images but require huge labor and time costs on annotation. In this paper, we propose a new uncertainty-based active learning which can select images with more information for annotation and detector can still reach high performance with a fraction of the training images. Our method not only analyzes objects' classification uncertainty to find least confident objects but also considers their regression uncertainty to declare outliers. Besides, we bring out two extra weights to overcome two difficulties in remote sensing datasets, class-imbalance and difference in images' objects amount. We experiment our active learning algorithm on DOTA dataset with CenterNet as object detector. We achieve same-level performance as full supervision with only half images. We even override full supervision with 55% images and augmented weights on least confident images.