Dawei Zhao

h-index28

8papers

347citations

Novelty42%

AI Score30

Ranked #136,220 of 194,257 authors (top 70%)#44,891 in CV (top 76%)

8 Papers

24.1CVJun 20, 2022Code

Occupancy-MAE: Self-supervised Pre-training Large-scale LiDAR Point Clouds with Masked Occupancy Autoencoders

Chen Min, Xinli Xu, Dawei Zhao et al.

Current perception models in autonomous driving heavily rely on large-scale labelled 3D data, which is both costly and time-consuming to annotate. This work proposes a solution to reduce the dependence on labelled 3D training data by leveraging pre-training on large-scale unlabeled outdoor LiDAR point clouds using masked autoencoders (MAE). While existing masked point autoencoding methods mainly focus on small-scale indoor point clouds or pillar-based large-scale outdoor LiDAR data, our approach introduces a new self-supervised masked occupancy pre-training method called Occupancy-MAE, specifically designed for voxel-based large-scale outdoor LiDAR point clouds. Occupancy-MAE takes advantage of the gradually sparse voxel occupancy structure of outdoor LiDAR point clouds and incorporates a range-aware random masking strategy and a pretext task of occupancy prediction. By randomly masking voxels based on their distance to the LiDAR and predicting the masked occupancy structure of the entire 3D surrounding scene, Occupancy-MAE encourages the extraction of high-level semantic information to reconstruct the masked voxel using only a small number of visible voxels. Extensive experiments demonstrate the effectiveness of Occupancy-MAE across several downstream tasks. For 3D object detection, Occupancy-MAE reduces the labelled data required for car detection on the KITTI dataset by half and improves small object detection by approximately 2% in AP on the Waymo dataset. For 3D semantic segmentation, Occupancy-MAE outperforms training from scratch by around 2% in mIoU. For multi-object tracking, Occupancy-MAE enhances training from scratch by approximately 1% in terms of AMOTA and AMOTP. Codes are publicly available at https://github.com/chaytonmin/Occupancy-MAE.

17.8CVAug 14, 2023Code

UniWorld: Autonomous Driving Pre-training via World Models

Chen Min, Dawei Zhao, Liang Xiao et al.

In this paper, we draw inspiration from Alberto Elfes' pioneering work in 1989, where he introduced the concept of the occupancy grid as World Models for robots. We imbue the robot with a spatial-temporal world model, termed UniWorld, to perceive its surroundings and predict the future behavior of other participants. UniWorld involves initially predicting 4D geometric occupancy as the World Models for foundational stage and subsequently fine-tuning on downstream tasks. UniWorld can estimate missing information concerning the world state and predict plausible future states of the world. Besides, UniWorld's pre-training process is label-free, enabling the utilization of massive amounts of image-LiDAR pairs to build a Foundational Model.The proposed unified pre-training framework demonstrates promising results in key tasks such as motion prediction, multi-camera 3D object detection, and surrounding semantic scene completion. When compared to monocular pre-training methods on the nuScenes dataset, UniWorld shows a significant improvement of about 1.5% in IoU for motion prediction, 2.0% in mAP and 2.0% in NDS for multi-camera 3D object detection, as well as a 3% increase in mIoU for surrounding semantic scene completion. By adopting our unified pre-training method, a 25% reduction in 3D training annotation costs can be achieved, offering significant practical value for the implementation of real-world autonomous driving. Codes are publicly available at https://github.com/chaytonmin/UniWorld.

2.6CVNov 13, 2022

Adversarial and Random Transformations for Robust Domain Adaptation and Generalization

Liang Xiao, Jiaolong Xu, Dawei Zhao et al. · berkeley

Data augmentation has been widely used to improve generalization in training deep neural networks. Recent works show that using worst-case transformations or adversarial augmentation strategies can significantly improve the accuracy and robustness. However, due to the non-differentiable properties of image transformations, searching algorithms such as reinforcement learning or evolution strategy have to be applied, which are not computationally practical for large scale problems. In this work, we show that by simply applying consistency training with random data augmentation, state-of-the-art results on domain adaptation (DA) and generalization (DG) can be obtained. To further improve the accuracy and robustness with adversarial examples, we propose a differentiable adversarial data augmentation method based on spatial transformer networks (STN). The combined adversarial and random transformations based method outperforms the state-of-the-art on multiple DA and DG benchmark datasets. Besides, the proposed method shows desirable robustness to corruption, which is also validated on commonly used datasets.

13.6CVMay 30, 2023Code

UniScene: Multi-Camera Unified Pre-training via 3D Scene Reconstruction for Autonomous Driving

Chen Min, Liang Xiao, Dawei Zhao et al.

Multi-camera 3D perception has emerged as a prominent research field in autonomous driving, offering a viable and cost-effective alternative to LiDAR-based solutions. The existing multi-camera algorithms primarily rely on monocular 2D pre-training. However, the monocular 2D pre-training overlooks the spatial and temporal correlations among the multi-camera system. To address this limitation, we propose the first multi-camera unified pre-training framework, called UniScene, which involves initially reconstructing the 3D scene as the foundational stage and subsequently fine-tuning the model on downstream tasks. Specifically, we employ Occupancy as the general representation for the 3D scene, enabling the model to grasp geometric priors of the surrounding world through pre-training. A significant benefit of UniScene is its capability to utilize a considerable volume of unlabeled image-LiDAR pairs for pre-training purposes. The proposed multi-camera unified pre-training framework demonstrates promising results in key tasks such as multi-camera 3D object detection and surrounding semantic scene completion. When compared to monocular pre-training methods on the nuScenes dataset, UniScene shows a significant improvement of about 2.0% in mAP and 2.0% in NDS for multi-camera 3D object detection, as well as a 3% increase in mIoU for surrounding semantic scene completion. By adopting our unified pre-training method, a 25% reduction in 3D training annotation costs can be achieved, offering significant practical value for the implementation of real-world autonomous driving. Codes are publicly available at https://github.com/chaytonmin/UniScene.

4.7CVMay 9, 2021Code

Trajectory Prediction for Autonomous Driving with Topometric Map

Jiaolong Xu, Liang Xiao, Dawei Zhao et al.

State-of-the-art autonomous driving systems rely on high definition (HD) maps for localization and navigation. However, building and maintaining HD maps is time-consuming and expensive. Furthermore, the HD maps assume structured environment such as the existence of major road and lanes, which are not present in rural areas. In this work, we propose an end-to-end transformer networks based approach for map-less autonomous driving. The proposed model takes raw LiDAR data and noisy topometric map as input and produces precise local trajectory for navigation. We demonstrate the effectiveness of our method in real-world driving data, including both urban and rural areas. The experimental results show that the proposed method outperforms state-of-the-art multimodal methods and is robust to the perturbations of the topometric map. The code of the proposed method is publicly available at \url{https://github.com/Jiaolong/trajectory-prediction}.

1.2SISep 10, 2015

The robustness of multiplex networks under layer node-based attack

Dawei Zhao, Lianhai Wang, Zhen Wang

From transportation networks to complex infrastructures, and to social and economic networks, a large variety of systems can be described in terms of multiplex networks formed by a set of nodes interacting through different network layers. Network robustness, as one of the most successful application areas of complex networks, has also attracted great interest in both theoretical and empirical researches. However, the vast majority of existing researches mainly focus on the robustness of single-layer networks an interdependent networks, how multiplex networks respond to potential attack is still short of further exploration. Here we study the robustness of multiplex networks under two attack strategies: layer node-based random attack and layer node-based targeted attack. A theoretical analysis framework is proposed to calculate the critical threshold and the size of giant component of multiplex networks when a fraction of layer nodes are removed randomly or intentionally. Via numerous simulations, it is unveiled that the theoretical method can accurately predict the threshold and the size of giant component, irrespective of attack strategies. Moreover, we also compare the robustness of multiplex networks under multiplex node-based attack and layer node-based attack, and find that layer node-based attack makes multiplex networks more vulnerable, regardless of average degree and underlying topology. Our finding may shed new light on the protection of multiplex networks.

3.7CRJun 20, 2013

A secure and effective anonymous authentication scheme for roaming service in global mobility networks

Dawei Zhao, Haipeng Peng, Lixiang Li et al.

Recently, Mun et al. analyzed Wu et al.'s authentication scheme and proposed a enhanced anonymous authentication scheme for roaming service in global mobility networks. However, through careful analysis, we find that Mun et al.'s scheme is vulnerable to impersonation attacks, off-line password guessing attacks and insider attacks, and cannot provide user friendliness, user's anonymity, proper mutual authentication and local verification. To remedy these weaknesses, in this paper we propose a novel anonymous authentication scheme for roaming service in global mobility networks. Security and performance analyses show the proposed scheme is more suitable for the low-power and resource-limited mobile devices, and is secure against various attacks and has many excellent features.

3.7CRMay 28, 2013

An efficient dynamic ID based remote user authentication scheme using self-certified public keys for multi-server environment

Dawei Zhao, Haipeng Peng, Shudong Li et al.

Recently, Li et al. analyzed Lee et al.'s multi-server authentication scheme and proposed a novel smart card and dynamic ID based remote user authentication scheme for multi-server environments. They claimed that their scheme can resist several kinds of attacks. However, through careful analysis, we find that Li et al.'s scheme is vulnerable to stolen smart card and offline dictionary attack, replay attack, impersonation attack and server spoofing attack. By analyzing other similar schemes, we find that the certain type of dynamic ID based multi-server authentication scheme in which only hash functions are used and no registration center participates in the authentication and session key agreement phase is hard to provide perfect efficient and secure authentication. To compensate for these shortcomings, we improve the recently proposed Liao et al.'s multi-server authentication scheme which is based on pairing and self-certified public keys, and propose a novel dynamic ID based remote user authentication scheme for multi-server environments. Liao et al.'s scheme is found vulnerable to offline dictionary attack and denial of service attack, and cannot provide user's anonymity and local password verification. However, our proposed scheme overcomes the shortcomings of Liao et al.'s scheme. Security and performance analyses show the proposed scheme is secure against various attacks and has many excellent features.