CV | Aug 16, 2023 | Code
GPA-3D: Geometry-aware Prototype Alignment for Unsupervised Domain Adaptive 3D Object Detection from Point Clouds
Ziyu Li, Jingming Guo, Tongtong Cao et al.
LiDAR-based 3D detection has made great progress in recent years. However, the performance of 3D detectors is considerably limited when deployed in unseen environments, owing to the severe domain gap problem. Existing domain adaptive 3D detection methods do not adequately consider the problem of the distributional discrepancy in feature space, thereby hindering generalization of detectors across domains. In this work, we propose a novel unsupervised domain adaptive 3D detection framework, namely Geometry-aware Prototype Alignment (GPA-3D), which explicitly leverages the intrinsic geometric relationship from point cloud objects to reduce the feature discrepancy, thus facilitating cross-domain transfer. Specifically, GPA-3D assigns a series of tailored and learnable prototypes to point cloud objects with distinct geometric structures. Each prototype aligns BEV (bird's-eye-view) features derived from corresponding point cloud objects on source and target domains, reducing the distributional discrepancy and achieving better adaptation. The evaluation results obtained on various benchmarks, including Waymo, nuScenes and KITTI, demonstrate the superiority of our GPA-3D over the state-of-the-art approaches for different adaptation scenarios. The MindSpore version code will be publicly available at https://github.com/Liz66666/GPA3D.
CV | Jan 16, 2024 | Code
Forging Vision Foundation Models for Autonomous Driving: Challenges, Methodologies, and Opportunities
Xu Yan, Haiming Zhang, Yingjie Cai et al.
The rise of large foundation models, trained on extensive datasets, is revolutionizing the field of AI. Models such as SAM, DALL-E2, and GPT-4 showcase their adaptability by extracting intricate patterns and performing effectively across diverse tasks, thereby serving as potent building blocks for a wide range of AI applications. Autonomous driving, a vibrant front in AI applications, remains challenged by the lack of dedicated vision foundation models (VFMs). The scarcity of comprehensive training data, the need for multi-sensor integration, and the diverse task-specific architectures pose significant obstacles to the development of VFMs in this field. This paper delves into the critical challenge of forging VFMs tailored specifically for autonomous driving, while also outlining future directions. Through a systematic analysis of over 250 papers, we dissect essential techniques for VFM development, including data preparation, pre-training strategies, and downstream task adaptation. Moreover, we explore key advancements such as NeRF, diffusion models, 3D Gaussian Splatting, and world models, presenting a comprehensive roadmap for future research. To empower researchers, we have built and maintained https://github.com/zhanghm1995/Forge_VFM4AD, an open-access repository constantly updated with the latest advancements in forging VFMs for autonomous driving.
LG | Nov 1, 2024 | Code
MoNTA: Accelerating Mixture-of-Experts Training with Network-Traffic-Aware Parallel Optimization
Jingming Guo, Yan Liu, Yu Meng et al.
The Mixture of Experts (MoE) is an advanced model architecture in the industry that combines multiple specialized expert models from various domains into a single supermodel. This approach enables the model to scale without significantly increasing the computational costs of training and inference, while maximizing model performance. However, current distributed training frameworks do not fully optimize communication, especially for large base models. This paper proposes a network-traffic-aware parallel optimization method that selects the optimal parallel strategy based on the communication volume and the training cluster's inter-node and intra-node network topologies. Compared to DeepSpeed, MoNTA achieves an 8x increase in AllToAll communication performance under 8-card tensor parallelism. Compared to the baseline, training a 2x70B model on 16 A800 cards with an 8K sequence length yields a 13% improvement in overall latency. Project Page: https://github.com/EnflameTechnology/DeepSpeed.
CV | Oct 8, 2021
How to Build a Curb Dataset with LiDAR Data for Autonomous Driving
Dongfeng Bai, Tongtong Cao, Jingming Guo et al.
Curbs are one of the essential elements of urban and highway traffic environments. Robust curb detection provides road structure information for motion planning in an autonomous driving system. Commonly, video cameras and 3D LiDARs are mounted on autonomous vehicles for curb detection. However, camera-based methods suffer under challenging illumination conditions. For a long time, before Deep Neural Networks (DNNs) were widely applied to point clouds, LiDAR-based curb detection methods relied on hand-crafted features, which perform poorly in some complex scenes. Recently, DNN-based dynamic object detection using LiDAR data has become prevalent, while few works pay attention to curb detection with a DNN approach due to the lack of labeled data. A dataset with curb annotations, or an efficient curb labeling approach, is hence in high demand...
CR | Jan 7, 2020
Provenance-based Classification Policy based on Encrypted Search
Xinyu Fan, Faen Zhang, Jiahong Wu et al.
As an important type of cloud data, digital provenance is attracting increasing attention as a means of improving system performance. Currently, provenance has been employed to provide cues regarding access control and to estimate data quality. However, provenance itself might also be sensitive information. Therefore, provenance might be encrypted and stored in the cloud. In this paper, we provide a mechanism to classify cloud documents by searching for specific keywords in their encrypted provenance, and we prove that our scheme achieves semantic security. In terms of application, files are classified so that they can be stored separately in the cloud, which facilitates their regulation and security protection, and the classification policies can use provenance as conditions to determine the category of a document. The simplest sample policy reads: documents that have been reviewed twice can be classified as "publicly accessible", meaning they can be accessed by the public.
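The sample policy at the end of this abstract can be illustrated with a minimal sketch. It operates on plaintext provenance records only and does not model the encrypted keyword search that the paper is about. The record fields, the `"review"` action name, and the fallback label `"restricted"` are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProvenanceRecord:
    """One historical operation on a document (hypothetical schema)."""
    action: str
    agent: str


def classify(provenance, min_reviews=2):
    """Apply the sample policy: enough reviews makes a document public.

    The "restricted" fallback label is an assumption; the abstract only
    names the public category.
    """
    reviews = sum(1 for record in provenance if record.action == "review")
    return "publicly accessible" if reviews >= min_reviews else "restricted"


doc_a = [
    ProvenanceRecord("create", "alice"),
    ProvenanceRecord("review", "bob"),
    ProvenanceRecord("review", "carol"),
]
doc_b = [
    ProvenanceRecord("create", "alice"),
    ProvenanceRecord("review", "bob"),
]

label_a = classify(doc_a)
label_b = classify(doc_b)
print(label_a, "/", label_b)
```

In the scheme described above, the count of review events would presumably be obtained by keyword search over encrypted provenance rather than by reading the records directly.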
CR | Jan 7, 2020
A fine-grained policy model for Provenance-based Access Control and Policy Algebras
Xinyu Fan, Faen Zhang, Jianfei Song et al.
A fine-grained provenance-based access control policy model is proposed in this paper in order to improve the expressiveness of existing models. The method employs provenance as conditions to determine whether a piece of data can be accessed, because historical operations performed on data can reveal clues about its sensitivity and vulnerability. In particular, the proposed work provides a four-valued decision set that allows the status of matching a restriction to be reported precisely. The framework consists of a target policy, an access control policy, and policy algebras. With the complete definition and the construction of the algebra system, a practical fine-grained access control policy model is developed.
CR | Dec 1, 2019
PACLP: a fine-grained partition-based access control policy language for provenance
Xinyu Fan, Faen Zhang, Jianfei Song et al.
Even though the idea of partitioning provenance graphs for access control was previously proposed, employing segments of the provenance DAG for fine-grained access control to provenance data has not been thoroughly explored. Hence, we take segments of a provenance graph based on the extended OPM (Open Provenance Model), define them using a variant of regular expressions, and utilize them in our fine-grained access control language. The language can not only return partial graphs to answer access requests but also introduce segments as restrictions in order to screen targeted data.
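To make the idea of regular-expression-defined segments concrete, here is a minimal sketch and not the PACLP language itself. It assumes a tiny hand-made provenance DAG whose edges carry standard OPM dependency names, and it uses Python's ordinary `re` syntax over space-separated edge labels as a stand-in for the paper's regular-expression variant. The node names and the helpers `paths_from` and `match_segment` are hypothetical.

```python
import re

# Edges as (source, label, target), pointing from effect to cause as in OPM.
EDGES = [
    ("report_v2", "wasGeneratedBy", "edit"),
    ("edit", "used", "report_v1"),
    ("report_v1", "wasGeneratedBy", "draft"),
    ("edit", "wasControlledBy", "alice"),
]


def paths_from(node, edges):
    """Yield every non-empty path starting at node (terminates on a DAG)."""
    for edge in edges:
        if edge[0] == node:
            yield [edge]
            for rest in paths_from(edge[2], edges):
                yield [edge] + rest


def match_segment(start, pattern, edges):
    """Return the paths from start whose label sequence fully matches pattern."""
    compiled = re.compile(pattern)
    matched = []
    for path in paths_from(start, edges):
        labels = " ".join(label for _, label, _ in path)
        if compiled.fullmatch(labels):
            matched.append(path)
    return matched


# Segment: how report_v2 was produced, optionally one generation step further.
pattern = r"wasGeneratedBy used( wasGeneratedBy)?"
segment = match_segment("report_v2", pattern, EDGES)
for path in segment:
    print(" -> ".join([path[0][0]] + [target for _, _, target in path]))
```

The matched paths form a partial graph that could be returned in answer to a request or used as a restriction. The `wasControlledBy` edge to `alice` falls outside the pattern and is therefore not exposed.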