Fabian Burger

CV
5papers
107citations
Novelty55%
AI Score29

5 Papers

CVOct 17, 2023
Holistic Parking Slot Detection with Polygon-Shaped Representations

Lihao Wang, Antonyo Musabini, Christel Leonet et al.

Current parking slot detection in advanced driver-assistance systems (ADAS) primarily relies on ultrasonic sensors. This method has several limitations such as the need to scan the entire parking slot before detecting it, the incapacity of detecting multiple slots in a row, and the difficulty of classifying them. Due to the complex visual environment, vehicles are equipped with surround view camera systems to detect vacant parking slots. Previous research works in this field mostly use image-domain models to solve the problem. These two-stage approaches separate the 2D detection and 3D pose estimation steps using camera calibration. In this paper, we propose one-step Holistic Parking Slot Network (HPS-Net), a tailor-made adaptation of the You Only Look Once (YOLO)v4 algorithm. This camera-based approach directly outputs the four vertex coordinates of the parking slot in topview domain, instead of a bounding box in raw camera images. Several visible points and shapes can be proposed from different angles. A novel regression loss function named polygon-corner Generalized Intersection over Union (GIoU) for polygon vertex position optimization is also proposed to manage the slot orientation and to distinguish the entrance line. Experiments show that HPS-Net can detect various vacant parking slots with a F1-score of 0.92 on our internal Valeo Parking Slots Dataset (VPSD) and 0.99 on the public dataset PS2.0. It provides a satisfying generalization and robustness in various parking scenarios, such as indoor (F1: 0.86) or paved ground (F1: 0.91). Moreover, it achieves a real-time detection speed of 17 FPS on Nvidia Drive AGX Xavier. A demo video can be found at https://streamable.com/75j7sj.

CVAug 22, 2024
Enhanced Parking Perception by Multi-Task Fisheye Cross-view Transformers

Antonyo Musabini, Ivan Novikov, Sana Soula et al.

Current parking area perception algorithms primarily focus on detecting vacant slots within a limited range, relying on error-prone homographic projection for both labeling and inference. However, recent advancements in Advanced Driver Assistance System (ADAS) require interaction with end-users through comprehensive and intelligent Human-Machine Interfaces (HMIs). These interfaces should present a complete perception of the parking area going from distinguishing vacant slots' entry lines to the orientation of other parked vehicles. This paper introduces Multi-Task Fisheye Cross View Transformers (MT F-CVT), which leverages features from a four-camera fisheye Surround-view Camera System (SVCS) with multihead attentions to create a detailed Bird-Eye View (BEV) grid feature map. Features are processed by both a segmentation decoder and a Polygon-Yolo based object detection decoder for parking slots and vehicles. Trained on data labeled using LiDAR, MT F-CVT positions objects within a 25m x 25m real open-road scenes with an average error of only 20 cm. Our larger model achieves an F-1 score of 0.89. Moreover the smaller model operates at 16 fps on an Nvidia Jetson Orin embedded board, with similar detection results to the larger one. MT F-CVT demonstrates robust generalization capability across different vehicles and camera rig configurations. A demo video from an unseen vehicle and camera rig is available at: https://streamable.com/jjw54x.

CVJul 1, 2020
TiledSoilingNet: Tile-level Soiling Detection on Automotive Surround-view Cameras Using Coverage Metric

Arindam Das, Pavel Krizek, Ganesh Sistu et al.

Automotive cameras, particularly surround-view cameras, tend to get soiled by mud, water, snow, etc. For higher levels of autonomous driving, it is necessary to have a soiling detection algorithm which will trigger an automatic cleaning system. Localized detection of soiling in an image is necessary to control the cleaning system. It is also necessary to enable partial functionality in unsoiled areas while reducing confidence in soiled areas. Although this can be solved using a semantic segmentation task, we explore a more efficient solution targeting deployment in low power embedded system. We propose a novel method to regress the area of each soiling type within a tile directly. We refer to this as coverage. The proposed approach is better than learning the dominant class in a tile as multiple soiling types occur within a tile commonly. It also has the advantage of dealing with coarse polygon annotation, which will cause the segmentation task. The proposed soiling coverage decoder is an order of magnitude faster than an equivalent segmentation decoder. We also integrated it into an object detection and semantic segmentation multi-task model using an asynchronous back-propagation algorithm. A portion of the dataset used will be released publicly as part of our WoodScape dataset to encourage further research.

CVJan 7, 2020
Dynamic Task Weighting Methods for Multi-task Networks in Autonomous Driving Systems

Isabelle Leang, Ganesh Sistu, Fabian Burger et al.

Deep multi-task networks are of particular interest for autonomous driving systems. They can potentially strike an excellent trade-off between predictive performance, hardware constraints and efficient use of information from multiple types of annotations and modalities. However, training such models is non-trivial and requires balancing learning over all tasks as their respective losses display different scales, ranges and dynamics across training. Multiple task weighting methods that adjust the losses in an adaptive way have been proposed recently on different datasets and combinations of tasks, making it difficult to compare them. In this work, we review and systematically evaluate nine task weighting strategies on common grounds on three automotive datasets (KITTI, Cityscapes and WoodScape). We then propose a novel method combining evolutionary meta-learning and task-based selective backpropagation, for computing task weights leading to reliable network training. Our method outperforms state-of-the-art methods by a significant margin on a two-task application.

CVDec 4, 2019
Let's Get Dirty: GAN Based Data Augmentation for Camera Lens Soiling Detection in Autonomous Driving

Michal Uricar, Ganesh Sistu, Hazem Rashed et al.

Wide-angle fisheye cameras are commonly used in automated driving for parking and low-speed navigation tasks. Four of such cameras form a surround-view system that provides a complete and detailed view of the vehicle. These cameras are directly exposed to harsh environmental settings and can get soiled very easily by mud, dust, water, frost. Soiling on the camera lens can severely degrade the visual perception algorithms, and a camera cleaning system triggered by a soiling detection algorithm is increasingly being deployed. While adverse weather conditions, such as rain, are getting attention recently, there is only limited work on general soiling. The main reason is the difficulty in collecting a diverse dataset as it is a relatively rare event. We propose a novel GAN based algorithm for generating unseen patterns of soiled images. Additionally, the proposed method automatically provides the corresponding soiling masks eliminating the manual annotation cost. Augmentation of the generated soiled images for training improves the accuracy of soiling detection tasks significantly by 18% demonstrating its usefulness. The manually annotated soiling dataset and the generated augmentation dataset will be made public. We demonstrate the generalization of our fisheye trained GAN model on the Cityscapes dataset. We provide an empirical evaluation of the degradation of the semantic segmentation algorithm with the soiled data.