CV · LG · Oct 23, 2019

Using Segmentation Masks in the ICCV 2019 Learning to Drive Challenge

arXiv:1910.10317v1
Originality: Synthesis-oriented
AI Analysis

This work addresses autonomous driving control, motivated by improved safety and efficiency. Its contribution is incremental: it builds on existing segmentation and ensembling techniques.

The paper predicts vehicle speed and steering angle from camera images. It augments the raw images with segmentation masks and mirror images and ensembles three neural network models, achieving the second-best MSE for steering angle and second place overall in the ICCV 2019 Learning to Drive challenge.

In this work we predict vehicle speed and steering angle given camera image frames. Our key contribution is using an external pre-trained neural network for segmentation. We augment the raw images with their segmentation masks and mirror images. We ensemble three diverse neural network models: (i) a CNN using a single image and its segmentation mask, (ii) a stacked CNN taking as input a sequence of images and segmentation masks, and (iii) a bidirectional GRU over image features extracted using a pre-trained ResNet34, DenseNet121, and our own single-image CNN model. We achieve the second-best performance for MSE angle and the second-best performance overall, winning 2nd place in the ICCV Learning to Drive challenge. We make our models and code publicly available.
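The augmentation and ensembling steps described in the abstract can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' code: the abstract does not say how masks are combined with images, how labels are treated under mirroring, or how the three models are combined. Here the mask is assumed to be stacked as an extra channel, mirroring is assumed to negate the steering angle while leaving speed unchanged, and the ensemble is assumed to be a (optionally weighted) average. All function names are hypothetical.

```python
import numpy as np


def add_mask_channel(image, mask):
    """Stack a segmentation mask onto an RGB image as a fourth channel.

    Assumption: channel concatenation; the source does not specify
    how the mask is fed to the network.
    image: (H, W, 3) array, mask: (H, W) array of class ids.
    """
    if image.shape[:2] != mask.shape:
        raise ValueError("image and mask must share height and width")
    return np.concatenate([image, mask[..., None].astype(image.dtype)], axis=-1)


def mirror_sample(x, speed, angle):
    """Flip a sample left-to-right along the width axis.

    Assumption: a mirrored scene keeps the same speed and reverses
    the sign of the steering angle.
    """
    return x[:, ::-1, :].copy(), speed, -angle


def ensemble(predictions, weights=None):
    """Average per-model predictions of shape (n_models, n_samples, 2).

    Assumption: a plain or weighted mean; the last axis holds
    (speed, angle).
    """
    preds = np.asarray(predictions, dtype=float)
    return np.average(preds, axis=0, weights=weights)


def mse(y_true, y_pred):
    """Mean squared error, the metric named in the abstract."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(diff ** 2))
```

In this sketch a training sample would pass through `add_mask_channel` first and `mirror_sample` second, so the image and its mask are flipped together and stay aligned.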

Code Implementations: 1 repo
