Single Level Feature-to-Feature Forecasting with Deformable Convolutions
This paper addresses the problem of anticipating the semantic segmentation of future frames for autonomous driving systems, and represents an incremental improvement over existing methods.
The paper tackles semantic segmentation forecasting in driving scenarios by using deformable convolutions for feature-to-feature prediction, achieving state-of-the-art performance on the Cityscapes validation set when forecasting nine timesteps ahead.
Future anticipation is of vital importance in autonomous driving and other decision-making systems. We present a method to anticipate the semantic segmentation of future frames in driving scenarios based on feature-to-feature forecasting. Our method builds on a semantic segmentation model without lateral connections within the upsampling path. This design ensures that forecasting addresses only the most abstract features, at a very coarse resolution. We further propose to express feature-to-feature forecasting with deformable convolutions. This increases modelling power, because deformable convolutions can represent different motion patterns within a single feature map. Experiments show that our models with deformable convolutions outperform their regular and dilated counterparts while minimally increasing the number of parameters. Our method achieves state-of-the-art performance on the Cityscapes validation set when forecasting nine timesteps into the future.
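To illustrate why deformable convolutions can represent different motion patterns within one feature map, the following is a minimal pure-Python sketch, not the paper's implementation. It assumes a single-channel feature map and hand-set sampling offsets; in a real deformable convolution the offsets would be predicted from the input by a learned layer. The names bilinear, deform_conv2d, fmap, and offsets are hypothetical and chosen only for this example.

```python
import math

def bilinear(fmap, y, x):
    """Sample a 2-D feature map at fractional (y, x); zero outside bounds."""
    h, w = len(fmap), len(fmap[0])
    y0, x0 = math.floor(y), math.floor(x)
    val = 0.0
    for yy, wy in ((y0, 1 - (y - y0)), (y0 + 1, y - y0)):
        for xx, wx in ((x0, 1 - (x - x0)), (x0 + 1, x - x0)):
            if 0 <= yy < h and 0 <= xx < w:
                val += wy * wx * fmap[yy][xx]
    return val

def deform_conv2d(fmap, weight, offsets):
    """Single-channel deformable convolution with zero padding.

    weight  : k x k kernel (k odd).
    offsets : offsets[i][j] holds k*k (dy, dx) pairs, one per kernel tap,
              added to the regular sampling grid at output location (i, j).
    """
    k = len(weight)
    r = k // 2
    h, w = len(fmap), len(fmap[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for a in range(k):
                for b in range(k):
                    dy, dx = offsets[i][j][a * k + b]
                    acc += weight[a][b] * bilinear(
                        fmap, i + a - r + dy, j + b - r + dx)
            out[i][j] = acc
    return out

# Toy "feature map": the top row holds an object that moves right by one
# cell per step, the bottom row holds a static object (assumed motions).
fmap = [[1, 2, 0, 0],
        [0, 0, 3, 4]]

# Identity kernel: only the centre tap contributes.
identity = [[0, 0, 0],
            [0, 1, 0],
            [0, 0, 0]]

# Per-location offsets: top row samples one cell to the left (content
# shifts right), bottom row samples in place (content stays put).
offsets = [[[(0, -1)] * 9 for _ in range(4)],
           [[(0, 0)] * 9 for _ in range(4)]]

warped = deform_conv2d(fmap, identity, offsets)
print(warped)  # [[0.0, 1.0, 2.0, 0.0], [0.0, 0.0, 3.0, 4.0]]
```

A regular or dilated convolution applies the same sampling grid at every location, so it could not express both motions with this kernel. With all offsets set to zero, deform_conv2d reduces to an ordinary zero-padded convolution.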