Joint Forecasting of Panoptic Segmentations with Difference Attention
This work improves forecasting accuracy for autonomous driving by jointly modeling all object instances in a scene, an incremental advance over prior methods that forecast each instance independently.
The paper addresses two weaknesses of prior panoptic segmentation forecasting for autonomous systems, namely the independent treatment of object instances and the heuristic merging of their forecasts, and reports state-of-the-art results on the Cityscapes and AIODrive datasets.
Forecasting a representation is important for safe and effective autonomy. Recent work has studied panoptic segmentations as a compelling representation for this purpose. However, the recent state of the art in panoptic segmentation forecasting suffers from two issues: first, individual object instances are treated independently of each other; second, individual object instance forecasts are merged in a heuristic manner. To address both issues, we study a new panoptic segmentation forecasting model that jointly forecasts all object instances in a scene using a transformer model based on 'difference attention.' The model further refines its predictions by taking depth estimates into account. We evaluate the proposed model on the Cityscapes and AIODrive datasets. We find difference attention to be particularly suitable for forecasting because differences of quantities such as locations enable a model to reason explicitly about velocities and accelerations. Because of this, we attain state-of-the-art results on panoptic segmentation forecasting metrics.
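
The role of differences can be illustrated with a minimal Python sketch, which is not the authors' implementation. First differences of locations give velocities and second differences give accelerations, which suffice for a simple constant-acceleration forecast. The function names (`finite_differences`, `extrapolate`, `difference_attention`), the toy trajectory, and the specific attention form (a softmax-weighted aggregation of value differences v_j - v_i) are assumptions made for illustration only; the formulation used in the model may differ.

```python
import math

def finite_differences(seq):
    """Return consecutive differences seq[t+1] - seq[t]."""
    return [b - a for a, b in zip(seq, seq[1:])]

def extrapolate(positions):
    """Forecast the next position assuming constant acceleration.

    Needs at least three positions so that a second difference exists.
    """
    velocities = finite_differences(positions)      # first differences
    accelerations = finite_differences(velocities)  # second differences
    next_velocity = velocities[-1] + accelerations[-1]
    return positions[-1] + next_velocity

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def difference_attention(queries, keys, values):
    """Hypothetical attention variant (assumed form, for illustration).

    For each element i, attention weights come from scaled dot products
    q_i . k_j, but the aggregated quantity is the difference v_j - v_i
    rather than v_j itself.
    """
    dim = len(queries[0])
    outputs = []
    for q_i, v_i in zip(queries, values):
        scores = [sum(a * b for a, b in zip(q_i, k_j)) / math.sqrt(dim)
                  for k_j in keys]
        weights = softmax(scores)
        outputs.append([
            sum(w * (v_j[d] - v_i[d]) for w, v_j in zip(weights, values))
            for d in range(len(v_i))
        ])
    return outputs

# Toy 1-D trajectory of one object instance over four time steps.
positions = [0.0, 1.0, 3.0, 6.0]
print(finite_differences(positions))  # [1.0, 2.0, 3.0]
print(extrapolate(positions))         # 10.0
```

In this sketch, feeding locations as values makes each output a weighted relative displacement, so quantities resembling velocities are available to the model directly instead of having to be recovered from absolute positions.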