Cross-connected Networks for Multi-task Learning of Detection and Segmentation
This work addresses a practical challenge in multi-task learning for computer vision: transferring knowledge across tasks whose training data come from different datasets. The contribution is incremental, since it extends existing multi-task approaches.
The paper tackles multi-task learning when the tasks are trained on separate datasets, proposing a cross-connected CNN architecture that links single-task CNNs so they can share knowledge. Experiments on pedestrian detection and segmentation show improved detection performance with segmentation quality maintained, and experiments on wild birds show that the network learns general representations from limited data.
Multi-task learning improves generalization performance by sharing knowledge among related tasks. Existing models target task combinations annotated on the same dataset, whereas in some cases a different dataset is available for each task. How to use the knowledge of successful single-task CNNs, each trained on its own dataset, has been explored less than multi-task learning with a single dataset. We propose a cross-connected CNN, a new architecture that connects single-task CNNs through convolutional layers that transfer useful information to the counterpart network. We evaluate the proposed architecture on a combination of detection and segmentation using two datasets. Experiments on pedestrians show that our CNN achieves higher detection performance than baseline CNNs while maintaining high segmentation quality. To our knowledge, this is the first attempt to tackle multi-task learning of detection and segmentation with different training datasets. Experiments with wild birds demonstrate how our CNN learns general representations from limited datasets.
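The cross-connection idea can be illustrated with a minimal sketch. The abstract states only that single-task CNNs are connected through convolutional layers, so the details here are assumptions made for illustration: the connection is a 1x1 convolution, the transferred features are added to the receiving stream, and both streams have the same spatial size at the connection point. The names `conv1x1`, `add_maps`, and `cross_connect` are hypothetical and do not come from the paper.

```python
from typing import List, Tuple

# A feature map is indexed as [channel][row][col].
FeatureMap = List[List[List[float]]]
Weights = List[List[float]]  # weights[out_channel][in_channel]


def conv1x1(x: FeatureMap, weights: Weights) -> FeatureMap:
    """Apply a 1x1 convolution: a per-pixel linear map across channels."""
    in_ch = len(x)
    height, width = len(x[0]), len(x[0][0])
    if any(len(row) != in_ch for row in weights):
        raise ValueError("weights need one column per input channel")
    return [
        [
            [sum(w_out[i] * x[i][r][c] for i in range(in_ch)) for c in range(width)]
            for r in range(height)
        ]
        for w_out in weights
    ]


def add_maps(a: FeatureMap, b: FeatureMap) -> FeatureMap:
    """Element-wise sum of two feature maps with identical shapes."""
    return [
        [[av + bv for av, bv in zip(a_row, b_row)] for a_row, b_row in zip(a_ch, b_ch)]
        for a_ch, b_ch in zip(a, b)
    ]


def cross_connect(
    det: FeatureMap,
    seg: FeatureMap,
    w_seg_to_det: Weights,
    w_det_to_seg: Weights,
) -> Tuple[FeatureMap, FeatureMap]:
    """Exchange features between a detection and a segmentation stream.

    Assumption: each stream keeps its own features and adds the
    counterpart's features after a learned 1x1 convolution.
    """
    det_out = add_maps(det, conv1x1(seg, w_seg_to_det))
    seg_out = add_maps(seg, conv1x1(det, w_det_to_seg))
    return det_out, seg_out
```

With all cross-connection weights set to zero, each stream reduces to its original single-task features, which is one reason an additive connection is a convenient way to start from pretrained single-task networks under these assumptions.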