Fuse Local and Global Semantics in Representation Learning
This work addresses representation learning for computer vision tasks, but it appears incremental: it builds on existing methods and does not identify a major breakthrough.
The paper tackles the problem of generating richer image representations by fusing local and global semantics, showing promising results in linear evaluation and transferability to detection and segmentation tasks on PASCAL VOC and COCO datasets.
We propose Fuse Local and Global Semantics in Representation Learning (FLAGS) to generate richer representations. FLAGS aims to extract both global and local semantics from images to benefit various downstream tasks. It shows promising results under the common linear evaluation protocol. We also conduct detection and segmentation experiments on PASCAL VOC and COCO to show that the representations extracted by FLAGS are transferable.
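For readers unfamiliar with the linear evaluation protocol mentioned above, it is usually run by freezing the pretrained encoder and training only a linear classifier on its output features. The following is a minimal sketch of that general idea in plain Python, not the FLAGS implementation: `frozen_encoder` is a hypothetical stand-in for a pretrained backbone, the two-class data is synthetic, and all hyperparameters are illustrative assumptions.

```python
import math
import random


def frozen_encoder(x):
    """Hypothetical stand-in for a pretrained backbone; it has no trainable state."""
    return [x[0] + x[1], x[0] - x[1]]


def logits_for(W, b, f):
    """Linear classifier scores: one dot product plus bias per class."""
    return [sum(w * v for w, v in zip(row, f)) + bias for row, bias in zip(W, b)]


def train_linear_probe(features, labels, num_classes, epochs=100, lr=0.1):
    """Train only the linear layer (W, b) with SGD on softmax cross-entropy."""
    dim = len(features[0])
    W = [[0.0] * dim for _ in range(num_classes)]
    b = [0.0] * num_classes
    for _ in range(epochs):
        for f, y in zip(features, labels):
            scores = logits_for(W, b, f)
            top = max(scores)  # subtract the max for numerical stability
            exps = [math.exp(s - top) for s in scores]
            total = sum(exps)
            for c in range(num_classes):
                # Gradient of cross-entropy w.r.t. the logit of class c.
                grad = exps[c] / total - (1.0 if c == y else 0.0)
                for j in range(dim):
                    W[c][j] -= lr * grad * f[j]
                b[c] -= lr * grad
    return W, b


def predict(W, b, f):
    scores = logits_for(W, b, f)
    return max(range(len(scores)), key=scores.__getitem__)


def make_toy_split(n, rng, margin=0.3):
    """Synthetic two-class points; the label is the sign of x0 + x1 (an assumption)."""
    points, labels = [], []
    while len(points) < n:
        x = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        s = x[0] + x[1]
        if abs(s) < margin:
            continue  # keep a gap between the classes
        points.append(x)
        labels.append(1 if s > 0 else 0)
    return points, labels


def run_linear_eval(seed=0):
    rng = random.Random(seed)
    train_x, train_y = make_toy_split(60, rng)
    test_x, test_y = make_toy_split(30, rng)
    # Features are computed once; the encoder is never updated.
    train_f = [frozen_encoder(x) for x in train_x]
    test_f = [frozen_encoder(x) for x in test_x]
    W, b = train_linear_probe(train_f, train_y, num_classes=2)
    correct = sum(predict(W, b, f) == y for f, y in zip(test_f, test_y))
    return correct / len(test_y)
```

Calling `run_linear_eval()` returns the held-out accuracy of the linear probe. Because the encoder is fixed, that accuracy reflects how linearly separable the frozen features are, which is why the protocol is commonly used as a proxy for representation quality.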