A Global-Local Embedding Module for Fashion Landmark Detection
This work addresses the challenge of accurately localizing fashion landmarks for visual clothing analysis; its contribution is an incremental improvement on a domain-specific task.
The paper tackles the problem of fashion landmark detection by proposing a network with a global-local embedding module to handle large spatial variances in clothing images, and it reports outperforming state-of-the-art methods on two benchmark datasets.
Detecting fashion landmarks is a fundamental technique for visual clothing analysis. Because clothes vary widely and deform non-rigidly, fashion landmark localization suffers from large spatial variances across poses, scales, and styles. Accurate landmark detection therefore requires contextual knowledge of the clothing. To that end, we propose a fashion landmark detection network with a global-local embedding module. The module is based on a non-local operation for capturing long-range dependencies, followed by a convolution operation for adopting local neighborhood relations. With this processing, the network can consider both global and local contextual knowledge of a clothing image. We demonstrate that the proposed method learns effective deep feature representations for fashion landmark detection. Experimental results on two benchmark datasets show that the proposed network outperforms state-of-the-art methods. Our code is available at https://github.com/shumming/GLE_FLD.
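To make the two-stage idea concrete, the following is a minimal, dependency-free sketch of a non-local operation followed by a convolution. It is an illustration under simplifying assumptions, not the implementation in the repository above: the pairwise similarity is a plain dot product with no learned embeddings, the convolution is a fixed per-channel 3x3 kernel rather than a learned layer, and the function names (non_local, conv3x3, global_local_embedding) are hypothetical.

```python
import math


def non_local(features):
    """Simplified non-local operation with a residual connection.

    features: list of N feature vectors, one per spatial position.
    Assumption: similarity is a raw dot product (no learned embeddings).
    """
    out = []
    for xi in features:
        # Similarity of position i with every position j.
        scores = [sum(a * b for a, b in zip(xi, xj)) for xj in features]
        # Numerically stable softmax over all positions.
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        weights = [w / total for w in weights]
        # Weighted sum over all positions: the long-range (global) context.
        agg = [
            sum(w * xj[c] for w, xj in zip(weights, features))
            for c in range(len(xi))
        ]
        # Residual connection: add the global context to the input.
        out.append([a + b for a, b in zip(xi, agg)])
    return out


def conv3x3(fmap, kernel):
    """Per-channel 3x3 convolution with zero padding on an H x W x C map.

    Assumption: one fixed kernel shared by all channels (not learned).
    """
    h, w, c = len(fmap), len(fmap[0]), len(fmap[0][0])
    out = [[[0.0] * c for _ in range(w)] for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        k = kernel[dy + 1][dx + 1]
                        for ch in range(c):
                            out[y][x][ch] += k * fmap[yy][xx][ch]
    return out


def global_local_embedding(fmap, kernel):
    """Global step (non-local) followed by local step (3x3 convolution)."""
    h, w = len(fmap), len(fmap[0])
    # Flatten the spatial grid so every position can attend to every other.
    flat = [fmap[y][x] for y in range(h) for x in range(w)]
    glob = non_local(flat)
    # Restore the H x W layout before applying the local convolution.
    grid = [glob[y * w:(y + 1) * w] for y in range(h)]
    return conv3x3(grid, kernel)


if __name__ == "__main__":
    feature_map = [[[1.0, 0.0], [0.0, 1.0]],
                   [[1.0, 1.0], [0.5, 0.5]]]
    box = [[1.0 / 9.0] * 3 for _ in range(3)]
    result = global_local_embedding(feature_map, box)
    print(len(result), len(result[0]), len(result[0][0]))
```

The output keeps the spatial size and channel count of the input, so a module of this shape could in principle be placed between existing layers of a detection network. A practical version would replace the fixed dot product and kernel with learned projections and convolution weights.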