Hierarchical Annotation of Images with Two-Alternative-Forced-Choice Metric Learning
This work addresses the need for efficient data structuring in tasks such as retrieval and recommendation. The contribution is incremental, since it builds on existing metric learning techniques.
The paper tackles the labor-intensive problem of hierarchically annotating high-dimensional data such as images. It proposes combining two-alternative-forced-choice testing with deep metric learning to embed the data in a semantic space suitable for hierarchical clustering. On the Fashion-MNIST dataset, the recovered hierarchy has finer granularity than the original labels.
Many tasks, such as retrieval and recommendation, can benefit significantly from structuring the data, commonly in a hierarchical way. Achieving this through annotation of high-dimensional data such as images or natural text can be highly labor intensive. We propose an approach for uncovering the hierarchical structure of data based on efficient discriminative testing rather than annotation of individual datapoints. Using two-alternative-forced-choice (2AFC) testing and deep metric learning, we embed the data in a semantic space in which it can be clustered hierarchically. We actively select triplets for the 2AFC test so that the modeling process is highly efficient with respect to the number of tests presented to the annotator. We empirically demonstrate the feasibility of the method by confirming the shape bias on synthetic data and by extracting hierarchical structure from the Fashion-MNIST dataset at a finer granularity than the original labels.