Visual-Semantic Embedding Model Informed by Structured Knowledge
This work, aimed at computer vision researchers, addresses the challenge of improving image classification accuracy, particularly in zero-shot scenarios. It is incremental in that it builds on existing embedding methods.
The authors improve a visual-semantic embedding model by incorporating structured knowledge from WordNet. On the ILSVRC 2012 dataset, the resulting model outperforms a model that uses word embeddings alone in both standard and zero-shot image classification.
We propose a novel approach to improve a visual-semantic embedding model by incorporating concept representations captured from an external structured knowledge base. We investigate the performance of the resulting model on image classification under both standard and zero-shot settings. We also propose two novel evaluation frameworks to analyse classification errors with respect to the class hierarchy indicated by the knowledge base. The approach is tested using the ILSVRC 2012 image dataset and the WordNet knowledge base. In both standard and zero-shot image classification, our approach shows superior performance compared with the original approach, which uses word embeddings.
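To make the general idea concrete, the following is a minimal sketch of classification in a shared embedding space, followed by a hierarchy-aware look at an error. It is not the method or the evaluation frameworks of this work, whose details are not given here. The function names, the toy vectors, the toy hierarchy, and the choice of cosine similarity and tree distance are all assumptions made for illustration.

```python
import math

# Hypothetical class vectors standing in for concept representations
# derived from a knowledge base. "siamese" plays the role of an unseen
# (zero-shot) class: it has a vector but, by assumption, no training images.
CLASS_VECS = {
    "tabby": [1.0, 0.1, 0.0],
    "siamese": [0.9, 0.3, 0.0],
    "beagle": [0.0, 0.2, 1.0],
}

# Hypothetical child -> parent links standing in for a class hierarchy.
PARENT = {
    "tabby": "cat",
    "siamese": "cat",
    "cat": "animal",
    "beagle": "dog",
    "dog": "animal",
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def predict(image_vec, class_vecs):
    """Return the label whose class vector is most similar to the image vector."""
    return max(class_vecs, key=lambda label: cosine(image_vec, class_vecs[label]))


def ancestors(label, parent):
    """Return the path from a label up to the root, starting with the label."""
    path = [label]
    while label in parent:
        label = parent[label]
        path.append(label)
    return path


def hierarchy_distance(a, b, parent):
    """Number of edges between two labels via their lowest common ancestor."""
    depth_b = {node: i for i, node in enumerate(ancestors(b, parent))}
    for i, node in enumerate(ancestors(a, parent)):
        if node in depth_b:
            return i + depth_b[node]
    raise ValueError("labels share no common ancestor")


if __name__ == "__main__":
    # An image vector assumed to have been mapped into the embedding space.
    image_vec = [0.85, 0.35, 0.0]
    predicted = predict(image_vec, CLASS_VECS)
    print(predicted, hierarchy_distance(predicted, "tabby", PARENT))
```

Under this sketch, a misclassification between two sibling classes has a smaller hierarchy distance than one between classes in different branches, which is one simple way to grade errors by severity rather than counting them all equally.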