Image Reconstruction from Bag-of-Visual-Words
This work addresses an open problem in computer vision: reconstructing images from a widely used feature representation. The contribution appears incremental, however, since it builds on existing BoVW methods.
The paper tackles the problem of reconstructing original images from Bag-of-Visual-Words (BoVW) features, which carry no spatial information. It proposes an evaluation function that scores candidate arrangements by the naturalness of both local adjacency and global position, and it reports reconstructions for images from 101 object categories.
The objective of this work is to reconstruct an original image from its Bag-of-Visual-Words (BoVW) representation. Reconstructing images from features is a way to understand what those features capture, and it also makes it possible to generate novel images via the features. Although BoVW is the de facto standard feature for image recognition and retrieval, successful image reconstruction from BoVW has not yet been reported. The task is difficult because BoVW discards the spatial arrangement of the visual words it contains. To estimate the original arrangement, we propose an evaluation function that incorporates the naturalness of both local adjacency and global position, together with a method for obtaining the related parameters from an external image database. To evaluate the performance of our method, we reconstruct images from 101 object categories. Additionally, we apply our method to analyze object classifiers and to generate novel images via BoVW.
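
To make the difficulty concrete, the following is a minimal sketch, not the formulation used in this work. It shows that a BoVW histogram retains only word counts, and it gives one plausible form for an evaluation function that scores a candidate arrangement of visual words. The function names `bovw_histogram` and `arrangement_score`, the regular grid layout, and the additive log-probability form with the tables `log_adjacency` and `log_position` are all assumptions made for illustration. In practice such tables would be estimated from an external image database.

```python
import numpy as np


def bovw_histogram(descriptors, codebook):
    """Quantize local descriptors to their nearest visual word and count them.

    descriptors: (N, D) array of local feature descriptors.
    codebook:    (K, D) array of visual-word centers.
    Returns a length-K count vector; descriptor positions are discarded.
    """
    # Squared Euclidean distance from every descriptor to every visual word.
    dists = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = dists.argmin(axis=1)
    return np.bincount(words, minlength=len(codebook))


def arrangement_score(grid, log_adjacency, log_position):
    """Score a candidate layout of visual words (higher means more natural).

    Hypothetical form: sum of a local adjacency term and a global position term.
    grid:          (H, W) integer array of visual-word indices.
    log_adjacency: (K, K) log-probability that two words are neighbors.
    log_position:  (K, H, W) log-probability of each word at each cell.
    """
    # Local term: horizontally and vertically adjacent word pairs.
    local = log_adjacency[grid[:, :-1], grid[:, 1:]].sum()
    local += log_adjacency[grid[:-1, :], grid[1:, :]].sum()
    # Global term: plausibility of each word at its absolute position.
    rows, cols = np.indices(grid.shape)
    global_term = log_position[grid, rows, cols].sum()
    return local + global_term


# Two layouts with identical word counts but different spatial arrangements.
layout_a = np.array([[0, 1], [1, 0]])
layout_b = np.array([[0, 0], [1, 1]])

# Toy parameters (assumed): identical words tend to be neighbors,
# and no position is preferred over another.
toy_log_adjacency = np.log(np.array([[0.8, 0.2], [0.2, 0.8]]))
toy_log_position = np.zeros((2, 2, 2))

score_a = arrangement_score(layout_a, toy_log_adjacency, toy_log_position)
score_b = arrangement_score(layout_b, toy_log_adjacency, toy_log_position)
```

Both layouts yield the same histogram, so BoVW alone cannot tell them apart, yet the score ranks `layout_b` above `layout_a` under these toy parameters. A reconstruction method could search over arrangements for the one that maximizes such a score; the search procedure itself is outside the scope of this sketch.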