A Testbed for Cross-Dataset Analysis
This work addresses dataset bias for visual recognition researchers, but it is incremental as it organizes existing data without introducing new methods.
The authors tackled the problem of dataset bias limiting generalization in visual recognition by creating a unified corpus of twelve existing databases and providing a feature repository for the community.
Since its beginning, visual recognition research has tried to capture the huge variability of the visual world in several image collections. The number of available datasets keeps growing, together with the number of samples per object category. However, this trend does not correspond directly to an increase in the generalization capabilities of the developed recognition systems. Each collection tends to have its own specific characteristics and to cover just some aspects of the visual world: these biases often limit the effectiveness of methods defined and tested separately on each image set. Our work takes a first step towards the analysis of the dataset bias problem on a large scale. We organize twelve existing databases into a single corpus and we present the visual community with a useful feature repository for future research.