Hotels-50K: A Global Hotel Recognition Dataset
This dataset paper addresses hotel recognition for law enforcement and other investigators working on human trafficking cases. The contribution is incremental, since it centers on data curation and baseline methods.
The authors tackle the problem of recognizing hotels from images of hotel rooms to aid human trafficking investigations. They introduce a dataset of over 1 million annotated images from 50,000 hotels and present a baseline approach that uses domain-specific data augmentation.
Recognizing a hotel from an image of a hotel room is important for human trafficking investigations. Images directly link victims to places and can help verify where victims have been trafficked, and where their traffickers might move them or others in the future. Recognizing the hotel from images is challenging because of low image quality, uncommon camera perspectives, large occlusions (often the victim), and the similarity of objects (e.g., furniture, art, bedding) across different hotel rooms. To support efforts towards this hotel recognition task, we have curated a dataset of over 1 million annotated hotel room images from 50,000 hotels. These images include professionally captured photographs from travel websites and crowd-sourced images from a mobile application, which are more similar to the types of images analyzed in real-world investigations. We present a baseline approach based on a standard network architecture and a collection of data-augmentation approaches tuned to this problem domain.
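The abstract does not spell out which augmentations the baseline uses. As an illustration only, the sketch below shows one augmentation that the stated challenge of large occlusions suggests: blanking out a random rectangle of a training image so a network cannot rely on any single region. The function name `occlude_random_region`, its parameters, and the rectangular shape of the occluder are all assumptions made for this example, not details taken from the paper.

```python
import numpy as np


def occlude_random_region(image, rng, min_frac=0.1, max_frac=0.4, fill=0):
    """Return a copy of `image` (H x W x C array) with one random rectangle filled.

    Hypothetical helper: the fractions and fill value are illustrative defaults.
    """
    h, w = image.shape[:2]
    # Sample the rectangle's height and width as fractions of the image size.
    occ_h = max(1, int(h * rng.uniform(min_frac, max_frac)))
    occ_w = max(1, int(w * rng.uniform(min_frac, max_frac)))
    # Sample the top-left corner so the rectangle stays inside the image.
    top = int(rng.integers(0, h - occ_h + 1))
    left = int(rng.integers(0, w - occ_w + 1))
    out = image.copy()  # leave the input image untouched
    out[top:top + occ_h, left:left + occ_w] = fill
    return out


rng = np.random.default_rng(0)
room = np.full((224, 224, 3), 255, dtype=np.uint8)  # stand-in for a room photo
augmented = occlude_random_region(room, rng)
```

In a training loop, a function like this would typically be applied to each image with some probability before it is passed to the network, alongside standard augmentations such as cropping and flipping.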