Crowdsourcing of Real-world Image Annotation via Visual Properties
For researchers in data-centric AI and computer vision, this work addresses the semantic gap problem in object recognition datasets by reducing annotator bias, but it is an incremental improvement over existing crowdsourcing methods.
The paper proposes an interactive crowdsourcing framework that uses visual property constraints and a category hierarchy to reduce annotator subjectivity in image annotation. The reported experiments support the method's effectiveness, and annotator feedback is discussed as a basis for optimizing the crowdsourcing setup.
Recent advances in data-centric artificial intelligence highlight inherent limitations in object recognition datasets. One of the primary issues stems from the semantic gap problem, which results in complex many-to-many mappings between visual data and linguistic descriptions. These mappings introduce bias into annotations, which adversely affects performance in computer vision tasks. This paper proposes an image annotation methodology that integrates knowledge representation, natural language processing, and computer vision techniques, aiming to reduce annotator subjectivity by applying visual property constraints. We introduce an interactive crowdsourcing framework that dynamically asks questions based on a predefined object category hierarchy and annotator feedback, guiding image annotation by visual properties. Experiments demonstrate the effectiveness of this methodology, and we discuss annotator feedback as a basis for optimizing the crowdsourcing setup.
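To make the idea of hierarchy-guided questioning concrete, the following is a minimal sketch and not the paper's implementation. It assumes that each category in the hierarchy carries one yes/no question about a visual property that separates it from its siblings, and that an annotator may answer "unsure", in which case the annotation stops at the most specific category confirmed so far. The names `Category`, `annotate`, and `HIERARCHY`, as well as the musical-instrument categories and question wording, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Category:
    """A node in a hypothetical object category hierarchy."""
    name: str
    # Yes/no question about the visual property that separates this
    # category from its siblings (illustrative wording, not from the paper).
    property_question: str = ""
    children: List["Category"] = field(default_factory=list)


def annotate(root: Category, answer: Callable[[str], str]) -> List[str]:
    """Walk down the hierarchy, asking one visual-property question per child.

    `answer(question)` stands in for the annotator and returns
    "yes", "no", or "unsure". The function returns the path of
    confirmed categories, from the root to the most specific one.
    """
    node = root
    path = [root.name]
    while node.children:
        chosen = None
        for child in node.children:
            reply = answer(child.property_question)
            if reply == "yes":
                chosen = child
                break
            if reply == "unsure":
                # Annotator feedback: stop at the last confirmed category.
                return path
        if chosen is None:
            # No child's visual property was confirmed.
            return path
        node = chosen
        path.append(node.name)
    return path


# Hypothetical toy hierarchy used only for illustration.
HIERARCHY = Category(
    "instrument",
    children=[
        Category(
            "string instrument",
            "Does the object have strings?",
            children=[
                Category("guitar", "Does the neck have frets?"),
                Category("violin", "Does the body have f-shaped sound holes?"),
            ],
        ),
        Category("wind instrument", "Does the object have a mouthpiece or blow hole?"),
    ],
)


def scripted_annotator(replies: dict) -> Callable[[str], str]:
    """Build an `answer` function from fixed replies; unknown questions get "no"."""
    return lambda question: replies.get(question, "no")
```

In this sketch the annotator never picks a label from a free list. Each answer is tied to an observable property of the image, which is one simple way to model how property constraints could narrow the label step by step. An "unsure" reply yields a coarser but still valid label instead of forcing a guess.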