Examining CNN Representations with respect to Dataset Bias
This work addresses dataset bias in CNNs and is aimed at researchers; its contribution is incremental, since it builds on existing bias-analysis methods.
The paper diagnoses flaws in the feature representations of pre-trained CNNs that are caused by dataset bias. It mines latent relationships between attributes inside the network and compares them with ground-truth relationships; experiments demonstrate the method's effectiveness.
This paper proposes a simple yet effective method to diagnose the feature representations of a pre-trained CNN without using any testing samples. We aim to discover representation flaws caused by potential dataset bias. More specifically, when the CNN is trained to estimate image attributes, we mine latent relationships between the representations of different attributes inside the CNN. We then compare the mined attribute relationships with ground-truth attribute relationships to discover the CNN's blind spots and failure modes due to dataset bias. Such representation flaws cannot be examined by conventional evaluation strategies based on testing images, because the testing images may share a similar bias. Experiments have demonstrated the effectiveness of our method.
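The comparison step described above can be illustrated with a small sketch. It is not the paper's algorithm: the abstract does not specify how relationships are mined, so the code assumes each attribute is summarized by a hypothetical vector of how strongly it relies on each internal feature, and uses cosine similarity as a stand-in for a mined relationship. The attribute names, vectors, ground-truth values, and threshold are all invented for illustration.

```python
import numpy as np

def mined_relationships(attr_vectors):
    """Cosine similarity between per-attribute representation vectors.

    Assumption: row i describes how attribute i uses the CNN's internal
    features. This is a stand-in for the paper's mining step.
    """
    v = np.asarray(attr_vectors, dtype=float)
    norms = np.linalg.norm(v, axis=1, keepdims=True)
    unit = v / np.clip(norms, 1e-12, None)  # guard against zero vectors
    return unit @ unit.T

def find_suspect_pairs(mined, ground_truth, names, threshold=0.5):
    """Return attribute pairs whose mined relationship deviates from the
    ground-truth relationship by more than `threshold`, largest gap first."""
    suspects = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            gap = abs(mined[i, j] - ground_truth[i, j])
            if gap > threshold:
                suspects.append((names[i], names[j], float(gap)))
    return sorted(suspects, key=lambda t: -t[2])

# Hypothetical attributes and feature-usage vectors (illustrative only).
names = ["smiling", "wearing_lipstick", "male"]
attr_vectors = [
    [1.0, 0.9, 0.0],   # smiling: nearly the same features as lipstick
    [1.0, 1.0, 0.0],   # wearing_lipstick
    [0.0, -1.0, 0.0],  # male
]

# Hypothetical ground truth in [-1, 1]: 0 = unrelated, -1 = mutually exclusive.
ground_truth = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, -1.0],
    [0.0, -1.0, 1.0],
])

mined = mined_relationships(attr_vectors)
suspects = find_suspect_pairs(mined, ground_truth, names)
for a, b, gap in suspects:
    print(f"{a} vs {b}: gap {gap:.2f}")
```

In this toy setup, a pair is flagged when the network's internal representation ties two attributes together more strongly, or more weakly, than the ground truth says it should. Note that no test images are involved, which mirrors the motivation in the abstract: a biased test set could hide exactly this kind of entanglement.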