CV · LG · Mar 16, 2025

Learning Privacy from Visual Entities

arXiv:2503.12464v1 · 2 citations · h-index: 8 · Proceedings on Privacy Enhancing Technologies
Originality: Synthesis-oriented
AI Analysis

This work addresses image privacy prediction, but its contribution is incremental: it simplifies existing methods rather than introducing a new approach.

The paper shows that a simpler combination of transfer learning and a CNN, which optimises only 732 parameters, performs comparably to more complex graph-based methods that have 14,000 to 500 million parameters.

Subjective interpretation and content diversity make predicting whether an image is private or public a challenging task. Graph neural networks combined with convolutional neural networks (CNNs), which consist of 14,000 to 500 million parameters, generate features for visual entities (e.g., scene and object types) and identify the entities that contribute to the decision. In this paper, we show that using a simpler combination of transfer learning and a CNN to relate privacy with scene types optimises only 732 parameters while achieving performance comparable to that of graph-based methods. By contrast, end-to-end training of graph-based methods can mask the contribution of individual components to the classification performance. Furthermore, we show that a high-dimensional feature vector, extracted with CNNs for each visual entity, is unnecessary and complicates the model. The graph component also has a negligible impact on performance, which is instead driven by fine-tuning the CNN to optimise image features for privacy nodes.
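The count of 732 optimised parameters is consistent with a single fully connected layer that maps scene-type scores to the two privacy classes (private, public). The sketch below assumes 365 scene categories, the size of the Places365 label set, which the text above does not state. It also assumes a frozen, pre-trained scene classifier whose outputs arrive as `scene_scores`. The names `SceneToPrivacy` and `linear_param_count` are hypothetical, and this is an illustration of the parameter arithmetic rather than the authors' implementation.

```python
import math
import random

NUM_SCENES = 365   # ASSUMPTION: Places365-sized scene label set, not stated in the abstract
NUM_CLASSES = 2    # private vs. public


def linear_param_count(n_in, n_out):
    """Number of trainable values in a dense layer: weights plus biases."""
    return n_in * n_out + n_out


class SceneToPrivacy:
    """Hypothetical linear head on top of a frozen scene classifier."""

    def __init__(self, n_in=NUM_SCENES, n_out=NUM_CLASSES, seed=0):
        rng = random.Random(seed)
        # One weight row per privacy class, one weight per scene type.
        self.weights = [
            [rng.uniform(-0.01, 0.01) for _ in range(n_in)] for _ in range(n_out)
        ]
        self.biases = [0.0] * n_out

    def num_parameters(self):
        return sum(len(row) for row in self.weights) + len(self.biases)

    def predict_proba(self, scene_scores):
        """Softmax over the class logits computed from the scene scores."""
        logits = [
            sum(w * x for w, x in zip(row, scene_scores)) + b
            for row, b in zip(self.weights, self.biases)
        ]
        peak = max(logits)  # subtract the maximum for numerical stability
        exps = [math.exp(z - peak) for z in logits]
        total = sum(exps)
        return [e / total for e in exps]


model = SceneToPrivacy()
print(linear_param_count(NUM_SCENES, NUM_CLASSES))  # 365 * 2 + 2
print(model.num_parameters())
```

Under these assumptions only the linear head is trained, so the number of optimised parameters stays fixed regardless of the size of the frozen scene classifier that produces `scene_scores`.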

