CR, Oct 28, 2016

Semantic Identification Attacks on Web Browsing

arXiv:1610.09417v1
Originality: Incremental advance
AI Analysis

This paper addresses a privacy vulnerability for web users: it exposes an attack vector that works even when users disguise their browser or network. The advance is incremental, since it builds on prior identification methods.

The paper tackles the problem of user identification across browsing sessions using semantic signals from visited pages, showing that even coarse semantic information is sufficient to identify users, as demonstrated on the MSNBC Anonymous Browsing dataset.

We introduce a Semantic Identification Attack, in which an adversary uses semantic signals about the pages visited in one browsing session to identify other browsing sessions launched by the same user. This attack allows an adversary to determine if two browsing sessions originate from the same user regardless of any measures taken by the user to disguise their browser or network. We use the MSNBC Anonymous Browsing data set, which contains a large set of user visits (labeled by category), to implement such an attack and show that even very coarse semantic information is enough to identify users. We discuss potential countermeasures users can take to defend against this attack.
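The abstract does not say how sessions are matched, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each session is a list of page-category labels, summarizes a session as a normalized category-frequency profile, and links an unlabeled session to the known user whose profile is closest by cosine similarity. The function names, the category labels, and the choice of cosine similarity are all illustrative assumptions.

```python
from collections import Counter
from math import sqrt


def category_profile(session):
    """Turn a session (a list of category labels) into relative frequencies."""
    counts = Counter(session)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


def cosine_similarity(p, q):
    """Cosine similarity between two sparse profiles stored as dicts."""
    dot = sum(value * q.get(category, 0.0) for category, value in p.items())
    norm_p = sqrt(sum(value * value for value in p.values()))
    norm_q = sqrt(sum(value * value for value in q.values()))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0  # an empty session carries no semantic signal
    return dot / (norm_p * norm_q)


def most_similar_user(target_session, known_sessions):
    """Return the known user whose session profile best matches the target.

    known_sessions maps a (hypothetical) user id to one earlier session.
    """
    target = category_profile(target_session)
    return max(
        known_sessions,
        key=lambda user: cosine_similarity(
            target, category_profile(known_sessions[user])
        ),
    )


# Hypothetical category labels, chosen only for illustration.
known = {
    "user_a": ["sports", "sports", "news", "weather"],
    "user_b": ["tech", "business", "tech", "news"],
}
unlabeled = ["sports", "weather", "sports"]
guess = most_similar_user(unlabeled, known)
```

In this toy example the unlabeled session shares categories only with `user_a`, so it is linked to that user. Nothing about the browser or network enters the comparison, which is why disguising them does not help against an attack of this kind.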

