Tagvisor: A Privacy Advisor for Sharing Hashtags
This work addresses privacy concerns for social media users by providing a practical tool that mitigates location-inference risks from hashtags. The contribution is incremental, however, because it builds on existing obfuscation techniques.
The paper examines the privacy risks posed by hashtags. It shows that a random forest model can infer a user's precise location from hashtags with 70% to 76% accuracy, depending on the city, and it introduces Tagvisor, a system that suggests alternative hashtags to protect location privacy while preserving utility. Obfuscating as few as two hashtags already achieves a near-optimal trade-off between privacy and utility.
The hashtag has emerged as a widely used concept of popular culture and campaigns, but its implications for people's privacy have not been investigated so far. In this paper, we present the first systematic analysis of privacy issues induced by hashtags. We concentrate in particular on location, which is recognized as one of the key privacy concerns in the Internet era. By relying on a random forest model, we show that we can infer a user's precise location from hashtags with an accuracy of 70% to 76%, depending on the city. To remedy this situation, we introduce a system called Tagvisor that systematically suggests alternative hashtags if the user-selected ones constitute a threat to location privacy. Tagvisor realizes this by means of three conceptually different obfuscation techniques and a semantics-based metric for measuring the consequent utility loss. Our findings show that obfuscating as few as two hashtags already provides a near-optimal trade-off between privacy and utility in our dataset. This in particular renders Tagvisor highly time-efficient, and thus practical in real-world settings.
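The advisor loop described in the abstract (estimate the location risk of a hashtag set, then look for a small change that lowers it at limited utility cost) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy posts, the vote-counting attacker that stands in for the random forest, hiding hashtags as the only obfuscation technique, the fraction of hidden hashtags as a stand-in for the semantics-based utility metric, and the hypothetical names `train`, `risk`, `utility_loss`, and `advise`. It is not the paper's implementation.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Made-up training posts: (hashtags, location). Illustrative only, not the paper's data.
POSTS = [
    ({"brooklynbridge", "nyc", "sunset"}, "brooklyn_bridge"),
    ({"brooklynbridge", "walk"}, "brooklyn_bridge"),
    ({"moma", "art", "nyc"}, "moma"),
    ({"moma", "art", "museum"}, "moma"),
    ({"centralpark", "picnic", "nyc"}, "central_park"),
    ({"centralpark", "run", "sunset"}, "central_park"),
]


def train(posts):
    """Count how often each hashtag co-occurs with each location."""
    counts = defaultdict(Counter)
    for tags, loc in posts:
        for tag in tags:
            counts[tag][loc] += 1
    return counts


def risk(counts, tags, true_loc):
    """Toy attacker (stand-in for a random forest): share of votes for the true location."""
    votes = Counter()
    for tag in sorted(tags):  # sorted so float sums are deterministic
        tag_counts = counts.get(tag)
        if not tag_counts:
            continue  # unseen hashtag carries no location signal here
        total = sum(tag_counts.values())
        for loc, n in tag_counts.items():
            votes[loc] += n / total
    total_votes = sum(votes.values())
    return votes[true_loc] / total_votes if total_votes else 0.0


def utility_loss(original, shared):
    """Stand-in utility metric: fraction of the original hashtags that were hidden."""
    return 1 - len(shared) / len(original) if original else 0.0


def advise(counts, tags, true_loc, threshold=0.5, max_hidden=2):
    """Hide as few hashtags as possible (at most max_hidden) to push risk below threshold."""
    tags = set(tags)
    for k in range(max_hidden + 1):  # fewer hidden hashtags means lower utility loss
        best = None
        for hidden in combinations(sorted(tags), k):
            shared = tags - set(hidden)
            r = risk(counts, shared, true_loc)
            if r < threshold and (best is None or r < best[0]):
                best = (r, shared, hidden)
        if best is not None:
            _, shared, hidden = best
            return shared, hidden, utility_loss(tags, shared)
    return None  # no acceptable suggestion within the budget


counts = train(POSTS)
shared, hidden, loss = advise(counts, {"moma", "art", "nyc"}, "moma")
print(sorted(shared), hidden, round(loss, 2))  # ['nyc'] ('art', 'moma') 0.67
```

In this toy setting, hiding a single hashtag leaves the attacker's vote share for the true location above the threshold, while hiding two brings it to one third. The search stops at the smallest number of hidden hashtags that works, which is the sense in which a small obfuscation budget keeps the advisor cheap to run.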