How Tough Is Location Anonymization? Re-identifying 100K Real-User Trajectories in Japan
For practitioners and policymakers dealing with mobility data, the paper demonstrates that existing anonymization methods are insufficient for large-scale trajectory releases.
The paper stress-tests the anonymization of 100,000 real-user trajectories in Japan, showing that current sanitization techniques leave structural leakage that enables re-identification, while strong privacy parameters destroy utility.
Mobility traces are among the most revealing forms of personal data, yet trajectory releases are often protected only by ad hoc transformations. We stress-test such practices on the recently released YJMob100K, an anonymized dataset of 100,000 user trajectories in Japan. First, we show that the applied protection leaves enough spatial and temporal structure to recover both the real-world geographic frame and the actual calendar timeline by exploiting density signatures, urban correlations, and temporal activity profiles. On top of this reconstruction, we quantify privacy risks through trajectory-level metrics that capture spatio-temporal k-anonymity, -point unicity, home-work and multi-anchor uniqueness, and exposure to secluded and sensitive locations. These metrics reveal extensive re-identification surfaces: a small number of observations, anchors, or sensitive venues often suffices to uniquely pinpoint users or their social neighborhoods. Finally, we evaluate representative sanitization strategies (geo-indistinguishability, local differential privacy, and aggressive spatial de-structuring) and observe a consistent pattern: strong privacy parameters destroy downstream utility, while utility-preserving settings leave structural leakage largely intact. Overall, our findings show that current sanitization techniques are insufficient for large-scale mobility data, and they highlight the urgent need for trajectory-aware privacy mechanisms and stronger publication standards.
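To make the unicity idea concrete, the following is a minimal sketch of how such a metric is commonly estimated, not the paper's own implementation. It assumes each trajectory is a list of discretized (x, y, t) points; the function name `unicity`, the parameter `p` for the number of known points, and the sampling scheme are illustrative assumptions rather than details taken from the paper.

```python
import random
from collections import defaultdict


def unicity(trajectories, p, trials_per_user=20, seed=0):
    """Estimate how often p randomly chosen points of a user match only that user.

    trajectories: dict mapping a user id to a list of (x, y, t) tuples
                  (assumed discretized grid cell and time slot).
    p:            number of points the adversary is assumed to know (p >= 1).
    Returns the fraction of sampled point sets that single out exactly one user.
    """
    rng = random.Random(seed)

    # Inverted index: point -> set of users whose trajectory contains it.
    index = defaultdict(set)
    for user, points in trajectories.items():
        for pt in points:
            index[pt].add(user)

    unique = 0
    total = 0
    for user, points in trajectories.items():
        distinct = sorted(set(points))
        if len(distinct) < p:
            continue  # not enough distinct points to draw a sample of size p
        for _ in range(trials_per_user):
            sample = rng.sample(distinct, p)
            # Users consistent with every sampled point.
            candidates = set.intersection(*(index[pt] for pt in sample))
            if candidates == {user}:
                unique += 1
            total += 1

    return unique / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical toy data: users "a" and "b" share two points, "c" is disjoint.
    toy = {
        "a": [(1, 1, 0), (2, 2, 1), (3, 3, 2)],
        "b": [(1, 1, 0), (2, 2, 1), (4, 4, 2)],
        "c": [(9, 9, 0), (8, 8, 1), (7, 7, 2)],
    }
    print(unicity(toy, p=3))
```

A value close to 1 for small `p` means that a handful of known observations is enough to single out most users, which is the kind of re-identification surface the abstract describes. Exact matching on grid cell and time slot is the simplest choice here; a real evaluation would also need to decide how much spatial and temporal tolerance to allow.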