12.2CRMay 10
How Tough Is Location Anonymization? Re-identifying 100K Real-User Trajectories in JapanAbhishek Kumar Mishra, Mathieu Cunche, Heber H. Arcolezi
Mobility traces are among the most revealing forms of personal data, yet trajectory releases are often protected only by ad hoc transformations. We stress-test such practices on recently-released YJMob100K, an anonymized dataset of 100,000 user trajectories in Japan. First, we show that the applied protection leaves enough spatial and temporal structure to recover both the real-world geographic frame and the actual calendar timeline by exploiting density signatures, urban correlations, and temporal activity profiles. On top of this reconstruction, we quantify privacy risks through trajectory-level metrics that capture spatio-temporal k-anonymity, -point unicity, home-work and multi-anchor uniqueness, and exposure to secluded and sensitive locations. These metrics reveal extensive re-identification surfaces: a small number of observations, anchors, or sensitive venues often suffices to uniquely pinpoint users or their social neighborhoods. Finally, we evaluate representative sanitization strategies: geo-indistinguishability, local differential privacy, and aggressive spatial de-structuring; and observe a consistent pattern: strong privacy parameters destroy downstream utility, while utility-preserving settings leave structural leakage largely intact. Overall, our findings show that current sanitization techniques are insufficient for large-scale mobility data, and they highlight the urgent need for trajectory-aware privacy mechanisms and stronger publication standards.
7.9CYApr 17
Governance by Design: Architecting Agentic AI for Organizational Learning and Scalable AutonomyNelly Dux, Cristina Alaimo, Philippe Roussiere et al.
Agentic AI systems - systems that can pursue goals through multi-step planning and tool-mediated action with limited direct supervision - are moving from experimental prototypes to enterprise deployments. This transition introduces tensions in implementation, scaling, and governance: organizations seek scalable autonomy for knowledge and coordination work, yet must preserve accountability, safety, cost control, and responsibility as systems initiate actions, access enterprise data, and evolve through iterative updates. Building on an in-depth qualitative case of a large IT services company's 2025 development and staged rollout of an agentic system integrated with enterprise tools; we show that governance is implemented through concrete architectural and working arrangements that determine what the system is allowed to do, which tools and data it can use, how memory is handled, and how performance improvements are introduced over time. We then distill seven lessons that explain how to build effective governance into agentic AI during operationalization and scaling.
CRJan 27, 2021
SimBle: Generating privacy preserving real-world BLE traces with ground truthAbhishek Kumar Mishra, Aline Carneiro Viana, Nadjib Achir
Bluetooth has become critical as many IoT devices are arriving in the market. Most of the current literature focusing on Bluetooth simulation concentrates on the network protocols' performances and completely neglects the privacy protection recommendations introduced in the BLE standard. Indeed, privacy protection is one of the main issues handled in the Bluetooth standard. For instance, the current standard forces devices to change the identifier they embed within the public and private packets, known as MAC address randomization. Although randomizing MAC addresses is intended to preserve device privacy, recent literature shows many challenges that are still present. One of them is the correlation between the public packets and the emitters. Unfortunately, existing evaluation tools such as NS-3 are not designed to reproduce this Bluetooth standard's essential functionality. This makes it impossible to test solutions for different device-fingerprinting strategies as there is a lack of ground truth for large-scale scenarios with the majority of current BLE devices implementing MAC address randomization. In this paper, we first introduce a solution of standard-compliant MAC address randomization in the NS-3 framework, capable of emulating any real BLE device in the simulation and generating real-world Bluetooth traces. In addition, since the simulation run-time for trace-collection grows exponentially with the number of devices, we introduce an optimization to linearize public-packet sniffing. This made the large-scale trace-collection practically feasible. Then, we use the generated traces and associated ground truth to do a case study on the evaluation of a generic MAC address association available in the literature. Our case study reveals that close to 90 percent of randomized addresses could be correctly linked even in highly dense and mobile scenarios.