NI, May 4
Early-Stage IoT Device Identification Using Passive Network Traffic Analysis
Alex Ciechonski, Fabio Palmese, Alessandro E. C. Redondi et al.
The rapid proliferation of Internet of Things (IoT) devices introduces significant security challenges due to limited visibility and weak device-level guarantees. Accurate and timely identification of devices is essential for enforcing network policies and detecting unauthorised hardware, yet existing approaches often rely on long-term traffic observation, payload inspection, or infrastructure-dependent features. In this paper, we investigate whether IoT devices can be reliably identified during the early stages of network attachment using only passive traffic analysis. We propose a lightweight approach based on flow-level features extracted from metadata, avoiding payload inspection and active probing. Through systematic evaluation across multiple observation windows, we show that device-specific signatures emerge within the first few seconds of communication, enabling high-accuracy identification (up to 99%) across 37 IoT devices. Notably, extending the observation window does not consistently improve performance and may slightly degrade accuracy, indicating that the most discriminative behaviour occurs during initial device startup. These findings demonstrate the feasibility of fast, privacy-preserving IoT device identification at the network edge, supporting real-time enforcement, device inventory, and anomaly detection in practical deployments.
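As a rough illustration of what "flow-level features extracted from metadata" over an early observation window could look like, the sketch below summarises packet headers seen in the first few seconds after attachment. The `Packet` record, the `early_window_features` helper, and the particular features chosen are assumptions made for this example; the abstract does not specify the authors' feature set.

```python
from collections import namedtuple

# Hypothetical packet-metadata record: header fields only, no payload.
Packet = namedtuple("Packet", "ts src dst sport dport proto length")

def early_window_features(packets, window_s):
    """Summarise packets seen within window_s seconds of the first packet."""
    if not packets:
        return {"packets": 0, "bytes": 0, "flows": 0,
                "mean_size": 0.0, "mean_iat": 0.0}
    packets = sorted(packets, key=lambda p: p.ts)
    start = packets[0].ts
    # Keep only the early-stage window.
    window = [p for p in packets if p.ts - start <= window_s]
    # A flow is identified by its 5-tuple.
    flows = {(p.src, p.dst, p.sport, p.dport, p.proto) for p in window}
    total = sum(p.length for p in window)
    # Inter-arrival times between consecutive packets in the window.
    gaps = [b.ts - a.ts for a, b in zip(window, window[1:])]
    return {
        "packets": len(window),
        "bytes": total,
        "flows": len(flows),
        "mean_size": total / len(window),
        "mean_iat": sum(gaps) / len(gaps) if gaps else 0.0,
    }
```

Calling the helper with several values of `window_s` on the same capture would give one feature vector per observation window, which is one way to compare short and long windows as the abstract describes.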
HC, Mar 20, 2025
Big Help or Big Brother? Auditing Tracking, Profiling, and Personalization in Generative AI Assistants
Yash Vekaria, Aurelio Loris Canino, Jonathan Levitsky et al.
Generative AI (GenAI) browser assistants integrate powerful capabilities of GenAI in web browsers to provide rich experiences such as question answering, content summarization, and agentic navigation. These assistants, available today as browser extensions, can not only track detailed browsing activity such as search and click data, but can also autonomously perform tasks such as filling forms, raising significant privacy concerns. It is crucial to understand the design and operation of GenAI browser extensions, including how they collect, store, process, and share user data. To this end, we study their ability to profile users and personalize their responses based on explicit or inferred demographic attributes and interests of users. We perform network traffic analysis and use a novel prompting framework to audit tracking, profiling, and personalization by the ten most popular GenAI browser assistant extensions. We find that instead of relying on local in-browser models, these assistants largely depend on server-side APIs, which can be auto-invoked without explicit user interaction. When invoked, they collect and share webpage content, often the full HTML DOM and sometimes even the user's form inputs, with their first-party servers. Some assistants also share identifiers and user prompts with third-party trackers such as Google Analytics. The collection and sharing continues even if a webpage contains sensitive information such as health or personal information such as name or SSN entered in a web form. We find that several GenAI browser assistants infer demographic attributes such as age, gender, income, and interests and use this profile--which carries across browsing contexts--to personalize responses. In summary, our work shows that GenAI browser assistants can and do collect personal and sensitive information for profiling and personalization with little to no safeguards.
CR, Apr 22, 2025
Intelligent Detection of Non-Essential IoT Traffic on the Home Gateway
Fabio Palmese, Anna Maria Mandalari, Hamed Haddadi et al.
The rapid expansion of Internet of Things (IoT) devices, particularly in smart home environments, has introduced considerable security and privacy concerns due to their persistent connectivity and interaction with cloud services. Despite advancements in IoT security, effective privacy measures remain lacking, with existing solutions often relying on cloud-based threat detection that exposes sensitive data or on outdated allow-lists that inadequately restrict non-essential network traffic. This work presents ML-IoTrim, a system for detecting and mitigating non-essential IoT traffic (i.e., traffic that does not influence device operation) by analyzing network behavior at the edge, leveraging machine learning to classify network destinations. Our approach includes building a labeled dataset based on IoT device behavior and employing a feature-extraction pipeline to enable a binary classification of essential vs. non-essential network destinations. We test our framework in a consumer smart home setup with IoT devices from five categories, demonstrating that the model can accurately identify and block non-essential traffic, including previously unseen destinations, without relying on traditional allow-lists. We implement our solution on a home access point, showing the framework has strong potential for scalable deployment, supporting near-real-time traffic classification in large-scale IoT environments with hundreds of devices. This research advances privacy-aware traffic control in smart homes, paving the way for future developments in IoT device privacy.
CR, Jan 25, 2024
SunBlock: Cloudless Protection for IoT Systems
Vadim Safronov, Anna Maria Mandalari, Daniel J. Dubois et al.
With an increasing number of Internet of Things (IoT) devices present in homes, there is a rise in the number of potential information leakage channels and their associated security threats and privacy risks. Despite a long history of attacks on IoT devices in unprotected home networks, the problem of accurate, rapid detection and prevention of such attacks remains open. Many existing IoT protection solutions are cloud-based, sometimes ineffective, and might share consumer data with unknown third parties. This paper investigates the potential for effective IoT threat detection locally, on a home router, using AI tools combined with classic rule-based traffic-filtering algorithms. Our results show that, with a slight increase in router hardware resource usage caused by the machine learning and traffic filtering logic, a typical home router instrumented with our solution is able to effectively detect risks and protect a typical home IoT network, equaling or outperforming existing popular solutions, without any effect on benign IoT functionality, and without relying on cloud services and third parties.
LG, Oct 26, 2021
Rapid IoT Device Identification at the Edge
Oliver Thompson, Anna Maria Mandalari, Hamed Haddadi
Consumer Internet of Things (IoT) devices are increasingly common in everyday homes, from smart speakers to security cameras. Along with their benefits come potential privacy and security threats. To limit these threats, we must implement solutions to filter IoT traffic at the edge. To this end, identifying the IoT device is the natural first step. In this paper we demonstrate a novel method of rapid IoT device identification that uses neural networks trained on device DNS traffic that can be captured from a DNS server on the local network. The method identifies devices by fitting a model to the first seconds of DNS second-level-domain traffic following their first connection. Since security and privacy threat detection often operates at a device-specific level, rapid identification allows these strategies to be implemented immediately. Through a total of 51,000 rigorous automated experiments, we classify 30 consumer IoT devices from 27 different manufacturers with 82% and 93% accuracy for product type and device manufacturer respectively.
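To make the idea of "DNS second-level-domain traffic" in the first seconds after connection more concrete, here is a minimal sketch that reduces query names to a second-level domain and counts them within an initial window. Both helper names are hypothetical, and taking the last two labels is a simplifying assumption: it ignores multi-label public suffixes such as `co.uk`, and it is not claimed to be the paper's preprocessing.

```python
from collections import Counter

def second_level_domain(qname):
    """Naive SLD: the last two labels of a DNS name (assumption, see lead-in)."""
    labels = qname.rstrip(".").lower().split(".")
    return ".".join(labels[-2:])

def sld_counts(queries, window_s):
    """Count SLDs queried within window_s seconds of the first query.

    queries is a list of (timestamp_seconds, query_name) pairs.
    """
    if not queries:
        return Counter()
    start = min(ts for ts, _ in queries)
    return Counter(
        second_level_domain(name)
        for ts, name in queries
        if ts - start <= window_s
    )
```

The resulting counts could serve as a bag-of-domains input to a classifier; the network architecture itself is outside what this sketch covers.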
CR, Jul 16, 2021
Revisiting IoT Device Identification
Roman Kolcun, Diana Andreea Popescu, Vadim Safronov et al.
Internet-of-Things (IoT) devices are known to be the source of many security problems, and as such, they would greatly benefit from automated management. This requires robustly identifying devices so that appropriate network security policies can be applied. We address this challenge by exploring how to accurately identify IoT devices based on their network behavior, while leveraging approaches previously proposed by other researchers. We compare the accuracy of four different previously proposed machine learning models (tree-based and neural network-based) for identifying IoT devices. We use packet trace data collected over a period of six months from a large IoT test-bed. We show that, while all models achieve high accuracy when evaluated on the same dataset as they were trained on, their accuracy degrades over time, when evaluated on data collected outside the training set. We show that on average the models' accuracy degrades after a couple of weeks by up to 40 percentage points (on average between 12 and 21 percentage points). We argue that, in order to keep the models' accuracy at a high level, these need to be continuously updated.
NI, Nov 17, 2020
The Case for Retraining of ML Models for IoT Device Identification at the Edge
Roman Kolcun, Diana Andreea Popescu, Vadim Safronov et al.
Internet-of-Things (IoT) devices are known to be the source of many security problems, and as such they would greatly benefit from automated management. This requires robustly identifying devices so that appropriate network security policies can be applied. We address this challenge by exploring how to accurately identify IoT devices based on their network behavior, using resources available at the edge of the network. In this paper, we compare the accuracy of five different machine learning models (tree-based and neural network-based) for identifying IoT devices by using packet trace data from a large IoT test-bed, showing that all models need to be updated over time to avoid significant degradation in accuracy. In order to effectively update the models, we find that it is necessary to use data gathered from the deployment environment, e.g., the household. We therefore evaluate our approach using hardware resources and data sources representative of those that would be available at the edge of the network, such as in an IoT deployment. We show that updating neural network-based models at the edge is feasible, as they require low computational and memory resources and their structure is amenable to being updated. Our results show that it is possible to achieve device identification and categorization with over 80% and 90% accuracy respectively at the edge.
NI, Mar 16, 2020
Towards Automatic Identification and Blocking of Non-Critical IoT Traffic Destinations
Anna Maria Mandalari, Roman Kolcun, Hamed Haddadi et al.
The consumer Internet of Things (IoT) space has experienced a significant rise in popularity in recent years. From smart speakers to baby monitors, smart kettles, and TVs, these devices are increasingly found in households around the world, while users may be unaware of the risks associated with owning them. Previous work showed that these devices can threaten individuals' privacy and security by exposing information online to a large number of service providers and third-party analytics services. Our analysis shows that many of these Internet connections (and the information they expose) are neither critical, nor even essential to the operation of these devices. However, automatically separating out critical from non-critical network traffic for an IoT device is nontrivial, and requires expert analysis based on manual experimentation in a controlled setting. In this paper, we investigate whether it is possible to automatically classify network traffic destinations as either critical (essential for devices to function properly) or not, hence allowing the home gateway to act as a selective firewall to block undesired, non-critical destinations. Our initial results demonstrate that some IoT devices contact destinations that are not critical to their operation, and there is no impact on device functionality if these destinations are blocked. We take the first steps towards designing and evaluating IoTrimmer, a framework for automated testing and analysis of various destinations contacted by devices, and selectively blocking the ones that do not impact device functionality.