CR Sep 21, 2019 Code
IoT Inspector: Crowdsourcing Labeled Network Traffic from Smart Home Devices at Scale
Danny Yuxing Huang, Noah Apthorpe, Gunes Acar et al.
The proliferation of smart home devices has created new opportunities for empirical research in ubiquitous computing, ranging from security and privacy to personal health. Yet, data from smart home deployments are hard to come by, and existing empirical studies of smart home devices typically involve only a small number of devices in lab settings. To contribute to data-driven smart home research, we crowdsource the largest known dataset of labeled network traffic from smart home devices from within real-world home networks. To do so, we developed and released IoT Inspector, an open-source tool that allows users to observe the traffic from smart home devices on their own home networks. Since April 2019, 4,322 users have installed IoT Inspector, allowing us to collect labeled network traffic from 44,956 smart home devices across 13 categories and 53 vendors. We demonstrate how this data enables new research into smart homes through two case studies focused on security and privacy. First, we find that many device vendors use outdated TLS versions and advertise weak ciphers. Second, we discover about 350 distinct third-party advertiser and tracking domains on smart TVs. We also highlight other research areas, such as network management and healthcare, that can take advantage of IoT Inspector's dataset. To facilitate future reproducible research in smart homes, we will release the IoT Inspector data to the public.
1.2 LG May 2
From Packets to Patterns: Interpreting Encrypted Network Traffic as Longitudinal Behavioral Signals
Rameen Mahmood, Omar El Shahawy, Souptik Barua et al.
Human behavior is difficult to observe continuously at scale, yet it leaves measurable traces in everyday device use. We test whether encrypted smartphone network traffic -- a ubiquitous, always-on, passive sensing modality -- can passively capture behavioral patterns related to sleep, stress, and loneliness. We model shared behavioral structure using a transformer backbone with per-user adapters, allowing the model to represent both typical individual behavior and deviations from it. To make these representations interpretable, we apply a sparse autoencoder to extract behavioral features corresponding to distinct patterns of activity. We relate these features to sleep disturbance, stress, and loneliness using generalized estimating equations with Mundlak decomposition, separating between-person differences from within-person changes over time. We find that the three outcomes reflect distinct temporal structures: stress is primarily associated with stable between-person differences, loneliness with within-person variation, and sleep disturbance with a combination of both. Notably, these within-person dynamics are not captured by predefined network-traffic features, demonstrating the value of learned representations for longitudinal behavioral sensing. These results establish encrypted network traffic as a viable passive sensing modality, revealing interpretable behavioral dynamics -- particularly deviations from an individual's baseline -- that are not visible in raw traffic features.
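The between-person versus within-person split mentioned in the abstract can be illustrated with a small sketch. This is not the authors' code: it assumes the standard form of the Mundlak decomposition, in which each time-varying value is replaced by its per-user mean (the between-person part) and its deviation from that mean (the within-person part). The function name `mundlak_decompose` and the toy data are hypothetical.

```python
from collections import defaultdict


def mundlak_decompose(rows):
    """Split each observation into a between-person and a within-person part.

    rows: list of (user_id, value) pairs (hypothetical input layout).
    Returns a list of (user_id, user_mean, value - user_mean) in input order.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for user, value in rows:
        totals[user] += value
        counts[user] += 1
    # Per-user mean: the stable, between-person component.
    means = {user: totals[user] / counts[user] for user in totals}
    # Deviation from the user's own mean: the within-person component.
    return [(user, means[user], value - means[user]) for user, value in rows]


# Toy data: user "a" varies over time, user "b" does not.
rows = [("a", 2.0), ("a", 4.0), ("b", 10.0), ("b", 10.0)]
result = mundlak_decompose(rows)
for user, between, within in result:
    print(user, between, within)
```

In a regression, both components would enter as separate covariates, so that one coefficient reflects differences between people and the other reflects changes within a person over time.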
CR Nov 21, 2024
Learned, Lagged, LLM-splained: LLM Responses to End User Security Questions
Vijay Prakash, Kevin Lee, Arkaprabha Bhattacharya et al.
Answering end user security questions is challenging. While large language models (LLMs) like GPT, LLAMA, and Gemini are far from error-free, they have shown promise in answering a variety of questions outside of security. We studied LLM performance in the area of end user security by qualitatively evaluating 3 popular LLMs on 900 systematically collected end user security questions. While LLMs demonstrate broad generalist "knowledge" of end user security information, there are patterns of errors and limitations across LLMs consisting of stale and inaccurate answers, and indirect or unresponsive communication styles, all of which impact the quality of information received. Based on these patterns, we suggest directions for model improvement and recommend user strategies for interacting with LLMs when seeking assistance with security.
LG Sep 24, 2025
Large Language Models for Real-World IoT Device Identification
Rameen Mahmood, Tousif Ahmed, Sai Teja Peddinti et al.
The rapid expansion of IoT devices has outpaced current identification methods, creating significant risks for security, privacy, and network accountability. These challenges are heightened in open-world environments, where traffic metadata is often incomplete, noisy, or intentionally obfuscated. We introduce a semantic inference pipeline that reframes device identification as a language modeling task over heterogeneous network metadata. To construct reliable supervision, we generate high-fidelity vendor labels for the IoT Inspector dataset, the largest real-world IoT traffic corpus, using an ensemble of large language models guided by mutual-information and entropy-based stability scores. We then instruction-tune a quantized LLaMA 3.1 8B model with curriculum learning to support generalization under sparsity and long-tail vendor distributions. Our model achieves 98.25% top-1 accuracy and 90.73% macro accuracy across 2,015 vendors while maintaining resilience to missing fields, protocol drift, and adversarial manipulation. Evaluation on an independent IoT testbed, coupled with explanation quality and adversarial stress tests, demonstrates that instruction-tuned LLMs provide a scalable and interpretable foundation for real-world device identification at scale.
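The abstract does not define its entropy-based stability score, so the following is only one plausible reading, offered as an illustration. It assumes that each device receives one vendor label per ensemble member and scores agreement as one minus the Shannon entropy of the votes, normalized by the maximum entropy for that number of votes. The function name `vote_stability` and the example labels are hypothetical, and the mutual-information component is not modeled here.

```python
import math
from collections import Counter


def vote_stability(labels):
    """Return (majority_label, stability) for one device's ensemble labels.

    Assumed definition: stability = 1 - H / log(n), where H is the Shannon
    entropy of the vote distribution and n is the number of votes.
    1.0 means every ensemble member agrees; 0.0 means every vote differs.
    """
    if not labels:
        raise ValueError("need at least one label")
    counts = Counter(labels)
    n = len(labels)
    majority = counts.most_common(1)[0][0]
    if n == 1:
        return majority, 1.0  # a single vote cannot disagree with itself
    entropy = -sum((c / n) * math.log(c / n) for c in counts.values())
    return majority, 1.0 - entropy / math.log(n)


agree = vote_stability(["Amazon", "Amazon", "Amazon", "Amazon"])
split = vote_stability(["Amazon", "Amazon", "Google", "Amazon"])
chaos = vote_stability(["Amazon", "Google", "Belkin", "Wyze"])
print(agree, split, chaos)
```

A labeling pipeline could then keep only devices whose score exceeds a chosen threshold; where to set that threshold is a design choice the abstract does not specify.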
MA Feb 5, 2021
SkillBot: Identifying Risky Content for Children in Alexa Skills
Tu Le, Danny Yuxing Huang, Noah Apthorpe et al.
Many households include children who use voice personal assistants (VPA) such as Amazon Alexa. Children benefit from the rich functionalities of VPAs and third-party apps but are also exposed to new risks in the VPA ecosystem. In this paper, we first investigate "risky" child-directed voice apps that contain inappropriate content or ask for personal information through voice interactions. We build SkillBot, a natural language processing (NLP)-based system, to automatically interact with VPA apps and analyze the resulting conversations. We find 28 risky child-directed apps and maintain a growing dataset of 31,966 non-overlapping app behaviors collected from 3,434 Alexa apps. Our findings suggest that although child-directed VPA apps are subject to stricter policy requirements and more intensive vetting, children remain vulnerable to inappropriate content and privacy violations. We then conduct a user study showing that parents are concerned about the identified risky apps. Many parents do not believe that these apps are available and designed for families/kids, although these apps are actually published in Amazon's "Kids" product category. We also find that parents often neglect basic precautions such as enabling parental controls on Alexa devices. Finally, we identify a novel risk in the VPA ecosystem: confounding utterances, or voice commands shared by multiple apps that may cause a user to interact with a different app than intended. We identify 4,487 confounding utterances, including 581 shared by child-directed and non-child-directed apps. We find that 27% of these confounding utterances prioritize invoking a non-child-directed app over a child-directed app. This indicates that children are at real risk of accidentally invoking non-child-directed apps due to confounding utterances.
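The definition of a confounding utterance given above (a voice command shared by multiple apps) lends itself to a short sketch. This is not SkillBot's implementation: the record layout (`name`, `child_directed`, `utterances`), the case-insensitive matching, and the sample apps are all assumptions made for illustration.

```python
from collections import defaultdict


def find_confounding(apps):
    """Group apps by normalized utterance and report the shared ones.

    apps: list of dicts with hypothetical keys 'name', 'child_directed',
    and 'utterances'. Returns (confounding, mixed), where confounding maps
    each shared utterance to the names of the apps that accept it, and
    mixed lists utterances shared across child-directed and other apps.
    """
    by_utterance = defaultdict(list)
    for app in apps:
        # Normalize before de-duplicating so one app is counted once.
        for utt in {u.strip().lower() for u in app["utterances"]}:
            by_utterance[utt].append(app)
    shared = {u: a for u, a in by_utterance.items() if len(a) > 1}
    confounding = {u: sorted(x["name"] for x in a) for u, a in shared.items()}
    mixed = sorted(
        u for u, a in shared.items()
        if len({x["child_directed"] for x in a}) == 2
    )
    return confounding, mixed


apps = [
    {"name": "Kids Quiz", "child_directed": True,
     "utterances": ["start quiz", "Play a game"]},
    {"name": "Trivia Night", "child_directed": False,
     "utterances": ["play a game"]},
    {"name": "Story Time", "child_directed": True,
     "utterances": ["start quiz"]},
]
confounding, mixed = find_confounding(apps)
print(confounding)
print(mixed)
```

Which app actually wins when such a command is spoken depends on the platform's routing, which this sketch does not model.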
HC Oct 30, 2019
Alexa, Who Am I Speaking To? Understanding Users' Ability to Identify Third-Party Apps on Amazon Alexa
David J. Major, Danny Yuxing Huang, Marshini Chetty et al.
Many Internet of Things (IoT) devices have voice user interfaces (VUIs). One of the most popular VUIs is Amazon's Alexa, which supports more than 47,000 third-party applications ("skills"). We study how Alexa's integration of these skills may confuse users. Our survey of 237 participants found that users do not understand that skills are often operated by third parties, that they often confuse third-party skills with native Alexa functions, and that they are unaware of the functions that the native Alexa system supports. Surprisingly, users who interact with Alexa more frequently are more likely to conclude that a third-party skill is native Alexa functionality. The potential for misunderstanding creates new security and privacy risks: attackers can develop third-party skills that operate without users' knowledge or masquerade as native Alexa functions. To mitigate this threat, we make design recommendations to help users distinguish native and third-party skills.
GT Feb 26, 2019
Selling a Single Item with Negative Externalities
Tithi Chattopadhyay, Nick Feamster, Matheus V. X. Ferreira et al.
We consider the problem of regulating products with negative externalities to a third party that is neither the buyer nor the seller, but where both the buyer and seller can take steps to mitigate the externality. The motivating example to have in mind is the sale of Internet-of-Things (IoT) devices, many of which have historically been compromised for DDoS attacks that disrupted Internet-wide services such as Twitter. Neither the buyer (i.e., consumers) nor the seller (i.e., IoT manufacturers) was known to suffer from the attack, but both have the power to expend effort to secure their devices. We consider a regulator who regulates payments (via fines if the device is compromised, or market prices directly), or the product directly via mandatory security requirements. Both regulations come at a cost: implementing security requirements increases production costs, and the existence of fines decreases consumers' values, thereby reducing the seller's profits. The focus of this paper is to understand the efficiency of various regulatory policies. That is, policy A is more efficient than policy B if A more successfully minimizes negative externalities, while both A and B reduce the seller's profits equally. We develop a simple model to capture the impact of regulatory policies on a buyer's behavior. In this model, we show that for homogeneous markets, where the buyer's ability to follow security practices is always high or always low, the optimal (externality-minimizing for a given profit constraint) regulatory policy need regulate only payments or production. In arbitrary markets, by contrast, we show that while the optimal policy may require regulating both aspects, there is always an approximately optimal policy which regulates just one.
CR Dec 3, 2018
Keeping the Smart Home Private with Smart(er) IoT Traffic Shaping
Noah Apthorpe, Danny Yuxing Huang, Dillon Reisman et al.
The proliferation of smart home Internet of Things (IoT) devices presents unprecedented challenges for preserving privacy within the home. In this paper, we demonstrate that a passive network observer (e.g., an Internet service provider) can infer private in-home activities by analyzing Internet traffic from commercially available smart home devices even when the devices use end-to-end transport-layer encryption. We evaluate common approaches for defending against these types of traffic analysis attacks, including firewalls, virtual private networks, and independent link padding, and find that none sufficiently conceal user activities with reasonable data overhead. We develop a new defense, "stochastic traffic padding" (STP), that makes it difficult for a passive network adversary to reliably distinguish genuine user activities from generated traffic patterns designed to look like user interactions. Our analysis provides a theoretical bound on an adversary's ability to accurately detect genuine user activities as a function of the amount of additional cover traffic generated by the defense technique.