Noah Apthorpe

h-index15

13papers

1,125citations

Novelty42%

AI Score44

Ranked #49,867 of 194,257 authors (top 26%)#1,107 in CR (top 16%)

13 Papers

3.3CYNov 3, 2023Code

Automating Governing Knowledge Commons and Contextual Integrity (GKC-CI) Privacy Policy Annotations with Large Language Models

Jake Chanenson, Madison Pickering, Noah Apthorpe

Identifying contextual integrity (CI) and governing knowledge commons (GKC) parameters in privacy policy texts can facilitate normative privacy analysis. However, GKC-CI annotation has heretofore required manual or crowdsourced effort. This paper demonstrates that high-accuracy GKC-CI parameter annotation of privacy policies can be performed automatically using large language models. We fine-tune 50 open-source and proprietary models on 21,588 ground truth GKC-CI annotations from 16 privacy policies. Our best performing model has an accuracy of 90.65%, which is comparable to the accuracy of experts on the same task. We apply our best performing model to 456 privacy policies from a variety of online services, demonstrating the effectiveness of scaling GKC-CI annotation for privacy policy exploration and analysis. We publicly release our model training code, training and testing data, an annotation visualizer, and all annotated policies for future GKC-CI research.

7.0HCApr 7

Learning Password Best Practices Through In-Task Instruction

Qian Ma, Yingfan Zhou, Shubhang Kaushik et al.

Users often make security- and privacy-relevant decisions without a clear understanding of the rules that govern safe behavior. We introduce pedagogical friction, a design approach that inserts brief, instructional interactions at the moment of action. We evaluate this approach in the context of password creation, a familiar task with clear quality criteria. We conducted a randomized study with 128 participants across four interface conditions that varied the depth and interactivity of guidance. We assessed three outcomes: (1) rule compliance in a subsequent password task without guidance, (2) accuracy on survey questions tied to password rules, and (3) behavior-knowledge alignment, which captures whether participants who correctly followed a rule also recognized it on the survey. Across the guided conditions, participants corrected most rule violations in the follow-up task and showed high behavior-knowledge alignment. Survey results suggested clearer advantages for some rule types, especially symbol related questions. These results position pedagogical friction as a lightweight intervention for security- and privacy-critical interfaces.

20.5CRSep 21, 2019Code

IoT Inspector: Crowdsourcing Labeled Network Traffic from Smart Home Devices at Scale

Danny Yuxing Huang, Noah Apthorpe, Gunes Acar et al.

The proliferation of smart home devices has created new opportunities for empirical research in ubiquitous computing, ranging from security and privacy to personal health. Yet, data from smart home deployments are hard to come by, and existing empirical studies of smart home devices typically involve only a small number of devices in lab settings. To contribute to data-driven smart home research, we crowdsource the largest known dataset of labeled network traffic from smart home devices from within real-world home networks. To do so, we developed and released IoT Inspector, an open-source tool that allows users to observe the traffic from smart home devices on their own home networks. Since April 2019, 4,322 users have installed IoT Inspector, allowing us to collect labeled network traffic from 44,956 smart home devices across 13 categories and 53 vendors. We demonstrate how this data enables new research into smart homes through two case studies focused on security and privacy. First, we find that many device vendors use outdated TLS versions and advertise weak ciphers. Second, we discover about 350 distinct third-party advertiser and tracking domains on smart TVs. We also highlight other research areas, such as network management and healthcare, that can take advantage of IoT Inspector's dataset. To facilitate future reproducible research in smart homes, we will release the IoT Inspector data to the public.

1.2NISep 30, 2021Code

Automating Internet of Things Network Traffic Collection with Robotic Arm Interactions

Xi Jiang, Noah Apthorpe

Consumer Internet of things research often involves collecting network traffic sent or received by IoT devices. These data are typically collected via crowdsourcing or while researchers manually interact with IoT devices in a laboratory setting. However, manual interactions and crowdsourcing are often tedious, expensive, inaccurate, or do not provide comprehensive coverage of possible IoT device behaviors. We present a new method for generating IoT network traffic using a robotic arm to automate user interactions with devices. This eliminates manual button pressing and enables permutation-based interaction sequences that rigorously explore the range of possible device behaviors. We test this approach with an Arduino-controlled robotic arm, a smart speaker, and a smart thermostat, using machine learning to demonstrate that collected network traffic contains information about device interactions that could be useful for network, security, or privacy analyses. We also provide source code and documentation allowing researchers to easily automate IoT device interactions and network traffic collection in future studies.

2.3MAFeb 5, 2021

SkillBot: Identifying Risky Content for Children in Alexa Skills

Tu Le, Danny Yuxing Huang, Noah Apthorpe et al.

Many households include children who use voice personal assistants (VPA) such as Amazon Alexa. Children benefit from the rich functionalities of VPAs and third-party apps but are also exposed to new risks in the VPA ecosystem. In this paper, we first investigate "risky" child-directed voice apps that contain inappropriate content or ask for personal information through voice interactions. We build SkillBot - a natural language processing (NLP)-based system to automatically interact with VPA apps and analyze the resulting conversations. We find 28 risky child-directed apps and maintain a growing dataset of 31,966 non-overlapping app behaviors collected from 3,434 Alexa apps. Our findings suggest that although child-directed VPA apps are subject to stricter policy requirements and more intensive vetting, children remain vulnerable to inappropriate content and privacy violations. We then conduct a user study showing that parents are concerned about the identified risky apps. Many parents do not believe that these apps are available and designed for families/kids, although these apps are actually published in Amazon's "Kids" product category. We also find that parents often neglect basic precautions such as enabling parental controls on Alexa devices. Finally, we identify a novel risk in the VPA ecosystem: confounding utterances, or voice commands shared by multiple apps that may cause a user to interact with a different app than intended. We identify 4,487 confounding utterances, including 581 shared by child-directed and non-child-directed apps. We find that 27% of these confounding utterances prioritize invoking a non-child-directed app over a child-directed app. This indicates that children are at real risk of accidentally invoking non-child-directed apps due to confounding utterances.

3.3HCJan 28, 2020

You, Me, and IoT: How Internet-Connected Consumer Devices Affect Interpersonal Relationships

Noah Apthorpe, Pardis Emami-Naeini, Arunesh Mathur et al.

Internet-connected consumer devices have rapidly increased in popularity; however, relatively little is known about how these technologies are affecting interpersonal relationships in multi-occupant households. In this study, we conduct 13 semi-structured interviews and survey 508 individuals from a variety of backgrounds to discover and categorize how consumer IoT devices are affecting interpersonal relationships in the United States. We highlight several themes, providing exploratory data about the pervasiveness of interpersonal costs and benefits of consumer IoT devices. These results inform follow-up studies and design priorities for future IoT technologies to amplify positive and reduce negative interpersonal effects.

8.5CRMay 7, 2018

Security and Privacy Analyses of Internet of Things Children's Toys

Gordon Chu, Noah Apthorpe, Nick Feamster

This paper investigates the security and privacy of Internet-connected children's smart toys through case studies of three commercially-available products. We conduct network and application vulnerability analyses of each toy using static and dynamic analysis techniques, including application binary decompilation and network monitoring. We discover several publicly undisclosed vulnerabilities that violate the Children's Online Privacy Protection Rule (COPPA) as well as the toys' individual privacy policies. These vulnerabilities, especially security flaws in network communications with first-party servers, are indicative of a disconnect between many IoT toy developers and security and privacy best practices despite increased attention to Internet-connected toy hacking risks.

8.5CRMay 7, 2018

Detecting Compressed Cleartext Traffic from Consumer Internet of Things Devices

Daniel Hahn, Noah Apthorpe, Nick Feamster

Data encryption is the primary method of protecting the privacy of consumer device Internet communications from network observers. The ability to automatically detect unencrypted data in network traffic is therefore an essential tool for auditing Internet-connected devices. Existing methods identify network packets containing cleartext but cannot differentiate packets containing encrypted data from packets containing compressed unencrypted data, which can be easily recovered by reversing the compression algorithm. This makes it difficult for consumer protection advocates to identify devices that risk user privacy by sending sensitive data in a compressed unencrypted format. Here, we present the first technique to automatically distinguish encrypted from compressed unencrypted network transmissions on a per-packet basis. We apply three machine learning models and achieve a maximum 66.9% accuracy with a convolutional neural network trained on raw packet data. This result is a baseline for this previously unstudied machine learning problem, which we hope will motivate further attention and accuracy improvements. To facilitate continuing research on this topic, we have made our training and test datasets available to the public.

13.2CRMar 27, 2018

Cleartext Data Transmissions in Consumer IoT Medical Devices

Daniel Wood, Noah Apthorpe, Nick Feamster

This paper introduces a method to capture network traffic from medical IoT devices and automatically detect cleartext information that may reveal sensitive medical conditions and behaviors. The research follows a three-step approach involving traffic collection, cleartext detection, and metadata analysis. We analyze four popular consumer medical IoT devices, including one smart medical device that leaks sensitive health information in cleartext. We also present a traffic capture and analysis system that seamlessly integrates with a home network and offers a user-friendly interface for consumers to monitor and visualize data transmissions of IoT devices in their homes.

27.3CRAug 16, 2017

Spying on the Smart Home: Privacy Attacks and Defenses on Encrypted IoT Traffic

Noah Apthorpe, Dillon Reisman, Srikanth Sundaresan et al.

The growing market for smart home IoT devices promises new conveniences for consumers while presenting new challenges for preserving privacy within the home. Many smart home devices have always-on sensors that capture users' offline activities in their living spaces and transmit information about these activities on the Internet. In this paper, we demonstrate that an ISP or other network observer can infer privacy sensitive in-home activities by analyzing Internet traffic from smart homes containing commercially-available IoT devices even when the devices use encryption. We evaluate several strategies for mitigating the privacy risks associated with smart home device traffic, including blocking, tunneling, and rate-shaping. Our experiments show that traffic shaping can effectively and practically mitigate many privacy risks associated with smart home IoT devices. We find that 40KB/s extra bandwidth usage is enough to protect user activities from a passive network adversary. This bandwidth cost is well within the Internet speed limits and data caps for many smart homes.

12.3CRMay 18, 2017

Closing the Blinds: Four Strategies for Protecting Smart Home Privacy from Network Observers

Noah Apthorpe, Dillon Reisman, Nick Feamster

The growing market for smart home IoT devices promises new conveniences for consumers while presenting novel challenges for preserving privacy within the home. Specifically, Internet service providers or neighborhood WiFi eavesdroppers can measure Internet traffic rates from smart home devices and infer consumers' private in-home behaviors. Here we propose four strategies that device manufacturers and third parties can take to protect consumers from side-channel traffic rate privacy threats: 1) blocking traffic, 2) concealing DNS, 3) tunneling traffic, and 4) shaping and injecting traffic. We hope that these strategies, and the implementation nuances we discuss, will provide a foundation for the future development of privacy-sensitive smart homes.

24.2CRMay 18, 2017

A Smart Home is No Castle: Privacy Vulnerabilities of Encrypted IoT Traffic

Noah Apthorpe, Dillon Reisman, Nick Feamster

The increasing popularity of specialized Internet-connected devices and appliances, dubbed the Internet-of-Things (IoT), promises both new conveniences and new privacy concerns. Unlike traditional web browsers, many IoT devices have always-on sensors that constantly monitor fine-grained details of users' physical environments and influence the devices' network communications. Passive network observers, such as Internet service providers, could potentially analyze IoT network traffic to infer sensitive details about users. Here, we examine four IoT smart home devices (a Sense sleep monitor, a Nest Cam Indoor security camera, a WeMo switch, and an Amazon Echo) and find that their network traffic rates can reveal potentially sensitive user interactions even when the traffic is encrypted. These results indicate that a technological solution is needed to protect IoT device owner privacy, and that IoT-specific concerns must be considered in the ongoing policy debate around ISP data collection and usage.

4.3NCJun 23, 2016Code

Automatic Neuron Detection in Calcium Imaging Data Using Convolutional Networks

Noah J. Apthorpe, Alexander J. Riordan, Rob E. Aguilar et al.

Calcium imaging is an important technique for monitoring the activity of thousands of neurons simultaneously. As calcium imaging datasets grow in size, automated detection of individual neurons is becoming important. Here we apply a supervised learning approach to this problem and show that convolutional networks can achieve near-human accuracy and superhuman speed. Accuracy is superior to the popular PCA/ICA method based on precision and recall relative to ground truth annotation by a human expert. These results suggest that convolutional networks are an efficient and flexible tool for the analysis of large-scale calcium imaging data.