Peter Snyder

12 papers

231 citations

Novelty: 54%

AI Score: 29

Ranked #151,555 of 201,326 authors (top 75%); #4,285 in CR (top 59%)

12 Papers

CR | Dec 12, 2021 | Code

Pool-Party: Exploiting Browser Resource Pools as Side-Channels for Web Tracking

Peter Snyder, Soroush Karami, Arthur Edelstein et al.

We identify a class of covert channels in browsers that are not mitigated by current defenses, which we call "pool-party" attacks. Pool-party attacks allow sites to create covert channels by manipulating limited-but-unpartitioned resource pools. This class of attacks has been known, but in this work we show that these attacks are more prevalent, more practical for exploitation, and allow exploitation in more ways than previously identified. These covert channels have sufficient bandwidth to pass cookies and identifiers across site boundaries under practical and real-world conditions. We identify pool-party attacks in all popular browsers, and show they are practical cross-site tracking techniques (i.e., attacks take 0.6s in Chrome and Edge, and 7s in Firefox and Tor Browser). In this paper we make the following contributions: first, we describe pool-party covert channel attacks that exploit limits in application-layer resource pools in browsers. Second, we demonstrate that pool-party attacks are practical, and can be used to track users in all popular browsers; we also share open source implementations of the attack and evaluate them through a representative web crawl. Third, we show that in Gecko-based browsers (including the Tor Browser) pool-party attacks can also be used for cross-profile tracking (e.g., linking user behavior across normal and private browsing sessions). Finally, we discuss possible mitigation strategies and defenses.
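As a rough illustration of the idea in this abstract, the toy model below shows how a limited, unpartitioned pool can carry one bit per round between two parties that share nothing else. It is a minimal sketch and not the authors' implementation: the `ResourcePool` class, its capacity, and the `send_bit`/`receive_bit` helpers are hypothetical names, and the model ignores the noise, timing, and synchronization issues a real browser attack would face.

```python
class ResourcePool:
    """Hypothetical model of a global pool with a fixed number of slots."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.in_use = 0

    def acquire(self):
        """Take one slot; return False if the pool is exhausted."""
        if self.in_use >= self.capacity:
            return False
        self.in_use += 1
        return True

    def release(self):
        """Return one slot to the pool."""
        if self.in_use > 0:
            self.in_use -= 1


def send_bit(pool, bit):
    """Sender: exhaust the pool to signal 1, leave it untouched to signal 0.

    Returns the number of slots the sender is now holding.
    """
    held = 0
    if bit:
        while pool.acquire():
            held += 1
    return held


def receive_bit(pool):
    """Receiver: a failed acquisition means the pool is exhausted, read as 1."""
    if pool.acquire():
        pool.release()
        return 0
    return 1


def transmit(pool, bits):
    """Run one send/receive round per bit and return what the receiver saw."""
    received = []
    for bit in bits:
        held = send_bit(pool, bit)
        received.append(receive_bit(pool))
        for _ in range(held):
            pool.release()  # sender frees its slots before the next round
    return received
```

Calling `transmit(ResourcePool(4), [1, 0, 1])` passes the bit sequence through the shared pool. Partitioning the pool per site, one of the kinds of defense the abstract alludes to, would break the channel in this model because the sender could no longer exhaust the receiver's slots.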

CR | Sep 21, 2021 | Code

STAR: Secret Sharing for Private Threshold Aggregation Reporting

Alex Davidson, Peter Snyder, E. B. Quirk et al.

Threshold aggregation reporting systems promise a practical, privacy-preserving solution for developers to learn how their applications are used "in-the-wild". Unfortunately, proposed systems to date prove impractical for wide scale adoption, suffering from a combination of requiring: (i) prohibitive trust assumptions; (ii) high computation costs; or (iii) massive user bases. As a result, adoption of truly-private approaches has been limited to only a small number of enormous (and enormously costly) projects. In this work, we improve the state of private data collection by proposing STAR, a highly efficient, easily deployable system for providing cryptographically-enforced κ-anonymity protections on user data collection. The STAR protocol is easy to implement and cheap to run, all while providing privacy properties similar to, or exceeding the current state-of-the-art. Measurements of our open-source implementation of STAR find that it is 1773× quicker, requires 62.4× less communication, and is 24× cheaper to run than the existing state-of-the-art.
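The abstract does not spell out the protocol, so the following is only a generic sketch of threshold secret sharing, the primitive named in the title, showing how a secret can stay hidden until at least κ matching reports arrive. Everything here is an assumption made for illustration: the field modulus, the hash-based derivation of polynomial coefficients from the reported value, and the function names. It should not be read as the STAR construction itself, and deriving secrets from a plain hash of a guessable value would not be safe in practice.

```python
import hashlib
import secrets

PRIME = 2**127 - 1  # Mersenne prime used as the field modulus (assumption)


def _coefficients(value, threshold):
    """Derive polynomial coefficients deterministically from the value.

    Clients reporting the same value therefore share one polynomial
    (illustrative assumption, not taken from the abstract).
    """
    return [
        int.from_bytes(hashlib.sha256(f"{i}:{value}".encode()).digest(), "big") % PRIME
        for i in range(threshold)
    ]


def expected_secret(value, threshold):
    """The secret is the polynomial's constant term."""
    return _coefficients(value, threshold)[0]


def make_share(value, threshold):
    """Client side: evaluate the polynomial at a random nonzero point."""
    coeffs = _coefficients(value, threshold)
    x = secrets.randbelow(PRIME - 1) + 1
    y = 0
    for c in reversed(coeffs):  # Horner's rule
        y = (y * x + c) % PRIME
    return x, y


def recover_secret(shares):
    """Server side: Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

With a threshold of 3, any three shares produced by `make_share("example.com", 3)` interpolate to `expected_secret("example.com", 3)`, while two shares give the server an unrelated field element. The modular inverse via `pow(den, -1, PRIME)` requires Python 3.8 or later.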

CR | Nov 2, 2020 | Code

There's No Trick, Its Just a Simple Trick: A Web-Compat and Privacy Improving Approach to Third-party Web Storage

Jordan Jueckstock, Peter Snyder, Shaown Sarker et al.

While much current web privacy research focuses on browser fingerprinting, the boring fact is that the majority of current third-party web tracking is conducted using traditional, persistent-state identifiers. One possible explanation for the privacy community's focus on fingerprinting is that to date browsers have faced a lose-lose dilemma when dealing with third-party stateful identifiers: block state in third-party frames and break a significant number of webpages, or allow state in third-party frames and enable pervasive tracking. The alternative, middle-ground solutions that have been deployed all trade privacy for compatibility, rely on manually curated lists, or depend on the user to manage state and state-access themselves. This work furthers privacy on the web by presenting a novel system for managing the lifetime of third-party storage, "page-length storage". We compare page-length storage to existing approaches for managing third-party state and find that page-length storage has the privacy protections of the most restrictive current option (i.e., blocking third-party storage) but web-compatibility properties mostly similar to the least restrictive option (i.e., allowing all third-party storage). This work further compares page-length storage to an alternative third-party storage partitioning scheme and finds that page-length storage provides superior privacy protections with comparable web-compatibility. We provide a dataset of the privacy and compatibility behaviors observed when applying the compared third-party storage strategies on a crawl of the Tranco 1k and the quantitative metrics used to demonstrate that page-length storage matches or surpasses existing approaches. Finally, we provide an open-source implementation of our page-length storage approach, implemented as patches against Chromium.

CR | Sep 14, 2021

Security, Privacy, and Decentralization in Web3

Philipp Winter, Anna Harbluk Lorimer, Peter Snyder et al.

Much of the recent excitement around decentralized finance (DeFi) comes from hopes that DeFi can be a secure, private, less centralized alternative to traditional finance systems. However, people moving to DeFi sites in hopes of improving their security and privacy may end up with less of both, as recent attacks have demonstrated. In this work, we improve the understanding of DeFi by conducting the first Web measurements of the security, privacy, and decentralization properties of popular DeFi front ends. We find that DeFi applications -- or dapps -- suffer from the same security and privacy risks that frequent other parts of the Web, but those risks are greatly exacerbated considering the money that is involved in DeFi. Our results show that a common tracker can observe user behavior on over 56% of websites we analyzed and many trackers on DeFi sites can trivially link a user's Ethereum address with PII (e.g., user name or demographic information), or phish users by initiating fake Ethereum transactions. Lastly, we establish that despite claims to the contrary, because of companies like Amazon and Cloudflare operating significant Web infrastructure, DeFi as a whole is considerably less decentralized than previously believed.

CR | May 25, 2020

Improving Web Content Blocking With Event-Loop-Turn Granularity JavaScript Signatures

Quan Chen, Peter Snyder, Ben Livshits et al.

Content blocking is an important part of a performant, user-serving, privacy respecting web. Most content blockers build trust labels over URLs. While useful, this approach has well understood shortcomings. Attackers may avoid detection by changing URLs or domains, bundling unwanted code with benign code, or inlining code in pages. The common flaw in existing approaches is that they evaluate code based on its delivery mechanism, not its behavior. In this work we address this problem with a system for generating signatures of the privacy-and-security relevant behavior of executed JavaScript. Our system considers script behavior during each turn on the JavaScript event loop. Focusing on event loop turns allows us to build signatures that are robust against code obfuscation, code bundling, URL modification, and other common evasions, as well as handle unique aspects of web applications. This work makes the following contributions to improving content blocking: First, we implement a novel system to build per-event-loop-turn signatures of JavaScript code by instrumenting the Blink and V8 runtimes. Second, we apply these signatures to measure filter list evasion, by using EasyList and EasyPrivacy as ground truth and finding other code that behaves identically. We build ~2m signatures of privacy-and-security behaviors from 11,212 unique scripts blocked by filter lists, and find 3,589 more unique scripts including the same harmful code, affecting 12.48% of websites measured. Third, we taxonomize common filter list evasion techniques. Finally, we present defenses: filter list additions where possible, and a proposed, signature based system in other cases. We share the implementation of our signature-generation system, the dataset from applying our system to the Alexa 100K, and 586 AdBlock Plus compatible filter list rules to block instances of currently blocked code being moved to new URLs.

CR | Oct 16, 2019

Filter List Generation for Underserved Regions

Alexander Sjosten, Peter Snyder, Antonio Pastor et al.

Filter lists play a large and growing role in protecting and assisting web users. The vast majority of popular filter lists are crowd-sourced, where a large number of people manually label resources related to undesirable web resources (e.g., ads, trackers, paywall libraries), so that they can be blocked by browsers and extensions. Because only a small percentage of web users participate in the generation of filter lists, a crowd-sourcing strategy works well for blocking either uncommon resources that appear on "popular" websites, or resources that appear on a large number of "unpopular" websites. A crowd-sourcing strategy will perform poorly for parts of the web with small "crowds", such as regions of the web serving languages with (relatively) few speakers. This work addresses this problem through the combination of two novel techniques: (i) deep browser instrumentation that allows for the accurate generation of request chains, in a way that is robust in situations that confuse existing measurement techniques, and (ii) an ad classifier that uniquely combines perceptual and page-context features to remain accurate across multiple languages. We apply our unique two-step filter list generation pipeline to three regions of the web that currently have poorly maintained filter lists: Sri Lanka, Hungary, and Albania. We generate new filter lists that complement existing filter lists. Our complementary lists block an additional 3,349 ad and ad-related resources (1,771 unique) when applied to 6,475 pages targeting these three regions. We hope that this work can be part of an increased effort at ensuring that the security, privacy, and performance benefits of web resource blocking can be shared with all users, and not only those in dominant linguistic or economic regions.

CY | Feb 18, 2019

Keeping out the Masses: Understanding the Popularity and Implications of Internet Paywalls

Panagiotis Papadopoulos, Peter Snyder, Dimitrios Athanasakis et al.

Funding the production of quality online content is a pressing problem for content producers. The most common funding method, online advertising, is rife with well-known performance and privacy harms, and an intractable subject-agent conflict: many users do not want to see advertisements, depriving the site of needed funding. Because of these negative aspects of advertisement-based funding, paywalls are an increasingly popular alternative for websites. This shift to a "pay-for-access" web is one that has potentially huge implications for the web and society. Instead of a system where information (nominally) flows freely, paywalls create a web where high quality information is available to fewer and fewer people, leaving the rest of the web users with less information, that might also be less accurate and of lower quality. Despite the potential significance of a move from an "advertising-but-open" web to a "paywalled" web, we find this issue understudied. This work addresses this gap in our understanding by measuring how widely paywalls have been adopted, what kinds of sites use paywalls, and the distribution of policies enforced by paywalls. A partial list of our findings includes that (i) paywall use is accelerating (2x more paywalls every 6 months), (ii) paywall adoption differs by country (e.g. 18.75% in US, 12.69% in Australia), (iii) paywalls change how users interact with sites (e.g. higher bounce rates, fewer incoming links), (iv) the median cost of an annual paywall access is $108 per site, and (v) paywalls are in general trivial to circumvent. Finally, we present the design of a novel, automated system for detecting whether a site uses a paywall, through the combination of runtime browser instrumentation and repeated programmatic interactions with the site. We intend this classifier to augment future, longitudinal measurements of paywall use and behavior.

IR | Nov 8, 2018

SpeedReader: Reader Mode Made Fast and Private

Mohammad Ghasemisharif, Peter Snyder, Andrius Aucinas et al.

Most popular web browsers include "reader modes" that improve the user experience by removing un-useful page elements. Reader modes reformat the page to hide elements that are not related to the page's main content. Such page elements include site navigation, advertising related videos and images, and most JavaScript. The intended end result is that users can enjoy the content they are interested in, without distraction. In this work, we consider whether the "reader mode" can be widened to also provide performance and privacy improvements. Instead of its use as a post-render feature to clean up the clutter on a page we propose SpeedReader as an alternative multistep pipeline that is part of the rendering pipeline. Once the tool decides during the initial phase of a page load that a page is suitable for reader mode use, it directly applies document tree translation before the page is rendered. Based on our measurements, we believe that SpeedReader can be continuously enabled in order to drastically improve end-user experience, especially on slower mobile connections. Combined with our approach to predicting which pages should be rendered in reader mode with 91% accuracy, it achieves drastic speedups and bandwidth reductions of up to 27x and 84x respectively on average. We further find that our novel "reader mode" approach brings with it significant privacy improvements to users. Our approach effectively removes all commonly recognized trackers, issuing 115 fewer requests to third parties, and interacts with 64 fewer trackers on average, on transformed pages.

CR | Oct 22, 2018

Who Filters the Filters: Understanding the Growth, Usefulness and Efficiency of Crowdsourced Ad Blocking

Peter Snyder, Antoine Vastel, Benjamin Livshits

Ad and tracking blocking extensions are popular tools for improving web performance, privacy and aesthetics. Content blocking extensions typically rely on filter lists to decide whether a web request is associated with tracking or advertising, and so should be blocked. Millions of web users rely on filter lists to protect their privacy and improve their browsing experience. Despite their importance, the growth and health of filter lists are poorly understood. Filter lists are maintained by a small number of contributors, who use a variety of undocumented heuristics to determine what rules should be included. Lists quickly accumulate rules, and rules are rarely removed. As a result, users' browsing experiences are degraded as the number of stale, dead or otherwise not useful rules increasingly dwarfs the number of useful rules, with no attenuating benefit. An accumulation of "dead weight" rules also makes it difficult to apply filter lists on resource-limited mobile devices. This paper improves the understanding of crowdsourced filter lists by studying EasyList, the most popular filter list. We find that EasyList has grown from several hundred rules, to well over 60,000 rules, during its 9-year history. We measure how EasyList affects web browsing by applying EasyList to a sample of 10,000 websites. We find that 90.16% of the resource blocking rules in EasyList provide no benefit to users in common browsing scenarios. We further use our changes in EasyList application rates to provide a taxonomy of the ways advertisers evade EasyList rules. Finally, we propose optimizations for popular ad-blocking tools, that allow EasyList to be applied on performance constrained mobile devices, and improve desktop performance by 62.5%, while preserving over 99% of blocking coverage.

CY | May 22, 2018

AdGraph: A Graph-Based Approach to Ad and Tracker Blocking

Umar Iqbal, Peter Snyder, Shitong Zhu et al.

User demand for blocking advertising and tracking online is large and growing. Existing tools, both deployed and described in research, have proven useful, but lack either the completeness or robustness needed for a general solution. Existing detection approaches generally focus on only one aspect of advertising or tracking (e.g. URL patterns, code structure), making existing approaches susceptible to evasion. In this work we present AdGraph, a novel graph-based machine learning approach for detecting advertising and tracking resources on the web. AdGraph differs from existing approaches by building a graph representation of the HTML structure, network requests, and JavaScript behavior of a webpage, and using this unique representation to train a classifier for identifying advertising and tracking resources. Because AdGraph considers many aspects of the context a network request takes place in, it is less susceptible to the single-factor evasion techniques that flummox existing approaches. We evaluate AdGraph on the Alexa top-10K websites, and find that it is highly accurate, able to replicate the labels of human-generated filter lists with 95.33% accuracy, and can even identify many mistakes in filter lists. We implement AdGraph as a modification to Chromium. AdGraph adds only minor overhead to page loading and execution, and is actually faster than stock Chromium on 42% of websites and AdBlock Plus on 78% of websites. Overall, we conclude that AdGraph is both accurate enough and performant enough for online use, breaking comparable or fewer websites than popular filter list based approaches.

CR | Aug 28, 2017

Most Websites Don't Need to Vibrate: A Cost-Benefit Approach to Improving Browser Security

Peter Snyder, Cynthia Taylor, Chris Kanich

Modern web browsers have accrued an incredibly broad set of features since being invented for hypermedia dissemination in 1990. Many of these features benefit users by enabling new types of web applications. However, some features also bring risk to users' privacy and security, whether through implementation error, unexpected composition, or unintended use. Currently there is no general methodology for weighing these costs and benefits. Restricting access to only the features which are necessary for delivering desired functionality on a given website would allow users to enforce the principle of least privilege on use of the myriad APIs present in the modern web browser. However, security benefits gained by increasing restrictions must be balanced against the risk of breaking existing websites. This work addresses this problem with a methodology for weighing the costs and benefits of giving websites default access to each browser feature. We model the benefit as the number of websites that require the feature for some user-visible benefit, and the cost as the number of CVEs, lines of code, and academic attacks related to the functionality. We then apply this methodology to 74 Web API standards implemented in modern browsers. We find that allowing websites default access to large parts of the Web API poses significant security and privacy risks, with little corresponding benefit. We also introduce a configurable browser extension that allows users to selectively restrict access to low-benefit, high-risk features on a per site basis. We evaluated our extension with two hardened browser configurations, and found that blocking 15 of the 74 standards avoids 52.0% of code paths related to previous CVEs, and 50.0% of implementation code identified by our metric, without affecting the functionality of 94.7% of measured websites.

NI | May 20, 2016

Browser Feature Usage on the Modern Web

Peter Snyder, Lara Ansari, Cynthia Taylor et al.

Modern web browsers are incredibly complex, with millions of lines of code and over one thousand JavaScript functions and properties available to website authors. This work investigates how these browser features are used on the modern, open web. We find that JavaScript features differ wildly in popularity, with over 50% of provided features never used in the Alexa 10k. We also look at how popular ad and tracking blockers change the distribution of features used by sites, and identify a set of approximately 10% of features that are disproportionately blocked (prevented from executing by these extensions at least 90% of the time they are used). We additionally find that in the presence of these blockers, over 83% of available features are executed on less than 1% of the most popular 10,000 websites. We additionally measure a variety of aspects of browser feature usage on the web, including how complex sites have become in terms of feature usage, how the length of time a browser feature has been in the browser relates to its usage on the web, and how many security vulnerabilities have been associated with related browser features.