Dan S. Wallach

CR
11papers
539citations
Novelty35%
AI Score22

11 Papers

CRDec 14, 2020
The Design and Implementation of a Verified File System with End-to-End Data Integrity

Daniel W. Song, Konstantinos Mamouras, Ang Chen et al.

Despite significant research and engineering efforts, many of today's important computer systems suffer from bugs. To increase the reliability of software systems, recent work has applied formal verification to certify the correctness of such systems, with recent successes including certified file systems and certified cryptographic protocols, albeit using quite different proof tactics and toolchains. Unifying these concepts, we present the first certified file system that uses cryptographic primitives to protect itself against tampering. Our certified file system defends against adversaries that might wish to tamper with the raw disk. Such an "untrusted storage" threat model captures the behavior of storage devices that might silently return erroneous bits as well as adversaries who might have limited access to a disk, perhaps while in transit. In this paper, we present IFSCQ, a certified cryptographic file system with strong integrity guarantees. IFSCQ combines and extends work on cryptographic file systems and formally certified file systems to prove that our design is correct. It is the first certified file system that is secure against strong adversaries that can maliciously corrupt on-disk data and metadata, including attempting to roll back the disk to earlier versions of valid data. IFSCQ achieves this by constructing a Merkle hash tree of the whole disk, and by proving that tampered disk blocks will always be detected if they ever occur. We demonstrate that IFSCQ runs with reasonable overhead while detecting several kinds of attacks.

CRDec 12, 2019
Investigating the effectiveness of web adblockers

Clayton Drazner, Nikola Đuza, Hugo Jonker et al.

We investigate adblocking filters and the extent to which websites and advertisers react when their content is impacted by these filters. We collected data daily from the Alexa Top-5000 web sites for 120 days, and from specific sites that newly appeared in filter lists for 140 days. By evaluating how long a filter rule triggers on a website, we can gauge how long it remains effective. We matched websites with both a regular adblocking filter list (EasyList) and with a specialized filter list that targets anti-adblocking logic (Nano Defender). From our data, we observe that the effectiveness of the EasyList adblocking filter decays a modest 0.13\% per day, and after around 80 days seems to stabilize. We found no evidence for any significant decay in effectiveness of the more specialized, but less widely used, anti-adblocking removal filters.

CRAug 5, 2019
On the security of ballot marking devices

Dan S. Wallach

A recent debate among election experts has considered whether electronic ballot marking devices (BMDs) have adequate security against the risks of malware. A malicious BMD might produce a printed ballot that disagrees with a voter's actual intent, with the hope that voters would be unlikely to detect this subterfuge. This essay considers how an election administrator can create reasonable auditing procedures to gain confidence that their fleet of BMDs is operating correctly, allowing voters to benefit from the usability and accessibility features of BMDs while the overall election still benefits from the same security and reliability properties we expect from hand-marked paper ballots.

CRJul 26, 2017
Public Evidence from Secret Ballots

Matthew Bernhard, Josh Benaloh, J. Alex Halderman et al.

Elections seem simple---aren't they just counting? But they have a unique, challenging combination of security and privacy requirements. The stakes are high; the context is adversarial; the electorate needs to be convinced that the results are correct; and the secrecy of the ballot must be ensured. And they have practical constraints: time is of the essence, and voting systems need to be affordable and maintainable, and usable by voters, election officials, and pollworkers. It is thus not surprising that voting is a rich research area spanning theory, applied cryptography, practical systems analysis, usable security, and statistics. Election integrity involves two key concepts: convincing evidence that outcomes are correct and privacy, which amounts to convincing assurance that there is no evidence about how any given person voted. These are obviously in tension. We examine how current systems walk this tightrope.

CRMay 2, 2017
Verification of STAR-Vote and Evaluation of FDR and ProVerif

Murat Moran, Dan S. Wallach

We present the first automated privacy analysis of STAR-Vote, a real world voting system design with sophisticated "end-to-end" cryptography, using FDR and ProVerif. We also evaluate the effectiveness of these tools. Despite the complexity of the voting system, we were able to verify that our abstracted formal model of STAR-Vote provides ballot-secrecy using both formal approaches. Notably, ProVerif is radically faster than FDR, making it more suitable for rapid iteration and refinement of the formal model.

CRFeb 23, 2015
An Empirical Study of Mobile Ad Targeting

Theodore Book, Dan S. Wallach

Advertising, long the financial mainstay of the web ecosystem, has become nearly ubiquitous in the world of mobile apps. While ad targeting on the web is fairly well understood, mobile ad targeting is much less studied. In this paper, we use empirical methods to collect a database of over 225,000 ads on 32 simulated devices hosting one of three distinct user profiles. We then analyze how the ads are targeted by correlating ads to potential targeting profiles using Bayes' rule and Pearson's chi squared test. This enables us to measure the prevalence of different forms of targeting. We find that nearly all ads show the effects of application- and time-based targeting, while we are able to identify location-based targeting in 43% of the ads and user-based targeting in 39%.

CRMar 24, 2014
The Mason Test: A Defense Against Sybil Attacks in Wireless Networks Without Trusted Authorities

Yue Liu, David R. Bild, Robert P. Dick et al.

Wireless networks are vulnerable to Sybil attacks, in which a malicious node poses as many identities in order to gain disproportionate influence. Many defenses based on spatial variability of wireless channels exist, but depend either on detailed, multi-tap channel estimation - something not exposed on commodity 802.11 devices - or valid RSSI observations from multiple trusted sources, e.g., corporate access points - something not directly available in ad hoc and delay-tolerant networks with potentially malicious neighbors. We extend these techniques to be practical for wireless ad hoc networks of commodity 802.11 devices. Specifically, we propose two efficient methods for separating the valid RSSI observations of behaving nodes from those falsified by malicious participants. Further, we note that prior signalprint methods are easily defeated by mobile attackers and develop an appropriate challenge-response defense. Finally, we present the Mason test, the first implementation of these techniques for ad hoc and delay-tolerant networks of commodity 802.11 devices. We illustrate its performance in several real-world scenarios.

CRJul 23, 2013
A Case of Collusion: A Study of the Interface Between Ad Libraries and their Apps

Theodore Book, Dan S. Wallach

A growing concern with advertisement libraries on Android is their ability to exfiltrate personal information from their host applications. While previous work has looked at the libraries' abilities to measure private information on their own, advertising libraries also include APIs through which a host application can deliberately leak private information about the user. This study considers a corpus of 114,000 apps. We reconstruct the APIs for 103 ad libraries used in the corpus, and study how the privacy leaking APIs from the top 20 ad libraries are used by the applications. Notably, we have found that app popularity correlates with privacy leakage; the marginal increase in advertising revenue, multiplied over a larger user base, seems to incentivize these app vendors to violate their users' privacy.

CRMay 1, 2013
Automated generation of web server fingerprints

Theodore Book, Martha Witick, Dan S. Wallach

In this paper, we demonstrate that it is possible to automatically generate fingerprints for various web server types using multifactor Bayesian inference on randomly selected servers on the Internet, without building an a priori catalog of server features or behaviors. This makes it possible to conclusively study web server distribution without relying on reported (and variable) version strings. We gather data by sending a collection of specialized requests to 110,000 live web servers. Using only the server response codes, we then train an algorithm to successfully predict server types independently of the server version string. In the process, we note several distinguishing features of current web infrastructure.

CRMar 4, 2013
Longitudinal Analysis of Android Ad Library Permissions

Theodore Book, Adam Pridgen, Dan S. Wallach

This paper investigates changes over time in the behavior of Android ad libraries. Taking a sample of 100,000 apps, we extract and classify the ad libraries. By considering the release dates of the applications that use a specific ad library version, we estimate the release date for the library, and thus build a chronological map of the permissions used by various ad libraries over time. We find that the use of most permissions has increased over the last several years, and that more libraries are able to use permissions that pose particular risks to user privacy and security.

CYMar 4, 2013
The Velocity of Censorship: High-Fidelity Detection of Microblog Post Deletions

Tao Zhu, David Phipps, Adam Pridgen et al.

Weibo and other popular Chinese microblogging sites are well known for exercising internal censorship, to comply with Chinese government requirements. This research seeks to quantify the mechanisms of this censorship: how fast and how comprehensively posts are deleted.Our analysis considered 2.38 million posts gathered over roughly two months in 2012, with our attention focused on repeatedly visiting "sensitive" users. This gives us a view of censorship events within minutes of their occurrence, albeit at a cost of our data no longer representing a random sample of the general Weibo population. We also have a larger 470 million post sampling from Weibo's public timeline, taken over a longer time period, that is more representative of a random sample. We found that deletions happen most heavily in the first hour after a post has been submitted. Focusing on original posts, not reposts/retweets, we observed that nearly 30% of the total deletion events occur within 5- 30 minutes. Nearly 90% of the deletions happen within the first 24 hours. Leveraging our data, we also considered a variety of hypotheses about the mechanisms used by Weibo for censorship, such as the extent to which Weibo's censors use retrospective keyword-based censorship, and how repost/retweet popularity interacts with censorship. We also used natural language processing techniques to analyze which topics were more likely to be censored.