Theodore Book

CR
4 papers
265 citations

4 Papers

CR, Feb 23, 2015
An Empirical Study of Mobile Ad Targeting

Theodore Book, Dan S. Wallach

Advertising, long the financial mainstay of the web ecosystem, has become nearly ubiquitous in the world of mobile apps. While ad targeting on the web is fairly well understood, mobile ad targeting is much less studied. In this paper, we use empirical methods to collect a database of over 225,000 ads on 32 simulated devices hosting one of three distinct user profiles. We then analyze how the ads are targeted by correlating ads to potential targeting profiles using Bayes' rule and Pearson's chi-squared test. This enables us to measure the prevalence of different forms of targeting. We find that nearly all ads show the effects of application- and time-based targeting, while we are able to identify location-based targeting in 43% of the ads and user-based targeting in 39%.
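As a rough illustration of the chi-squared step mentioned in the abstract above, the sketch below computes Pearson's chi-squared statistic for a contingency table of ad impressions. The table layout (rows are user profiles, columns are "ad shown" and "ad not shown") and the function name are assumptions made for this example; the abstract does not describe the paper's actual data layout or code.

```python
def chi_squared_statistic(table):
    """Pearson's chi-squared statistic for a contingency table.

    `table` is a list of rows of observed counts. In this hypothetical
    layout, each row is a user profile and the columns count how often
    a given ad was shown versus not shown to that profile.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if profile and ad appearance were independent.
            expected = row_totals[i] * col_totals[j] / total
            if expected == 0:
                continue  # an empty row or column contributes nothing
            stat += (observed - expected) ** 2 / expected
    return stat


# Illustrative counts only: an ad shown 30 of 40 times to one profile
# and 10 of 40 times to another.
observed = [[30, 10], [10, 30]]
statistic = chi_squared_statistic(observed)
```

For a 2x2 table there is one degree of freedom, so a statistic above roughly 3.84 would reject independence at the 5% level, which is the sense in which an ad could be called targeted at a profile.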

CR, Jul 23, 2013
A Case of Collusion: A Study of the Interface Between Ad Libraries and their Apps

Theodore Book, Dan S. Wallach

A growing concern with advertisement libraries on Android is their ability to exfiltrate personal information from their host applications. While previous work has looked at the libraries' abilities to measure private information on their own, advertising libraries also include APIs through which a host application can deliberately leak private information about the user. This study considers a corpus of 114,000 apps. We reconstruct the APIs for 103 ad libraries used in the corpus, and study how the privacy-leaking APIs from the top 20 ad libraries are used by the applications. Notably, we have found that app popularity correlates with privacy leakage; the marginal increase in advertising revenue, multiplied over a larger user base, seems to incentivize these app vendors to violate their users' privacy.

CR, May 1, 2013
Automated generation of web server fingerprints

Theodore Book, Martha Witick, Dan S. Wallach

In this paper, we demonstrate that it is possible to automatically generate fingerprints for various web server types using multifactor Bayesian inference on randomly selected servers on the Internet, without building an a priori catalog of server features or behaviors. This makes it possible to conclusively study web server distribution without relying on reported (and variable) version strings. We gather data by sending a collection of specialized requests to 110,000 live web servers. Using only the server response codes, we then train an algorithm to successfully predict server types independently of the server version string. In the process, we note several distinguishing features of current web infrastructure.
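To make the idea of predicting a server type from response codes concrete, here is a minimal sketch using a naive Bayes classifier with Laplace smoothing. This is a stand-in technique chosen for illustration, not the paper's "multifactor Bayesian inference" method, whose details the abstract does not give. The probe ordering, the server labels, and the response codes in the sample data are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def train(samples, alpha=1.0):
    """Count response codes per (server type, probe index).

    `samples` is a list of (server_type, codes) pairs, where `codes` is a
    tuple holding one HTTP response code per probe request.
    """
    class_counts = Counter()
    code_counts = defaultdict(Counter)  # (server_type, probe) -> code counts
    seen_codes = defaultdict(set)       # probe -> codes seen in training
    for server_type, codes in samples:
        class_counts[server_type] += 1
        for probe, code in enumerate(codes):
            code_counts[(server_type, probe)][code] += 1
            seen_codes[probe].add(code)
    return class_counts, code_counts, seen_codes, alpha


def predict(model, codes):
    """Return the server type with the highest posterior log-probability."""
    class_counts, code_counts, seen_codes, alpha = model
    total = sum(class_counts.values())
    best_type, best_score = None, -math.inf
    for server_type, n in class_counts.items():
        score = math.log(n / total)  # log prior
        for probe, code in enumerate(codes):
            count = code_counts[(server_type, probe)][code]
            vocab = len(seen_codes[probe]) + 1  # +1 slot for unseen codes
            score += math.log((count + alpha) / (n + alpha * vocab))
        if score > best_score:
            best_type, best_score = server_type, score
    return best_type


# Hypothetical training data: three probes per server.
training = [
    ("server_a", (200, 404, 400)),
    ("server_a", (200, 404, 400)),
    ("server_b", (200, 404, 405)),
    ("server_b", (200, 403, 405)),
]
model = train(training)
guess = predict(model, (200, 404, 405))
```

The classifier never looks at a version string; it relies only on how each server type tends to answer each probe, which is the property the abstract emphasizes.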

CR, Mar 4, 2013
Longitudinal Analysis of Android Ad Library Permissions

Theodore Book, Adam Pridgen, Dan S. Wallach

This paper investigates changes over time in the behavior of Android ad libraries. Taking a sample of 100,000 apps, we extract and classify the ad libraries. By considering the release dates of the applications that use a specific ad library version, we estimate the release date for the library, and thus build a chronological map of the permissions used by various ad libraries over time. We find that the use of most permissions has increased over the last several years, and that more libraries are able to use permissions that pose particular risks to user privacy and security.
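The date-estimation step described in the abstract above could be sketched as follows. The estimator used here, taking the earliest release date among apps that bundle a given library version, is an assumption for illustration; the abstract does not state which estimator the authors used. The record layout and the library names are likewise hypothetical.

```python
from datetime import date


def estimate_library_release_dates(apps):
    """Estimate a release date for each (library, version) pair.

    `apps` is an iterable of (app_release_date, library, version) tuples.
    A library version cannot be newer than an app that ships it, so the
    earliest app release date serves as an upper bound on (and, in this
    sketch, the estimate of) the library version's release date.
    """
    earliest = {}
    for released, library, version in apps:
        key = (library, version)
        if key not in earliest or released < earliest[key]:
            earliest[key] = released
    return earliest


# Hypothetical observations extracted from an app corpus.
observations = [
    (date(2012, 5, 1), "example_ads", "2.0"),
    (date(2011, 11, 15), "example_ads", "2.0"),
    (date(2012, 8, 30), "example_ads", "3.1"),
]
estimates = estimate_library_release_dates(observations)
```

Sorting the estimated dates per library then gives a timeline onto which each version's permissions can be placed.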