CRDec 10, 2018
The Effectiveness of Privacy Enhancing Technologies against FingerprintingAmit Datta, Jianan Lu, Michael Carl Tschantz
We measure how effective Privacy Enhancing Technologies (PETs) are at protecting users from website fingerprinting. Our measurements use both experimental and observational methods. Experimental methods allow control, precision, and use on new PETs that currently lack a user base. Observational methods enable scale and drawing from the browsers currently in real-world use. By applying experimentally created models of a PET's behavior to an observational data set, our novel hybrid method offers the best of both worlds. We find the Tor Browser Bundle to be the most effective PET amongst the set we tested. We find that some PETs have inconsistent behaviors, which can do more harm than good.
CRAug 27, 2014
Automated Experiments on Ad Privacy Settings: A Tale of Opacity, Choice, and DiscriminationAmit Datta, Michael Carl Tschantz, Anupam Datta
To partly address people's concerns over web tracking, Google has created the Ad Settings webpage to provide information about and some choice over the profiles Google creates on users. We present AdFisher, an automated tool that explores how user behaviors, Google's ads, and Ad Settings interact. AdFisher can run browser-based experiments and analyze data using machine learning and significance tests. Our tool uses a rigorous experimental design and statistical analysis to ensure the statistical soundness of our results. We use AdFisher to find that the Ad Settings was opaque about some features of a user's profile, that it does provide some choice on ads, and that these choices can lead to seemingly discriminatory ads. In particular, we found that visiting webpages associated with substance abuse changed the ads shown but not the settings page. We also found that setting the gender to female resulted in getting fewer instances of an ad related to high paying jobs than setting it to male. We cannot determine who caused these findings due to our limited visibility into the ad ecosystem, which includes Google, advertisers, websites, and users. Nevertheless, these results can form the starting point for deeper investigations by either the companies themselves or by regulatory bodies.
CRMay 10, 2014
A Methodology for Information Flow ExperimentsMichael Carl Tschantz, Amit Datta, Anupam Datta et al.
Information flow analysis has largely ignored the setting where the analyst has neither control over nor a complete model of the analyzed system. We formalize such limited information flow analyses and study an instance of it: detecting the usage of data by websites. We prove that these problems are ones of causal inference. Leveraging this connection, we push beyond traditional information flow analysis to provide a systematic methodology based on experimental science and statistical analysis. Our methodology allows us to systematize prior works in the area viewing them as instances of a general approach. Our systematic study leads to practical advice for improving work on detecting data usage, a previously unformalized area. We illustrate these concepts with a series of experiments collecting data on the use of information by websites, which we statistically analyze.