Julien Freudiger

CR
6papers
737citations
Novelty41%
AI Score24

6 Papers

LGFeb 16, 2021
Federated Evaluation and Tuning for On-Device Personalization: System Design & Applications

Matthias Paulik, Matt Seigel, Henry Mason et al.

We describe the design of our federated task processing system. Originally, the system was created to support two specific federated tasks: evaluation and tuning of on-device ML systems, primarily for the purpose of personalizing these systems. In recent years, support for an additional federated task has been added: federated learning (FL) of deep neural networks. To our knowledge, only one other system has been described in literature that supports FL at scale. We include comparisons to that system to help discuss design decisions and attached trade-offs. Finally, we describe two specific large scale personalization use cases in detail to showcase the applicability of federated tuning to on-device personalization and to highlight application specific solutions.

MLDec 3, 2018
Protection Against Reconstruction and Its Applications in Private Federated Learning

Abhishek Bhowmick, John Duchi, Julien Freudiger et al.

In large-scale statistical learning, data collection and model fitting are moving increasingly toward peripheral devices---phones, watches, fitness trackers---away from centralized data collection. Concomitant with this rise in decentralized data are increasing challenges of maintaining privacy while allowing enough information to fit accurate, useful statistical models. This motivates local notions of privacy---most significantly, local differential privacy, which provides strong protections against sensitive data disclosures---where data is obfuscated before a statistician or learner can even observe it, providing strong protections to individuals' data. Yet local privacy as traditionally employed may prove too stringent for practical use, especially in modern high-dimensional statistical and machine learning problems. Consequently, we revisit the types of disclosures and adversaries against which we provide protections, considering adversaries with limited prior information and ensuring that with high probability, ensuring they cannot reconstruct an individual's data within useful tolerances. By reconceptualizing these protections, we allow more useful data release---large privacy parameters in local differential privacy---and we design new (minimax) optimal locally differentially private mechanisms for statistical learning problems for \emph{all} privacy levels. We thus present practicable approaches to large-scale locally private model training that were previously impossible, showing theoretically and empirically that we can fit large-scale image classification and language models with little degradation in utility.

CRFeb 18, 2015
Controlled Data Sharing for Collaborative Predictive Blacklisting

Julien Freudiger, Emiliano De Cristofaro, Alex Brito

Although sharing data across organizations is often advocated as a promising way to enhance cybersecurity, collaborative initiatives are rarely put into practice owing to confidentiality, trust, and liability challenges. In this paper, we investigate whether collaborative threat mitigation can be realized via a controlled data sharing approach, whereby organizations make informed decisions as to whether or not, and how much, to share. Using appropriate cryptographic tools, entities can estimate the benefits of collaboration and agree on what to share in a privacy-preserving way, without having to disclose their datasets. We focus on collaborative predictive blacklisting, i.e., forecasting attack sources based on one's logs and those contributed by other organizations. We study the impact of different sharing strategies by experimenting on a real-world dataset of two billion suspicious IP addresses collected from Dshield over two months. We find that controlled data sharing yields up to 105% accuracy improvement on average, while also reducing the false positive rate.

CRMay 6, 2014
What's the Gist? Privacy-Preserving Aggregation of User Profiles

Igor Bilogrevic, Julien Freudiger, Emiliano De Cristofaro et al.

Over the past few years, online service providers have started gathering increasing amounts of personal information to build user profiles and monetize them with advertisers and data brokers. Users have little control of what information is processed and are often left with an all-or-nothing decision between receiving free services or refusing to be profiled. This paper explores an alternative approach where users only disclose an aggregate model -- the "gist" -- of their data. We aim to preserve data utility and simultaneously provide user privacy. We show that this approach can be efficiently supported by letting users contribute encrypted and differentially-private data to an aggregator. The aggregator combines encrypted contributions and can only extract an aggregate model of the underlying data. We evaluate our framework on a dataset of 100,000 U.S. users obtained from the U.S. Census Bureau and show that (i) it provides accurate aggregates with as little as 100 users, (ii) it generates revenue for both users and data brokers, and (iii) its overhead is appreciably low.

CRMar 10, 2014
Privacy-Friendly Collaboration for Cyber Threat Mitigation

Julien Freudiger, Emiliano De Cristofaro, Alex Brito

Sharing of security data across organizational boundaries has often been advocated as a promising way to enhance cyber threat mitigation. However, collaborative security faces a number of important challenges, including privacy, trust, and liability concerns with the potential disclosure of sensitive data. In this paper, we focus on data sharing for predictive blacklisting, i.e., forecasting attack sources based on past attack information. We propose a novel privacy-enhanced data sharing approach in which organizations estimate collaboration benefits without disclosing their datasets, organize into coalitions of allied organizations, and securely share data within these coalitions. We study how different partner selection strategies affect prediction accuracy by experimenting on a real-world dataset of 2 billion IP addresses and observe up to a 105% prediction improvement.

CRSep 20, 2013
A Comparative Usability Study of Two-Factor Authentication

Emiliano De Cristofaro, Honglu Du, Julien Freudiger et al.

Two-factor authentication (2F) aims to enhance resilience of password-based authentication by requiring users to provide an additional authentication factor, e.g., a code generated by a security token. However, it also introduces non-negligible costs for service providers and requires users to carry out additional actions during the authentication process. In this paper, we present an exploratory comparative study of the usability of 2F technologies. First, we conduct a pre-study interview to identify popular technologies as well as contexts and motivations in which they are used. We then present the results of a quantitative study based on a survey completed by 219 Mechanical Turk users, aiming to measure the usability of three popular 2F solutions: codes generated by security tokens, one-time PINs received via email or SMS, and dedicated smartphone apps (e.g., Google Authenticator). We record contexts and motivations, and study their impact on perceived usability. We find that 2F technologies are overall perceived as usable, regardless of motivation and/or context of use. We also present an exploratory factor analysis, highlighting that three metrics -- ease-of-use, required cognitive efforts, and trustworthiness -- are enough to capture key factors affecting 2F usability.