Jason Huang

2 Papers

LG · Jun 2, 2020
Designing Differentially Private Estimators in High Dimensions

Aditya Dhar, Jason Huang

We study differentially private mean estimation in a high-dimensional setting. Existing differential privacy techniques applied to large dimensions lead to computationally intractable problems or estimators with excessive privacy loss. Recent work in high-dimensional robust statistics has identified computationally tractable mean estimation algorithms with asymptotic dimension-independent error guarantees. We incorporate these results to develop a strict bound on the global sensitivity of the robust mean estimator. This yields a computationally tractable algorithm for differentially private mean estimation in high dimensions with dimension-independent privacy loss. Finally, we show on synthetic data that our algorithm significantly outperforms classic differential privacy methods, overcoming barriers to high-dimensional differential privacy.
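For context, the "classic differential privacy methods" the abstract compares against can be illustrated with a clipped mean plus Gaussian noise. The sketch below is not the authors' algorithm. It is a minimal baseline under stated assumptions: records are clipped to an L2 ball of a hypothetical `radius`, neighbouring datasets differ by replacing one record, and the noise scale follows the classic Gaussian mechanism, which requires 0 < epsilon < 1. All function and parameter names are illustrative.

```python
import math
import random

def clip_to_ball(x, radius):
    """Rescale x so that its L2 norm is at most radius."""
    norm = math.sqrt(sum(v * v for v in x))
    if norm <= radius:
        return list(x)
    scale = radius / norm
    return [v * scale for v in x]

def gaussian_sigma(l2_sensitivity, epsilon, delta):
    """Noise scale of the classic Gaussian mechanism (assumes 0 < epsilon < 1)."""
    return l2_sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon

def private_mean(data, radius, epsilon, delta, rng):
    """Clipped mean with independent Gaussian noise on every coordinate."""
    n, d = len(data), len(data[0])
    clipped = [clip_to_ball(x, radius) for x in data]
    mean = [sum(x[j] for x in clipped) / n for j in range(d)]
    # Replacing one record moves the clipped mean by at most 2 * radius / n.
    sigma = gaussian_sigma(2.0 * radius / n, epsilon, delta)
    return [m + rng.gauss(0.0, sigma) for m in mean]

if __name__ == "__main__":
    rng = random.Random(0)
    for d in (10, 1000):
        data = [[0.0] * d for _ in range(100)]  # true mean is the zero vector
        estimate = private_mean(data, radius=1.0, epsilon=0.5, delta=1e-5, rng=rng)
        error = math.sqrt(sum(v * v for v in estimate))
        print(d, error)
```

Because noise is added to each coordinate independently, the L2 norm of the error in this baseline grows roughly like the square root of the dimension for a fixed radius, which is the kind of dimension dependence the abstract describes as a barrier.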

CL · Apr 26, 2020
Causal Mediation Analysis for Interpreting Neural NLP: The Case of Gender Bias

Jesse Vig, Sebastian Gehrmann, Yonatan Belinkov et al.

Common methods for interpreting neural models in natural language processing typically examine either their structure or their behavior, but not both. We propose a methodology grounded in the theory of causal mediation analysis for interpreting which parts of a model are causally implicated in its behavior. It enables us to analyze the mechanisms by which information flows from input to output through various model components, known as mediators. We apply this methodology to analyze gender bias in pre-trained Transformer language models. We study the role of individual neurons and attention heads in mediating gender bias across three datasets designed to gauge a model's sensitivity to gender bias. Our mediation analysis reveals that gender bias effects are (i) sparse, concentrated in a small part of the network; (ii) synergistic, amplified or repressed by different components; and (iii) decomposable into effects flowing directly from the input and indirectly through the mediators.
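The direct/indirect decomposition mentioned in point (iii) can be illustrated on a toy model with a single mediator. This is a sketch of the general idea from causal mediation analysis, not the paper's code or its exact effect measures: the functions `mediator` and `output` are hypothetical stand-ins for a model component and the model's output, and effects are measured here as simple differences.

```python
def mediator(x):
    """Hypothetical intermediate component (e.g. one neuron) driven by the input."""
    return 2.0 * x

def output(x, m):
    """Hypothetical model output that depends on the input and on the mediator."""
    return 3.0 * x + 0.5 * m

def total_effect(x, x_new):
    """Change in output when the input changes and the mediator follows it."""
    return output(x_new, mediator(x_new)) - output(x, mediator(x))

def direct_effect(x, x_new):
    """Change the input but hold the mediator at its original value."""
    return output(x_new, mediator(x)) - output(x, mediator(x))

def indirect_effect(x, x_new):
    """Keep the input fixed but set the mediator to its value under the new input."""
    return output(x, mediator(x_new)) - output(x, mediator(x))

if __name__ == "__main__":
    print(total_effect(0.0, 1.0))     # 4.0
    print(direct_effect(0.0, 1.0))    # 3.0
    print(indirect_effect(0.0, 1.0))  # 1.0
```

In this linear toy model the direct and indirect effects add up to the total effect. For a nonlinear model such as a Transformer, the two need not sum exactly to the total.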