Serena Wang

h-index14

7papers

193citations

Novelty49%

AI Score34

Ranked #112,734 of 194,257 authors (top 58%)#24,802 in LG (top 62%)

7 Papers

9.6LGJun 13, 2022

Robust Distillation for Worst-class Performance

Serena Wang, Harikrishna Narasimhan, Yichen Zhou et al.

Knowledge distillation has proven to be an effective technique in improving the performance a student model using predictions from a teacher model. However, recent work has shown that gains in average efficiency are not uniform across subgroups in the data, and in particular can often come at the cost of accuracy on rare subgroups and classes. To preserve strong performance across classes that may follow a long-tailed distribution, we develop distillation techniques that are tailored to improve the student's worst-class performance. Specifically, we introduce robust optimization objectives in different combinations for the teacher and student, and further allow for training with any tradeoff between the overall accuracy and the robust worst-class objective. We show empirically that our robust distillation techniques not only achieve better worst-class performance, but also lead to Pareto improvement in the tradeoff between overall performance and worst-class performance compared to other baseline methods. Theoretically, we provide insights into what makes a good teacher when the goal is to train a robust student.

4.5AISep 8, 2022

Lost in Translation: Reimagining the Machine Learning Life Cycle in Education

Lydia T. Liu, Serena Wang, Tolani Britton et al.

Machine learning (ML) techniques are increasingly prevalent in education, from their use in predicting student dropout, to assisting in university admissions, and facilitating the rise of MOOCs. Given the rapid growth of these novel uses, there is a pressing need to investigate how ML techniques support long-standing education principles and goals. In this work, we shed light on this complex landscape drawing on qualitative insights from interviews with education experts. These interviews comprise in-depth evaluations of ML for education (ML4Ed) papers published in preeminent applied ML conferences over the past decade. Our central research goal is to critically examine how the stated or implied education and societal objectives of these papers are aligned with the ML problems they tackle. That is, to what extent does the technical problem formulation, objectives, approach, and interpretation of results align with the education problem at hand. We find that a cross-disciplinary gap exists and is particularly salient in two parts of the ML life cycle: the formulation of an ML problem from education goals and the translation of predictions to interventions. We use these insights to propose an extended ML life cycle, which may also apply to the use of ML in other domains. Our work joins a growing number of meta-analytical studies across education and ML research, as well as critical analyses of the societal impact of ML. Specifically, it fills a gap between the prevailing technical understanding of machine learning and the perspective of education researchers working with students and in policy.

2.3GTMay 31, 2025

The Disparate Effects of Partial Information in Bayesian Strategic Learning

Srikanth Avasarala, Serena Wang, Juba Ziani

We study how partial information about scoring rules affects fairness in strategic learning settings. In strategic learning, a learner deploys a scoring rule, and agents respond strategically by modifying their features -- at some cost -- to improve their outcomes. However, in our work, agents do not observe the scoring rule directly; instead, they receive a noisy signal of said rule. We consider two different agent models: (i) naive agents, who take the noisy signal at face value, and (ii) Bayesian agents, who update a prior belief based on the signal. Our goal is to understand how disparities in outcomes arise between groups that differ in their costs of feature modification, and how these disparities vary with the level of transparency of the learner's rule. For naive agents, we show that utility disparities can grow unboundedly with noise, and that the group with lower costs can, perhaps counter-intuitively, be disproportionately harmed under limited transparency. In contrast, for Bayesian agents, disparities remain bounded. We provide a full characterization of disparities across groups as a function of the level of transparency and show that they can vary non-monotonically with noise; in particular, disparities are often minimized at intermediate levels of transparency. Finally, we extend our analysis to settings where groups differ not only in cost, but also in prior beliefs, and study how this asymmetry influences fairness.

5.3LGMay 24, 2023

Operationalizing Counterfactual Metrics: Incentives, Ranking, and Information Asymmetry

Serena Wang, Stephen Bates, P. M. Aronow et al.

From the social sciences to machine learning, it has been well documented that metrics to be optimized are not always aligned with social welfare. In healthcare, Dranove et al. (2003) showed that publishing surgery mortality metrics actually harmed the welfare of sicker patients by increasing provider selection behavior. We analyze the incentive misalignments that arise from such average treated outcome metrics, and show that the incentives driving treatment decisions would align with maximizing total patient welfare if the metrics (i) accounted for counterfactual untreated outcomes and (ii) considered total welfare instead of averaging over treated patients. Operationalizing this, we show how counterfactual metrics can be modified to behave reasonably in patient-facing ranking systems. Extending to realistic settings when providers observe more about patients than the regulatory agencies do, we bound the decay in performance by the degree of information asymmetry between principal and agent. In doing so, our model connects principal-agent information asymmetry with unobserved heterogeneity in causal inference.

13.1LGMar 30, 2021

Multi-Source Causal Inference Using Control Variates

Wenshuo Guo, Serena Wang, Peng Ding et al.

While many areas of machine learning have benefited from the increasing availability of large and varied datasets, the benefit to causal inference has been limited given the strong assumptions needed to ensure identifiability of causal effects; these are often not satisfied in real-world datasets. For example, many large observational datasets (e.g., case-control studies in epidemiology, click-through data in recommender systems) suffer from selection bias on the outcome, which makes the average treatment effect (ATE) unidentifiable. We propose a general algorithm to estimate causal effects from \emph{multiple} data sources, where the ATE may be identifiable only in some datasets but not others. The key idea is to construct control variates using the datasets in which the ATE is not identifiable. We show theoretically that this reduces the variance of the ATE estimate. We apply this framework to inference from observational data under outcome selection bias, assuming access to an auxiliary small dataset from which we can obtain a consistent estimate of the ATE. We construct a control variate by taking the difference of the odds ratio estimates from the two datasets. Across simulations and two case studies with real data, we show that this control variate can significantly reduce the variance of the ATE estimate.

28.7LGFeb 21, 2020Code

Robust Optimization for Fairness with Noisy Protected Groups

Serena Wang, Wenshuo Guo, Harikrishna Narasimhan et al.

Many existing fairness criteria for machine learning involve equalizing some metric across protected groups such as race or gender. However, practitioners trying to audit or enforce such group-based criteria can easily face the problem of noisy or biased protected group information. First, we study the consequences of naively relying on noisy protected group labels: we provide an upper bound on the fairness violations on the true groups G when the fairness criteria are satisfied on noisy groups $\hat{G}$. Second, we introduce two new approaches using robust optimization that, unlike the naive approach of only relying on $\hat{G}$, are guaranteed to satisfy fairness criteria on the true protected groups G while minimizing a training objective. We provide theoretical guarantees that one such approach converges to an optimal feasible solution. Using two case studies, we show empirically that the robust approaches achieve better true group fairness guarantees than the naive approach.

10.1LGJan 31, 2020Code

Deontological Ethics By Monotonicity Shape Constraints

Serena Wang, Maya Gupta

We demonstrate how easy it is for modern machine-learned systems to violate common deontological ethical principles and social norms such as "favor the less fortunate," and "do not penalize good attributes." We propose that in some cases such ethical principles can be incorporated into a machine-learned model by adding shape constraints that constrain the model to respond only positively to relevant inputs. We analyze the relationship between these deontological constraints that act on individuals and the consequentialist group-based fairness goals of one-sided statistical parity and equal opportunity. This strategy works with sensitive attributes that are Boolean or real-valued such as income and age, and can help produce more responsible and trustworthy AI.