ME · Oct 4, 2019
Donor's Deferral and Return Behavior: Partial Identification from a Regression Discontinuity Design with Manipulation
Evan Rosenman, Karthik Rajkumar, Romain Gauriot et al.
Volunteer labor can temporarily yield lower benefits to charities than its costs. In such instances, organizations may wish to defer volunteer donations to a later date. Exploiting a discontinuity in the eligibility criteria for blood donation, we show that deferring donors reduces their future volunteerism. In our setting, medical staff manipulate donors' reported hemoglobin levels over a threshold to facilitate donation. Such manipulation invalidates the standard regression discontinuity design. To circumvent this issue, we propose a procedure for obtaining partial identification bounds when manipulation is present. Our procedure is applicable in various regression discontinuity settings where the running variable is manipulated.
ML · Jul 21, 2019
Some New Results for Poisson Binomial Models
Evan Rosenman
We consider a problem of ecological inference, in which individual-level covariates are known, but labeled data is available only at the aggregate level. The intended application is modeling voter preferences in elections. In Rosenman and Viswanathan (2018), we proposed modeling individual voter probabilities via a logistic regression and posing the problem as maximum likelihood estimation of the parameter vector beta. The likelihood is a Poisson binomial, the distribution of the sum of independent but not identically distributed Bernoulli variables, though we approximate it with a heteroscedastic Gaussian for computational efficiency. Here, we extend the prior work by proving results about the existence of the MLE and the curvature of this likelihood, which is not log-concave in general. We further demonstrate the utility of our method on a real data example. Using data on voters in Morris County, NJ, we demonstrate that our approach outperforms other ecological inference methods in predicting a related but known outcome: whether an individual votes.
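For readers unfamiliar with the approximation mentioned in this abstract, the block below sketches one way to write the heteroscedastic Gaussian log-likelihood. The notation is an assumption made here for illustration and is not taken from the paper: aggregate units are indexed by j, the individuals in unit j form the set I_j with covariates x_i, and D_j is the observed count in unit j. The mean and variance are the standard moments of a Poisson binomial.

```latex
% Assumed notation (not from the paper): units j = 1..J, individuals i in I_j,
% covariates x_i, observed aggregate count D_j.
\begin{align*}
  p_i(\beta) &= \frac{1}{1 + \exp(-x_i^\top \beta)}, \\
  \mu_j(\beta) &= \sum_{i \in I_j} p_i(\beta), \qquad
  \sigma_j^2(\beta) = \sum_{i \in I_j} p_i(\beta)\bigl(1 - p_i(\beta)\bigr), \\
  \ell(\beta) &\approx -\frac{1}{2} \sum_{j=1}^{J}
    \left[ \log\bigl(2\pi \sigma_j^2(\beta)\bigr)
    + \frac{\bigl(D_j - \mu_j(\beta)\bigr)^2}{\sigma_j^2(\beta)} \right].
\end{align*}
```

Because beta enters through both the mean and the variance, the log-variance term makes it plausible that this objective is not concave in general, which is consistent with the curvature remark in the abstract.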
ML · Feb 4, 2018
Using Poisson Binomial GLMs to Reveal Voter Preferences
Evan Rosenman, Nitin Viswanathan
We present a new modeling technique for solving the problem of ecological inference, in which individual-level associations are inferred from labeled data available only at the aggregate level. We model aggregate count data as arising from the Poisson binomial, the distribution of the sum of independent but not identically distributed Bernoulli random variables. We relate individual-level probabilities to individual covariates using both a logistic regression and a neural network. A normal approximation is derived via the Lyapunov Central Limit Theorem, allowing us to efficiently fit these models on large datasets. We apply this technique to the problem of revealing voter preferences in the 2016 presidential election, fitting a model to a sample of over four million voters from the highly contested swing state of Pennsylvania. We validate the model at the precinct level via a holdout set, and at the individual level using weak labels, finding that the model is predictive and learns intuitively reasonable associations.
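To make the normal approximation concrete, the following is a minimal sketch, not the authors' implementation, that computes the exact Poisson binomial PMF by repeated convolution and compares it with a Gaussian density using the matching mean and variance. The function names, the simulated probabilities, and the sample size of 500 are all hypothetical choices made for illustration.

```python
import numpy as np


def poisson_binomial_pmf(p):
    """Exact PMF of the sum of independent Bernoulli(p_i) variables.

    Built by convolving the two-point distributions one at a time,
    which costs O(n^2) for n individuals.
    """
    pmf = np.array([1.0])
    for pi in p:
        pmf = np.convolve(pmf, [1.0 - pi, pi])
    return pmf


def normal_approx_logpdf(k, p):
    """Gaussian log-density with the Poisson binomial's mean and variance."""
    mu = p.sum()
    var = (p * (1.0 - p)).sum()
    return -0.5 * np.log(2.0 * np.pi * var) - (k - mu) ** 2 / (2.0 * var)


# Hypothetical individual-level probabilities for one aggregate unit.
rng = np.random.default_rng(0)
p = rng.uniform(0.2, 0.8, size=500)

pmf = poisson_binomial_pmf(p)
k = np.arange(len(pmf))
approx = np.exp(normal_approx_logpdf(k, p))

# Largest pointwise gap between the exact PMF and the Gaussian density.
max_abs_error = np.max(np.abs(pmf - approx))
print(f"max |exact - normal| = {max_abs_error:.2e}")
```

The exact computation is quadratic in the number of individuals per unit, whereas the Gaussian form needs only two sums, which is the kind of saving that matters when fitting to millions of voters.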