Understanding the 2016 US Presidential Election using ecological inference and distribution regression with census microdata
This work offers political scientists and analysts insight into election dynamics, but its contribution is incremental: it applies existing methods to new data.
The researchers study demographic voting patterns in the 2016 US presidential election by combining census microdata with election results. They use ecological inference with distribution regression to estimate support for Trump, Clinton, and the remaining outcome (other candidates or not voting) across novel demographic groups, and they identify which census variables predict voting behavior.
We combine fine-grained, spatially referenced census data with the vote outcomes from the 2016 US presidential election. Using this dataset, we perform ecological inference using distribution regression (Flaxman et al., KDD 2015) with a multinomial-logit regression to model the vote outcome (Trump, Clinton, or Other / Didn't vote) as a function of demographic and socioeconomic features. Ecological inference allows us to estimate "exit poll" style results, such as Trump's support among white women, but for entirely novel categories. We also perform exploratory data analysis to understand which census variables are predictive of voting for Trump, voting for Clinton, or not voting for either. All of our methods are implemented in Python and R, and are available online for replication.
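
The pipeline described above can be illustrated with a minimal sketch. It is not the authors' released code. It assumes synthetic stand-in data in place of census microdata, random Fourier features as one possible way to embed each region's population as a mean feature vector, and plain gradient descent for the multinomial-logit fit. All names (random_fourier_features, fit_multinomial_logit, the region sizes, and the subgroup rule) are hypothetical and chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_fourier_features(X, W, b):
    """Map rows of X to an approximate RBF-kernel feature space."""
    return np.sqrt(2.0 / W.shape[1]) * np.cos(X @ W + b)

def softmax(Z):
    Z = Z - Z.max(axis=1, keepdims=True)  # stabilise the exponentials
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

def fit_multinomial_logit(M, Y, lr=0.5, steps=2000, l2=1e-3):
    """Gradient descent on cross-entropy between shares Y and softmax(M @ B)."""
    B = np.zeros((M.shape[1], Y.shape[1]))
    for _ in range(steps):
        P = softmax(M @ B)
        B -= lr * (M.T @ (P - Y) / len(M) + l2 * B)
    return B

# Synthetic stand-in for microdata: each region is a sample of individuals
# described by n_raw numeric covariates (assumed sizes, not from the paper).
n_regions, n_people, n_raw, n_feat = 40, 200, 3, 50
W = rng.normal(size=(n_raw, n_feat))
b = rng.uniform(0.0, 2.0 * np.pi, size=n_feat)
regions = [
    rng.normal(loc=rng.normal(size=n_raw), size=(n_people, n_raw))
    for _ in range(n_regions)
]

# Step 1: summarise each region by the mean of its individuals' features,
# then append an intercept column.
M = np.stack([random_fourier_features(X, W, b).mean(axis=0) for X in regions])
M = np.hstack([M, np.ones((n_regions, 1))])

# Step 2: regress regional vote shares on the mean embeddings.
# Columns stand for Trump, Clinton, Other / Didn't vote (synthetic shares).
Y = rng.dirichlet(alpha=[2.0, 2.0, 1.0], size=n_regions)
B = fit_multinomial_logit(M, Y)

# Step 3: ecological inference for a subgroup. Embed only the subgroup's
# members within a region and apply the fitted model to that embedding.
X0 = regions[0]
subgroup = X0[X0[:, 0] > np.median(X0[:, 0])]  # hypothetical subgroup rule
m_sub = np.append(random_fourier_features(subgroup, W, b).mean(axis=0), 1.0)
shares = softmax(m_sub[None, :] @ B)[0]
print("Estimated subgroup shares:", shares)
```

Because the regression is fit on region-level embeddings but applied to a subgroup-level embedding, the subgroup estimate depends on the assumption that the learned relationship carries over from regions to subgroups. The synthetic shares here are random, so the printed numbers carry no substantive meaning.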