ME May 6, 2020
Group Heterogeneity Assessment for Multilevel Models
Topi Paananen, Alejandro Catalina, Paul-Christian Bürkner et al.
Many data sets contain an inherent multilevel structure, for example, because of repeated measurements of the same observational units. Taking this structure into account is critical for the accuracy and calibration of any statistical analysis performed on such data. However, the large number of possible model configurations hinders the use of multilevel models in practice. In this work, we propose a flexible framework for efficiently assessing differences between the levels of given grouping variables in the data. The assessed group heterogeneity is valuable in choosing the relevant group coefficients to consider in a multilevel model. Our empirical evaluations demonstrate that the framework can reliably identify relevant multilevel components in both simulated and real data sets.
ME Oct 17, 2019
Uncertainty-aware Sensitivity Analysis Using Rényi Divergences
Topi Paananen, Michael Riis Andersen, Aki Vehtari
For nonlinear supervised learning models, assessing the importance of predictor variables or their interactions is not straightforward because it can vary across the domain of the variables. Importance can be assessed locally with sensitivity analysis using general methods that rely on the model's predictions or their derivatives. In this work, we extend derivative-based sensitivity analysis to a Bayesian setting by differentiating the Rényi divergence of a model's predictive distribution. By utilising the predictive distribution instead of a point prediction, the model uncertainty is taken into account in a principled way. Our empirical results on simulated and real data sets demonstrate accurate and reliable identification of important variables and interaction effects compared to alternative methods.
CO Jun 20, 2019
Implicitly Adaptive Importance Sampling
Topi Paananen, Juho Piironen, Paul-Christian Bürkner et al.
Adaptive importance sampling is a class of techniques for finding good proposal distributions for importance sampling. Often the proposal distributions are standard probability distributions whose parameters are adapted based on the mismatch between the current proposal and a target distribution. In this work, we present an implicit adaptive importance sampling method that applies to complicated distributions which are not available in closed form. The method iteratively matches the moments of a set of Monte Carlo draws to weighted moments based on importance weights. We apply the method to Bayesian leave-one-out cross-validation and show that it performs better than many existing parametric adaptive importance sampling methods while being computationally inexpensive.
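The moment-matching idea described in this abstract can be illustrated with a toy one-dimensional sketch. This is not the authors' implementation: it performs a single affine adaptation step (matching mean and variance only), and the Gaussian target and proposal, the sample size, and all function names (`log_normal_pdf`, `effective_sample_size`, `moment_match_step`) are assumptions chosen for illustration.

```python
import numpy as np


def log_normal_pdf(x, mean, sd):
    """Log density of a univariate normal distribution."""
    return -0.5 * np.log(2 * np.pi) - np.log(sd) - 0.5 * ((x - mean) / sd) ** 2


def effective_sample_size(log_w):
    """Effective sample size of self-normalized importance weights."""
    w = np.exp(log_w - np.max(log_w))
    w /= w.sum()
    return 1.0 / np.sum(w ** 2)


def moment_match_step(draws, log_target, log_proposal):
    """One affine step: move the draws so that their (unweighted) mean and
    variance equal the importance-weighted mean and variance."""
    log_w = log_target(draws) - log_proposal(draws)
    w = np.exp(log_w - np.max(log_w))
    w /= w.sum()

    w_mean = np.sum(w * draws)
    w_var = np.sum(w * (draws - w_mean) ** 2)

    m = draws.mean()
    scale = np.sqrt(w_var / draws.var())
    new_draws = scale * (draws - m) + w_mean

    # The transformed draws follow an implicit proposal whose density is
    # obtained by the change-of-variables formula for the affine map.
    def new_log_proposal(x):
        return log_proposal((x - w_mean) / scale + m) - np.log(scale)

    return new_draws, new_log_proposal


# Hypothetical toy setup: proposal N(0, 1), target N(2, 0.5^2).
rng = np.random.default_rng(0)
draws = rng.normal(0.0, 1.0, size=4000)


def log_target(x):
    return log_normal_pdf(x, 2.0, 0.5)


def log_proposal(x):
    return log_normal_pdf(x, 0.0, 1.0)


ess_before = effective_sample_size(log_target(draws) - log_proposal(draws))
new_draws, new_log_proposal = moment_match_step(draws, log_target, log_proposal)
ess_after = effective_sample_size(log_target(new_draws) - new_log_proposal(new_draws))
print(f"ESS before: {ess_before:.0f}, ESS after: {ess_after:.0f}")
```

In this toy case the target is Gaussian, so a single mean-and-variance match already brings the implicit proposal close to it. For the complicated distributions the abstract refers to, the step would presumably need to be repeated and monitored with a diagnostic.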
ME Dec 21, 2017
Variable selection for Gaussian processes via sensitivity analysis of the posterior predictive distribution
Topi Paananen, Juho Piironen, Michael Riis Andersen et al.
Variable selection for Gaussian process models is often done using automatic relevance determination, which uses the inverse length-scale parameter of each input variable as a proxy for variable relevance. This implicitly determined relevance has several drawbacks that prevent the selection of optimal input variables in terms of predictive performance. To improve on this, we propose two novel variable selection methods for Gaussian process models that utilize the predictions of a full model in the vicinity of the training points and thereby rank the variables based on their predictive relevance. Our empirical results on synthetic and real-world data sets demonstrate improved variable selection compared to automatic relevance determination in terms of variability and predictive performance.
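For readers unfamiliar with the baseline this abstract argues against, the following minimal sketch shows a squared-exponential kernel with one length-scale per input and the inverse-length-scale ranking that automatic relevance determination implies. It illustrates only the baseline, not the proposed methods. The length-scale values are made up rather than fitted, and the function names `ard_sq_exp_kernel` and `ard_relevance_ranking` are hypothetical.

```python
import numpy as np


def ard_sq_exp_kernel(X1, X2, length_scales, signal_var=1.0):
    """Squared-exponential kernel with a separate length-scale per input."""
    X1s = X1 / length_scales
    X2s = X2 / length_scales
    sq_dist = (
        np.sum(X1s ** 2, axis=1)[:, None]
        + np.sum(X2s ** 2, axis=1)[None, :]
        - 2.0 * X1s @ X2s.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return signal_var * np.exp(-0.5 * np.maximum(sq_dist, 0.0))


def ard_relevance_ranking(length_scales):
    """Rank inputs by inverse length-scale, most relevant first."""
    relevance = 1.0 / np.asarray(length_scales)
    return np.argsort(-relevance)


# Hypothetical length-scales for three inputs (in practice these would be
# estimated by fitting the Gaussian process to data).
length_scales = np.array([0.5, 5.0, 1.0])

rng = np.random.default_rng(1)
X = rng.normal(size=(4, 3))
K = ard_sq_exp_kernel(X, X, length_scales)

ranking = ard_relevance_ranking(length_scales)
print(ranking)  # input indices ordered from shortest to longest length-scale
```

A short length-scale lets the function vary quickly along that input, which is why its inverse is read as relevance. The abstract's point is that this proxy can be a poor guide to predictive relevance.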