Bayesian Kernelised Test of (In)dependence with Mixed-type Variables
The paper addresses a fundamental task in AI, assessing (in)dependence between mixed-type variables, but the contribution appears incremental because it builds on existing kernel and Bayesian approaches.
The paper tackles the problem of assessing independence between mixed-type variables (e.g., text, image, sound) by proposing a Bayesian kernelised correlation test based on a Dirichlet process model, and empirically compares it with other frequentist and Bayesian approaches on a range of datasets and tasks.
A fundamental task in AI is to assess (in)dependence between mixed-type variables (text, image, sound). We propose a Bayesian kernelised correlation test of (in)dependence using a Dirichlet process model. The new measure of (in)dependence allows us to answer some fundamental questions: Based on data, are (mixed-type) variables independent? How likely is dependence/independence to hold? How high is the probability that two mixed-type variables are more than just weakly dependent? We establish theoretical properties of the approach and present algorithms for its fast computation. We empirically demonstrate the effectiveness of the proposed method by analysing its performance and by comparing it with other frequentist and Bayesian approaches on a range of datasets and tasks with mixed-type variables.
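To make the kind of question posed in the abstract concrete, the following is a minimal sketch and not the paper's algorithm. It rests on several assumptions that are not taken from the source: the Dirichlet process posterior is approximated by the Bayesian bootstrap (Dirichlet(1, ..., 1) weights on the observations), the kernelised correlation is taken to be a weighted centred kernel alignment (a normalised HSIC), the mixed-type pair is a numeric variable with a Gaussian kernel and a categorical variable with a delta kernel, and the 0.1 cut-off for "more than weakly dependent" is arbitrary. All function names are hypothetical.

```python
import numpy as np


def gaussian_kernel(x):
    """Gaussian kernel on a 1-D numeric sample; bandwidth from the median heuristic."""
    d2 = (x[:, None] - x[None, :]) ** 2
    sigma2 = np.median(d2[d2 > 0])
    return np.exp(-d2 / (2.0 * sigma2))


def delta_kernel(labels):
    """Delta kernel on categorical labels: 1 if two labels match, else 0."""
    return (labels[:, None] == labels[None, :]).astype(float)


def centre(K, w):
    """Centre a kernel matrix with respect to the weighted empirical measure w."""
    Kw = K @ w
    return K - Kw[:, None] - Kw[None, :] + w @ Kw


def kernel_correlation(K, L, w):
    """Weighted centred kernel alignment (assumed stand-in for the paper's measure).

    Assumes neither variable is constant, so both normalising terms are positive.
    """
    Kc, Lc = centre(K, w), centre(L, w)
    hsic_kl = w @ (Kc * Lc) @ w
    hsic_kk = w @ (Kc * Kc) @ w
    hsic_ll = w @ (Lc * Lc) @ w
    return hsic_kl / np.sqrt(hsic_kk * hsic_ll)


def posterior_samples(K, L, n_draws=300, seed=0):
    """Bayesian-bootstrap draws of the kernel correlation (assumed DP approximation)."""
    rng = np.random.default_rng(seed)
    n = K.shape[0]
    return np.array(
        [kernel_correlation(K, L, rng.dirichlet(np.ones(n))) for _ in range(n_draws)]
    )


# Synthetic mixed-type data: numeric x, categorical y.
rng = np.random.default_rng(1)
n = 150
x = rng.normal(size=n)
y_dep = np.where(x > 0, "pos", "neg")                # label determined by x
y_ind = rng.choice(np.array(["pos", "neg"]), size=n)  # label unrelated to x

K = gaussian_kernel(x)
rho_dep = posterior_samples(K, delta_kernel(y_dep))
rho_ind = posterior_samples(K, delta_kernel(y_ind))

threshold = 0.1  # arbitrary cut-off for "more than weakly dependent"
p_dep = np.mean(rho_dep > threshold)
p_ind = np.mean(rho_ind > threshold)
print(f"P(rho > {threshold}) dependent pair:   {p_dep:.2f}")
print(f"P(rho > {threshold}) independent pair: {p_ind:.2f}")
```

The fraction of posterior draws above the threshold plays the role of "how high is the probability that two mixed-type variables are more than just weakly dependent". Swapping in kernels for text, images, or sound would only change how `K` and `L` are built.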