Yoshiaki Kitazawa

ML
h-index: 1
3 papers
3 citations
Novelty: 38%
AI Score: 20

3 Papers

ML, Nov 14, 2022
Generalized Balancing Weights via Deep Neural Networks

Yoshiaki Kitazawa

Estimating causal effects from observational data is a central problem in many domains. A general approach is to balance covariates with weights such that the weighted distribution of the data mimics randomization. We present generalized balancing weights, Neural Balancing Weights (NBW), to estimate the causal effects of an arbitrary mixture of discrete and continuous interventions. The weights are obtained by directly estimating the density ratio between the source and balanced distributions, optimizing the variational representation of an $f$-divergence. For this, we select the $α$-divergence because it allows efficient optimization: it has an estimator whose sample complexity is independent of its ground-truth value, it has unbiased mini-batch gradients, and it helps with the vanishing-gradient problem. In addition, we provide two methods for use when estimating the balancing weights: one for improving the generalization performance of the balancing weights and one for checking the balance of the distribution changed by the weights. Finally, we discuss the sample size requirements for the weights as a general curse-of-dimensionality problem when balancing multidimensional data. Our study provides a basic approach for estimating the balancing weights of multidimensional data using variational $f$-divergences.
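
The idea of balancing weights as a density ratio between a "balanced" and a "source" distribution can be illustrated with a small sketch. This is not the NBW method: instead of a neural network trained with an $α$-divergence loss, it swaps in a plain logistic-regression classifier as the density-ratio estimator, and it assumes the balanced distribution is the product of the marginals of treatment and covariate (approximated by permuting the treatment). The synthetic data, the quadratic feature map, and all names (`features`, `weighted_corr`, `corr_raw`, `corr_bal`) are hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Hypothetical observational data: covariate x confounds a continuous treatment t.
x = rng.normal(size=n)
t = 0.5 * x + rng.normal(size=n)

def features(t, x):
    # Quadratic features; enough to express the log density ratio
    # when (t, x) is jointly Gaussian, as in this toy setup.
    return np.column_stack([np.ones_like(t), t, x, t * x, t**2, x**2])

# Source samples: observed (t, x) pairs.
# Balanced samples (assumption): t permuted, so t is independent of x.
t_perm = rng.permutation(t)
Z = np.vstack([features(t, x), features(t_perm, x)])
y = np.concatenate([np.zeros(n), np.ones(n)])  # 1 = balanced, 0 = source

# Logistic regression fitted by plain gradient descent.
w = np.zeros(Z.shape[1])
for _ in range(3000):
    p = 1.0 / (1.0 + np.exp(-(Z @ w)))
    w -= 0.1 * Z.T @ (p - y) / len(y)

# With equal class sizes, the classifier odds estimate
# p_balanced(t, x) / p_source(t, x), i.e. the balancing weights.
weights = np.exp(features(t, x) @ w)
weights /= weights.mean()

def weighted_corr(a, b, w):
    # Pearson correlation of a and b under sample weights w.
    ma, mb = np.average(a, weights=w), np.average(b, weights=w)
    cov = np.average((a - ma) * (b - mb), weights=w)
    va = np.average((a - ma) ** 2, weights=w)
    vb = np.average((b - mb) ** 2, weights=w)
    return cov / np.sqrt(va * vb)

corr_raw = weighted_corr(t, x, np.ones(n))   # dependence in the source data
corr_bal = weighted_corr(t, x, weights)      # dependence after reweighting
print(f"corr(t, x) unweighted: {corr_raw:.3f}, weighted: {corr_bal:.3f}")
```

Comparing the weighted and unweighted correlation is only a crude balance check for this linear-Gaussian toy case; it is not the balance-checking method the abstract refers to.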

ML, Mar 8, 2022
Estimating the average causal effect of intervention in continuous variables using machine learning

Yoshiaki Kitazawa

The most widely discussed methods for estimating the Average Causal Effect / Average Treatment Effect are those for interventions on discrete binary variables whose values represent intervention and non-intervention groups. In contrast, methods for interventions on continuous variables that are independent of the data-generating model have not been developed. In this study, we give a method for estimating the average causal effect of interventions on continuous variables that can be applied to data from any generating model as long as the causal effect is identifiable. The proposed method is independent of the machine learning algorithm used and preserves the identifiability of the data.

ML, Feb 3, 2024
Alpha-divergence loss function for neural density ratio estimation

Yoshiaki Kitazawa

Density ratio estimation (DRE) is a fundamental machine learning technique for capturing relationships between two probability distributions. State-of-the-art DRE methods estimate the density ratio using neural networks trained with loss functions derived from variational representations of $f$-divergences. However, existing methods face optimization challenges, such as overfitting due to lower-unbounded loss functions, biased mini-batch gradients, vanishing training loss gradients, and high sample requirements for Kullback--Leibler (KL) divergence loss functions. To address these issues, we focus on $α$-divergence, which provides a suitable variational representation of $f$-divergence. Subsequently, a novel loss function for DRE, the $α$-divergence loss function ($α$-Div), is derived. $α$-Div is concise but offers stable and effective optimization for DRE. The boundedness of $α$-divergence provides the potential for successful DRE with data exhibiting high KL-divergence. Our numerical experiments demonstrate the effectiveness of $α$-Div in optimization. However, the experiments also show that the proposed loss function offers no significant advantage over the KL-divergence loss function in terms of RMSE for DRE. This indicates that the accuracy of DRE is primarily determined by the amount of KL-divergence in the data and is less dependent on $α$-divergence.
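
For context, the variational representation of an $f$-divergence mentioned in the abstract can be written in its standard textbook form as below. The symbols $P$, $Q$, $T$, and $f^{*}$ are notation assumed here, not taken from the abstract, and the specific $α$-Div loss of the paper is not reproduced.

```latex
% f convex with f(1) = 0; f^* is its convex conjugate.
D_f(P \,\|\, Q)
  = \mathbb{E}_{Q}\!\left[ f\!\left( \frac{dP}{dQ} \right) \right]
  = \sup_{T} \left\{ \mathbb{E}_{P}[T(X)] - \mathbb{E}_{Q}\!\left[ f^{*}(T(X)) \right] \right\},
\qquad
f^{*}(t) = \sup_{u} \{ ut - f(u) \}.

% The supremum is attained at T^* = f'(dP/dQ), so an optimal T recovers the density ratio.
% Example (KL divergence): f(u) = u \log u gives
f^{*}(t) = e^{t-1},
\qquad
T^{*}(x) = 1 + \log \frac{dP}{dQ}(x)
\;\;\Longrightarrow\;\;
\frac{dP}{dQ}(x) = e^{T^{*}(x) - 1}.
```

In neural DRE, $T$ is parameterized by a network and the negated objective inside the supremum serves as the training loss; the choice of $f$ determines properties such as boundedness of the loss and the behavior of its gradients.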