Yuta Kawakami

LG
h-index18
8papers
13citations
Novelty51%
AI Score44

8 Papers

LGMay 26
Nonlinear Data Integration via Kernel Methods for Data Collaboration Analysis

Yamato Suetake, Yuta Kawakami, Shunnosuke Ikeda et al.

Collaborative analysis of decentralized confidential datasets is important, but direct sharing of original datasets is often restricted by privacy and institutional constraints. Data collaboration (DC) analysis transforms each dataset into privacy-preserving intermediate representations via party-specific obfuscation functions and integrates them into common collaboration representations using an anchor dataset. However, many existing DC analysis methods rely on linear transformations for data obfuscation and integration, which may increase reconstruction risk. Although nonlinear dimensionality reduction can mitigate this risk, conventional linear integration methods cannot accurately align intermediate representations produced by nonlinear transformations. Moreover, existing integration methods mainly minimize discrepancies among parties and do not explicitly incorporate geometric or target-variable information useful for downstream analysis. To overcome these limitations, we first formulate linear kernel integration (LKI) as a linear integration method and then kernelize it to obtain nonlinear kernel integration (NKI). NKI admits a globally optimal solution via kernel ridge regression and an eigenvalue problem. We also introduce graph regularization and a centering constraint so that the target representation can capture geometric and target-variable information useful for downstream analysis. Experiments on image classification tasks demonstrate that NKI improves classification accuracy over existing linear integration methods under nonlinear dimensionality reduction, with further gains from target-variable-aware graph regularization and centering. The results also show that dimensionality reduction choices substantially affect both classification accuracy and reconstruction risk.

AINov 13, 2025
Potential Outcome Rankings for Counterfactual Decision Making

Yuta Kawakami, Jin Tian

Counterfactual decision-making in the face of uncertainty involves selecting the optimal action from several alternatives using causal reasoning. Decision-makers often rank expected potential outcomes (or their corresponding utility and desirability) to compare the preferences of candidate actions. In this paper, we study new counterfactual decision-making rules by introducing two new metrics: the probabilities of potential outcome ranking (PoR) and the probability of achieving the best potential outcome (PoB). PoR reveals the most probable ranking of potential outcomes for an individual, and PoB indicates the action most likely to yield the top-ranked outcome for an individual. We then establish identification theorems and derive bounds for these metrics, and present estimation methods. Finally, we perform numerical experiments to illustrate the finite-sample properties of the estimators and demonstrate their application to a real-world dataset.

AIDec 19, 2024
Mediation Analysis for Probabilities of Causation

Yuta Kawakami, Jin Tian

Probabilities of causation (PoC) offer valuable insights for informed decision-making. This paper introduces novel variants of PoC-controlled direct, natural direct, and natural indirect probability of necessity and sufficiency (PNS). These metrics quantify the necessity and sufficiency of a treatment for producing an outcome, accounting for different causal pathways. We develop identification theorems for these new PoC measures, allowing for their estimation from observational data. We demonstrate the practical application of our results through an analysis of a real-world psychology dataset.

LGApr 22, 2024
New Solutions Based on the Generalized Eigenvalue Problem for the Data Collaboration Analysis

Yuta Kawakami, Yuichi Takano, Akira Imakura

In recent years, the accumulation of data across various institutions has garnered attention for the technology of confidential data analysis, which improves analytical accuracy by sharing data between multiple institutions while protecting sensitive information. Among these methods, Data Collaboration Analysis (DCA) is noted for its efficiency in terms of computational cost and communication load, facilitating data sharing and analysis across different institutions while safeguarding confidential information. However, existing optimization problems for determining the necessary collaborative functions have faced challenges, such as the optimal solution for the collaborative representation often being a zero matrix and the difficulty in understanding the process of deriving solutions. This research addresses these issues by formulating the optimization problem through the segmentation of matrices into column vectors and proposing a solution method based on the generalized eigenvalue problem. Additionally, we demonstrate methods for constructing collaborative functions more effectively through weighting and the selection of efficient algorithms suited to specific situations. Experiments using real-world datasets have shown that our proposed formulation and solution for the collaborative function optimization problem achieve superior predictive accuracy compared to existing methods.

MLFeb 21
Bounds and Identification of Joint Probabilities of Potential Outcomes and Observed Variables under Monotonicity Assumptions

Naoya Hashimoto, Yuta Kawakami, Jin Tian

Evaluating joint probabilities of potential outcomes and observed variables, and their linear combinations, is a fundamental challenge in causal inference. This paper addresses the bounding and identification of these probabilities in settings with discrete treatment and discrete ordinal outcome. We propose new families of monotonicity assumptions and formulate the bounding problem as a linear programming problem. We further introduce a new monotonicity assumption specifically to achieve identification. Finally, we present numerical experiments to validate our methods and demonstrate their application using real-world datasets.

MEMay 8, 2025
Decomposition of Probabilities of Causation with Two Mediators

Yuta Kawakami, Jin Tian

Mediation analysis for probabilities of causation (PoC) provides a fundamental framework for evaluating the necessity and sufficiency of treatment in provoking an event through different causal pathways. One of the primary objectives of causal mediation analysis is to decompose the total effect into path-specific components. In this study, we investigate the path-specific probability of necessity and sufficiency (PNS) to decompose the total PNS into path-specific components along distinct causal pathways between treatment and outcome, incorporating two mediators. We define the path-specific PNS for decomposition and provide an identification theorem. Furthermore, we conduct numerical experiments to assess the properties of the proposed estimators from finite samples and demonstrate their practical application using a real-world educational dataset.

MEMay 8, 2025
Moments of Causal Effects

Yuta Kawakami, Jin Tian

The moments of random variables are fundamental statistical measures for characterizing the shape of a probability distribution, encompassing metrics such as mean, variance, skewness, and kurtosis. Additionally, the product moments, including covariance and correlation, reveal the relationships between multiple random variables. On the other hand, the primary focus of causal inference is the evaluation of causal effects, which are defined as the difference between two potential outcomes. While traditional causal effect assessment focuses on the average causal effect, this work provides definitions, identification theorems, and bounds for moments and product moments of causal effects to analyze their distribution and relationships. We conduct experiments to illustrate the estimation of the moments of causal effects from finite samples and demonstrate their practical application using a real-world medical dataset.

LGJan 20, 2024
Identification and Estimation of Conditional Average Partial Causal Effects via Instrumental Variable

Yuta Kawakami, Manabu Kuroki, Jin Tian

There has been considerable recent interest in estimating heterogeneous causal effects. In this paper, we study conditional average partial causal effects (CAPCE) to reveal the heterogeneity of causal effects with continuous treatment. We provide conditions for identifying CAPCE in an instrumental variable setting. Notably, CAPCE is identifiable under a weaker assumption than required by a commonly used measure for estimating heterogeneous causal effects of continuous treatment. We develop three families of CAPCE estimators: sieve, parametric, and reproducing kernel Hilbert space (RKHS)-based, and analyze their statistical properties. We illustrate the proposed CAPCE estimators on synthetic and real-world data.