MLDec 18, 2025
DAG Learning from Zero-Inflated Count Data Using Continuous OptimizationNoriaki Sato, Marco Scutari, Shuichi Kawano et al.
We address network structure learning from zero-inflated count data by casting each node as a zero-inflated generalized linear model and optimizing a smooth, score-based objective under a directed acyclic graph constraint. Our Zero-Inflated Continuous Optimization (ZICO) approach uses node-wise likelihoods with canonical links and enforces acyclicity through a differentiable surrogate constraint combined with sparsity regularization. ZICO achieves superior performance with faster runtimes on simulated data. It also performs comparably to or better than common algorithms for reverse engineering gene regulatory networks. ZICO is fully vectorized and mini-batched, enabling learning on larger variable sets with practical runtimes in a wide range of domains.
MNNov 16, 2025
Practical Causal Evaluation Metrics for Biological NetworksNoriaki Sato, Marco Scutari, Shuichi Kawano et al.
Estimating causal networks from biological data is a critical step in systems biology. When evaluating the inferred network, assessing the networks based on their intervention effects is particularly important for downstream probabilistic reasoning and the identification of potential drug targets. In the context of gene regulatory network inference, biological databases are often used as reference sources. These databases typically describe relationships in a qualitative rather than quantitative manner. However, few evaluation metrics have been developed that take this qualitative nature into account. To address this, we developed a metric, the sign-augmented Structural Intervention Distance (sSID), and a weighted sSID that incorporates the net effects of the intervention. Through simulations and analyses of real transcriptomic datasets, we found that our proposed metrics could identify a different algorithm as optimal compared to conventional metrics, and the network selected by sSID had a superior performance in the classification task of clinical covariates using transcriptomic data. This suggests that sSID can distinguish networks that are structurally correct but functionally incorrect, highlighting its potential as a more biologically meaningful and practical evaluation metric.
MEApr 4, 2024
Multi-task learning via robust regularized clustering with non-convex group penaltiesAkira Okazaki, Shuichi Kawano
Multi-task learning (MTL) aims to improve estimation and prediction performance by sharing common information among related tasks. One natural assumption in MTL is that tasks are classified into clusters based on their characteristics. However, existing MTL methods based on this assumption often ignore outlier tasks that have large task-specific components or no relation to other tasks. To address this issue, we propose a novel MTL method called Multi-Task Learning via Robust Regularized Clustering (MTLRRC). MTLRRC incorporates robust regularization terms inspired by robust convex clustering, which is further extended to handle non-convex and group-sparse penalties. The extension allows MTLRRC to simultaneously perform robust task clustering and outlier task detection. The connection between the extended robust clustering and the multivariate M-estimator is also established. This provides an interpretation of the robustness of MTLRRC against outlier tasks. An efficient algorithm based on a modified alternating direction method of multipliers is developed for the estimation of the parameters. The effectiveness of MTLRRC is demonstrated through simulation studies and application to real data.
MEOct 18, 2021
A Bayesian approach to multi-task learning with network lassoKaito Shimamura, Shuichi Kawano
Network lasso is a method for solving a multi-task learning problem through the regularized maximum likelihood method. A characteristic of network lasso is setting a different model for each sample. The relationships among the models are represented by relational coefficients. A crucial issue in network lasso is to provide appropriate values for these relational coefficients. In this paper, we propose a Bayesian approach to solve multi-task learning problems by network lasso. This approach allows us to objectively determine the relational coefficients by Bayesian estimation. The effectiveness of the proposed method is shown in a simulation study and a real data analysis.
MLSep 6, 2020
Multilinear Common Component Analysis via Kronecker Product RepresentationKohei Yoshikawa, Shuichi Kawano
We consider the problem of extracting a common structure from multiple tensor datasets. For this purpose, we propose multilinear common component analysis (MCCA) based on Kronecker products of mode-wise covariance matrices. MCCA constructs a common basis represented by linear combinations of the original variables which loses as little information of the multiple tensor datasets. We also develop an estimation algorithm for MCCA that guarantees mode-wise global convergence. Numerical studies are conducted to show the effectiveness of MCCA.
MEMar 30, 2020
Variable fusion for Bayesian linear regression via spike-and-slab priorsShengyi Wu, Kaito Shimamura, Kohei Yoshikawa et al.
In linear regression models, fusion of coefficients is used to identify predictors having similar relationships with a response. This is called variable fusion. This paper presents a novel variable fusion method in terms of Bayesian linear regression models. We focus on hierarchical Bayesian models based on a spike-and-slab prior approach. A spike-and-slab prior is tailored to perform variable fusion. To obtain estimates of the parameters, we develop a Gibbs sampler for the parameters. Simulation studies and a real data analysis show that our proposed method achieves better performance than previous methods.
MLFeb 21, 2020
Sparse principal component regression via singular value decomposition approachShuichi Kawano
Principal component regression (PCR) is a two-stage procedure: the first stage performs principal component analysis (PCA) and the second stage constructs a regression model whose explanatory variables are replaced by principal components obtained by the first stage. Since PCA is performed by using only explanatory variables, the principal components have no information about the response variable. To address the problem, we propose a one-stage procedure for PCR in terms of singular value decomposition approach. Our approach is based upon two loss functions, a regression loss and a PCA loss, with sparse regularization. The proposed method enables us to obtain principal component loadings that possess information about both explanatory variables and a response variable. An estimation algorithm is developed by using alternating direction method of multipliers. We conduct numerical studies to show the effectiveness of the proposed method.
MLNov 20, 2019
Bayesian sparse convex clustering via global-local shrinkage priorsKaito Shimamura, Shuichi Kawano
Sparse convex clustering is to cluster observations and conduct variable selection simultaneously in the framework of convex clustering. Although a weighted $L_1$ norm is usually employed for the regularization term in sparse convex clustering, its use increases the dependence on the data and reduces the estimation accuracy if the sample size is not sufficient. To tackle these problems, this paper proposes a Bayesian sparse convex clustering method based on the ideas of Bayesian lasso and global-local shrinkage priors. We introduce Gibbs sampling algorithms for our method using scale mixtures of normal distributions. The effectiveness of the proposed methods is shown in simulation studies and a real data analysis.
MLOct 11, 2019
Sparse Reduced-Rank Regression for Simultaneous Rank and Variable Selection via Manifold OptimizationKohei Yoshikawa, Shuichi Kawano
We consider the problem of constructing a reduced-rank regression model whose coefficient parameter is represented as a singular value decomposition with sparse singular vectors. The traditional estimation procedure for the coefficient parameter often fails when the true rank of the parameter is high. To overcome this issue, we develop an estimation algorithm with rank and variable selection via sparse regularization and manifold optimization, which enables us to obtain an accurate estimation of the coefficient parameter even if the true rank of the coefficient parameter is high. Using sparse regularization, we can also select an optimal value of the rank. We conduct Monte Carlo experiments and real data analysis to illustrate the effectiveness of our proposed method.
MLSep 28, 2016
Sparse principal component regression for generalized linear modelsShuichi Kawano, Hironori Fujisawa, Toyoyuki Takada et al.
Principal component regression (PCR) is a widely used two-stage procedure: principal component analysis (PCA), followed by regression in which the selected principal components are regarded as new explanatory variables in the model. Note that PCA is based only on the explanatory variables, so the principal components are not selected using the information on the response variable. In this paper, we propose a one-stage procedure for PCR in the framework of generalized linear models. The basic loss function is based on a combination of the regression loss and PCA loss. An estimate of the regression parameter is obtained as the minimizer of the basic loss function with a sparse penalty. We call the proposed method sparse principal component regression for generalized linear models (SPCR-glm). Taking the two loss function into consideration simultaneously, SPCR-glm enables us to obtain sparse principal component loadings that are related to a response variable. However, a combination of loss functions may cause a parameter identification problem, but this potential problem is avoided by virtue of the sparse penalty. Thus, the sparse penalty plays two roles in this method. The parameter estimation procedure is proposed using various update algorithms with the coordinate descent algorithm. We apply SPCR-glm to two real datasets, doctor visits data and mouse consomic strain data. SPCR-glm provides more easily interpretable principal component (PC) scores and clearer classification on PC plots than the usual PCA.
MEFeb 16, 2016
Bayesian generalized fused lasso modeling via NEG distributionKaito Shimamura, Masao Ueki, Shuichi Kawano et al.
The fused lasso penalizes a loss function by the $L_1$ norm for both the regression coefficients and their successive differences to encourage sparsity of both. In this paper, we propose a Bayesian generalized fused lasso modeling based on a normal-exponential-gamma (NEG) prior distribution. The NEG prior is assumed into the difference of successive regression coefficients. The proposed method enables us to construct a more versatile sparse model than the ordinary fused lasso by using a flexible regularization term. We also propose a sparse fused algorithm to produce exact sparse solutions. Simulation studies and real data analyses show that the proposed method has superior performance to the ordinary fused lasso.
MLFeb 26, 2014
Sparse principal component regression with adaptive loadingShuichi Kawano, Hironori Fujisawa, Toyoyuki Takada et al.
Principal component regression (PCR) is a two-stage procedure that selects some principal components and then constructs a regression model regarding them as new explanatory variables. Note that the principal components are obtained from only explanatory variables and not considered with the response variable. To address this problem, we propose the sparse principal component regression (SPCR) that is a one-stage procedure for PCR. SPCR enables us to adaptively obtain sparse principal component loadings that are related to the response variable and select the number of principal components simultaneously. SPCR can be obtained by the convex optimization problem for each of parameters with the coordinate descent algorithm. Monte Carlo simulations and real data analyses are performed to illustrate the effectiveness of SPCR.
MEMar 20, 2012
Selection of tuning parameters in bridge regression models via Bayesian information criterionShuichi Kawano
We consider the bridge linear regression modeling, which can produce a sparse or non-sparse model. A crucial point in the model building process is the selection of adjusted parameters including a regularization parameter and a tuning parameter in bridge regression models. The choice of the adjusted parameters can be viewed as a model selection and evaluation problem. We propose a model selection criterion for evaluating bridge regression models in terms of Bayesian approach. This selection criterion enables us to select the adjusted parameters objectively. We investigate the effectiveness of our proposed modeling strategy through some numerical examples.