IV · Aug 23, 2022
Aging prediction using deep generative model toward the development of preventive medicine
Hisaichi Shibata, Shouhei Hanaoka, Yukihiro Nomura et al.
From birth to death, we all experience surprisingly ubiquitous changes over time due to aging. If we can predict aging in the digital domain, that is, with a digital twin of the human body, we would be able to detect lesions in their very early stages, thereby enhancing the quality of life and extending the life span. We observed that none of the previously developed digital twins of the adult human body explicitly trained longitudinal conversion rules between volumetric medical images with deep generative models, potentially resulting in poor prediction performance for quantities such as ventricular volumes. Here, we establish a new digital twin of an adult human body that adopts longitudinally acquired head computed tomography (CT) images for training, enabling prediction of future volumetric head CT images from a single present volumetric head CT image. For the first time, we adopt a three-dimensional flow-based deep generative model to realize this sequential three-dimensional digital twin. We show that our digital twin outperforms the latest methods in predicting ventricular volumes over relatively short terms.
AI · Feb 21, 2023
Playing the Werewolf game with artificial intelligence for language understanding
Hisaichi Shibata, Soichiro Miki, Yuta Nakamura
The Werewolf game is a social deduction game based on free natural language communication, in which players try to deceive others in order to survive. An important feature of this game is that a large portion of the conversation consists of false information, and the behavior of artificial intelligence (AI) in such a situation has not been widely investigated. The purpose of this study is to develop an AI agent that can play Werewolf through natural language conversations. First, we collected game logs from 15 human players. Next, we fine-tuned a Transformer-based pretrained language model to construct a value network that predicts the posterior probability of winning at any given phase of the game, given a candidate for the next action. We then developed an AI agent that can interact with humans and choose the best voting target on the basis of the win probability output by the value network. Lastly, we evaluated the performance of the agent by having it actually play the game with human players. We found that our AI agent, Deep Wolf, could play Werewolf as competitively as average human players in a villager or a betrayer role, whereas Deep Wolf was inferior to human players in a werewolf or a seer role. These results suggest that current language models have the capability to suspect what others are saying, tell a lie, or detect lies in conversations.
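The vote-selection step described above can be illustrated with a minimal sketch. It assumes a value function that returns a win probability for a game state and a candidate action; the names `choose_vote` and `toy_value_fn` and the hard-coded scores are hypothetical stand-ins for illustration, not the fine-tuned value network from the study.

```python
from typing import Callable, Sequence


def choose_vote(
    state: str,
    candidates: Sequence[str],
    value_fn: Callable[[str, str], float],
) -> str:
    """Return the candidate whose vote maximizes the estimated win probability."""
    if not candidates:
        raise ValueError("at least one voting candidate is required")
    # Score every candidate with the value function and keep the best one.
    return max(candidates, key=lambda target: value_fn(state, target))


def toy_value_fn(state: str, target: str) -> float:
    """Hypothetical stand-in for a learned value network (fixed scores)."""
    scores = {"Alice": 0.40, "Bob": 0.70, "Carol": 0.55}
    return scores.get(target, 0.0)


if __name__ == "__main__":
    best = choose_vote("day 2, discussion log ...", ["Alice", "Bob", "Carol"], toy_value_fn)
    print(best)  # prints "Bob"
```

In practice the value function would be a model call on the conversation log; the selection logic itself stays a plain argmax over candidates.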
CV · Dec 20, 2022
Local Differential Privacy Image Generation Using Flow-based Deep Generative Models
Hisaichi Shibata, Shouhei Hanaoka, Yang Cao et al.
Diagnostic radiologists need artificial intelligence (AI) for medical imaging, but access to the medical images required for training AI has become increasingly restrictive. To release and use medical images, we need an algorithm that can simultaneously protect privacy and preserve pathologies in medical images. To develop such an algorithm, here, we propose DP-GLOW, a hybrid of a local differential privacy (LDP) algorithm and one of the flow-based deep generative models (GLOW). By applying a GLOW model, we disentangle the pixelwise correlation of images, a correlation that makes it difficult to protect privacy with straightforward LDP algorithms for images. Specifically, we map images onto the latent vector of the GLOW model, each element of which follows an independent normal distribution, and we apply the Laplace mechanism to the latent vector. Moreover, we applied DP-GLOW to chest X-ray images to generate LDP images while preserving pathologies.
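The Laplace-mechanism step on a latent vector can be sketched as follows. This is a minimal illustration under stated assumptions, not the DP-GLOW implementation: the function name `laplace_mechanism`, the clipping bound `clip`, and the sensitivity calibration (L1 sensitivity of a vector clipped to [-clip, clip] per element) are choices made here for the example, and the GLOW encoder and decoder that would surround this step are omitted.

```python
import numpy as np


def laplace_mechanism(z, epsilon, clip=1.0, rng=None):
    """Add Laplace noise to a latent vector under a privacy budget epsilon.

    Assumption (not from the paper): each element is clipped to [-clip, clip],
    so the L1 distance between any two latent vectors is at most
    2 * clip * z.size, which is used as the sensitivity.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = np.random.default_rng() if rng is None else rng
    z = np.clip(np.asarray(z, dtype=float), -clip, clip)
    sensitivity = 2.0 * clip * z.size   # worst-case L1 distance after clipping
    scale = sensitivity / epsilon       # Laplace scale b = sensitivity / epsilon
    return z + rng.laplace(loc=0.0, scale=scale, size=z.shape)


if __name__ == "__main__":
    latent = np.random.default_rng(0).standard_normal(8)  # stand-in for an encoded image
    noisy = laplace_mechanism(latent, epsilon=10.0, rng=np.random.default_rng(1))
    print(noisy.shape)
```

A smaller `epsilon` gives stronger privacy and larger noise; the noisy latent vector would then be passed through the inverse of the flow to obtain an image.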
CL · Dec 22, 2023
Theory of Hallucinations based on Equivariance
Hisaichi Shibata
This study aims to acquire knowledge for creating very large language models that are immune to hallucinations. Hallucinations in contemporary large language models are often attributed to a misunderstanding of real-world social relationships. Therefore, I hypothesize that very large language models capable of thoroughly grasping all these relationships will be free from hallucinations. Additionally, I propose that certain types of equivariant language models are adept at learning and understanding these relationships. Building on this, I have developed a specialized cross-entropy error function to create a hallucination scale for language models, which measures their extent of equivariance acquisition. Utilizing this scale, I tested language models for their ability to acquire character-level equivariance. In particular, I introduce and employ a novel technique based on T5 (Text-to-Text Transfer Transformer) that efficiently understands permuted input texts without the need for explicit dictionaries to convert token IDs (integers) to texts (strings). This T5 model demonstrated a moderate ability to acquire character-level equivariance. Additionally, I discovered scaling laws that can aid in developing hallucination-free language models at the character level. This methodology can be extended to assess equivariance acquisition at the word level, paving the way for very large language models that can comprehensively understand relationships and, consequently, avoid hallucinations.
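As a rough illustration of what character-level equivariance means, the sketch below checks whether a text-to-text function commutes with a character permutation, that is, whether applying the permutation before the function gives the same result as applying it after. The helper names `permute_text` and `is_equivariant` and the toy models are hypothetical; this is a plain equality check, not the cross-entropy-based hallucination scale described in the abstract.

```python
from typing import Callable, Mapping


def permute_text(text: str, perm: Mapping[str, str]) -> str:
    """Replace each character according to perm; unmapped characters are kept."""
    return "".join(perm.get(ch, ch) for ch in text)


def is_equivariant(model: Callable[[str], str], text: str, perm: Mapping[str, str]) -> bool:
    """Check model(perm(text)) == perm(model(text)) for one input and one permutation."""
    return model(permute_text(text, perm)) == permute_text(model(text), perm)


if __name__ == "__main__":
    swap_ab = {"a": "b", "b": "a"}  # a simple character permutation

    def reverse(s: str) -> str:
        """Toy model that ignores character identity: reverses the string."""
        return s[::-1]

    def replace_a(s: str) -> str:
        """Toy model that treats the character 'a' specially."""
        return s.replace("a", "x")

    print(is_equivariant(reverse, "ab", swap_ab))    # prints True
    print(is_equivariant(replace_a, "ab", swap_ab))  # prints False
```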
IV · Apr 9, 2021
X2CT-FLOW: Maximum a posteriori reconstruction using a progressive flow-based deep generative model for ultra sparse-view computed tomography in ultra low-dose protocols
Hisaichi Shibata, Shouhei Hanaoka, Yukihiro Nomura et al.
Ultra sparse-view computed tomography (CT) algorithms can reduce radiation exposure of patients, but those algorithms lack an explicit cycle consistency loss minimization and an explicit log-likelihood maximization in testing. Here, we propose X2CT-FLOW for the maximum a posteriori (MAP) reconstruction of a three-dimensional (3D) chest CT image from a single or a few two-dimensional (2D) projection images using a progressive flow-based deep generative model, especially for ultra low-dose protocols. The MAP reconstruction can simultaneously optimize the cycle consistency loss and the log-likelihood. The proposed algorithm is built upon a newly developed progressive flow-based deep generative model, which features exact log-likelihood estimation, efficient sampling, and progressive learning. We applied X2CT-FLOW to the reconstruction of 3D chest CT images from biplanar projection images without noise contamination (assuming a standard-dose protocol) and with strong noise contamination (assuming an ultra low-dose protocol). With the standard-dose protocol, the images reconstructed from 2D projection images showed good agreement with the 3D ground-truth CT images in terms of structural similarity (SSIM, 0.7675 on average), peak signal-to-noise ratio (PSNR, 25.89 dB on average), mean absolute error (MAE, 0.02364 on average), and normalized root mean square error (NRMSE, 0.05731 on average). Moreover, with the ultra low-dose protocol, the images reconstructed from 2D projection images also showed good agreement with the 3D ground-truth CT images in terms of SSIM (0.7008 on average), PSNR (23.58 dB on average), MAE (0.02991 on average), and NRMSE (0.07349 on average).
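For readers unfamiliar with the agreement metrics reported above, the following sketch shows common definitions of MAE, PSNR, and NRMSE for volumes. The abstract does not state the exact normalization used, so the `data_range` parameter (assumed here to be 1.0 for intensities scaled to [0, 1]) and the range-normalized NRMSE are assumptions; SSIM is omitted because it needs a windowed implementation.

```python
import numpy as np


def mae(recon, truth):
    """Mean absolute error between a reconstruction and the ground truth."""
    recon, truth = np.asarray(recon, dtype=float), np.asarray(truth, dtype=float)
    return float(np.mean(np.abs(recon - truth)))


def psnr(recon, truth, data_range=1.0):
    """Peak signal-to-noise ratio in dB; data_range is the assumed intensity span."""
    recon, truth = np.asarray(recon, dtype=float), np.asarray(truth, dtype=float)
    mse = np.mean((recon - truth) ** 2)
    return float(10.0 * np.log10(data_range ** 2 / mse))


def nrmse(recon, truth, data_range=1.0):
    """Root mean square error divided by the assumed intensity span."""
    recon, truth = np.asarray(recon, dtype=float), np.asarray(truth, dtype=float)
    return float(np.sqrt(np.mean((recon - truth) ** 2)) / data_range)


if __name__ == "__main__":
    truth = np.zeros((4, 4, 4))   # toy 3D volume, not CT data
    recon = truth + 0.1           # uniform error of 0.1
    print(mae(recon, truth), psnr(recon, truth), nrmse(recon, truth))
```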
LG · Feb 18, 2020
On the Matrix-Free Generation of Adversarial Perturbations for Black-Box Attacks
Hisaichi Shibata, Shouhei Hanaoka, Yukihiro Nomura et al.
In general, adversarial perturbations superimposed on inputs are realistic threats to a deep neural network (DNN). In this paper, we propose a practical method for generating such adversarial perturbations for black-box attacks, which require access to the input-output relationship only. Thus, attackers can generate such a perturbation without invoking inner functions or accessing the inner states of a DNN. Unlike earlier studies, the algorithm presented in this study requires far fewer query trials to generate the perturbation. Moreover, to show the effectiveness of the extracted adversarial perturbation, we experiment with a DNN for semantic segmentation. The results show that the network is more easily deceived by the generated perturbation than by uniformly distributed random noise of the same magnitude.