CV · Mar 5, 2023
Event-based Camera Simulation using Monte Carlo Path Tracing with Adaptive Denoising
Yuta Tsuji, Tatsuya Yatagawa, Hiroyuki Kubo et al.
This paper presents an algorithm to obtain an event-based video from noisy frames given by physics-based Monte Carlo path tracing over a synthetic 3D scene. Given the nature of a dynamic vision sensor (DVS), rendering event-based video can be viewed as a process of detecting changes in noisy brightness values. We extend a denoising method based on weighted local regression (WLR) to detect brightness changes rather than applying denoising to every pixel. Specifically, we derive a threshold to determine the likelihood of event occurrence and reduce the number of regressions performed. Our method is robust to noisy video frames obtained from a few path-traced samples. Despite its efficiency, our method performs comparably to or even better than an approach that exhaustively denoises every frame.
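As background for the abstract above, the following is a minimal sketch of the idealized DVS event model on a single pixel: an event fires whenever the log brightness drifts from a reference level by at least a contrast threshold. It is not the paper's WLR-based detector and it ignores noise entirely; the function name `dvs_events`, the default threshold, and the `eps` offset are assumptions made for illustration.

```python
import math

def dvs_events(brightness, contrast_threshold=0.2, eps=1e-6):
    """Idealized single-pixel DVS model (illustrative assumption, noise-free).

    Returns a list of (frame_index, polarity) pairs, where polarity is +1
    for a brightness increase and -1 for a decrease.
    """
    events = []
    # Reference level in the log domain; eps avoids log(0).
    ref = math.log(brightness[0] + eps)
    for t, value in enumerate(brightness[1:], start=1):
        diff = math.log(value + eps) - ref
        # A large change may cross the threshold several times in one frame.
        while abs(diff) >= contrast_threshold:
            polarity = 1 if diff > 0 else -1
            events.append((t, polarity))
            ref += polarity * contrast_threshold
            diff = math.log(value + eps) - ref
    return events
```

With path-traced input, `brightness` would be a noisy estimate, which is why the abstract frames event rendering as detecting changes in noisy values rather than thresholding them directly.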
CV · Jul 14, 2022
Deep Point-to-Plane Registration by Efficient Backpropagation for Error Minimizing Function
Tatsuya Yatagawa, Yutaka Ohtake, Hiromasa Suzuki
Traditional algorithms for point set registration that minimize point-to-plane distances often achieve a better estimation of the rigid transformation than those minimizing point-to-point distances. Nevertheless, recent deep-learning-based methods minimize the point-to-point distances. In contrast to these methods, this paper proposes the first deep-learning-based approach to point-to-plane registration. A challenging part of this problem is that a typical solution for point-to-plane registration requires an iterative process of accumulating small transformations obtained by minimizing a linearized energy function. The iteration significantly increases the size of the computation graph needed for backpropagation and can slow down both forward and backward network evaluations. To solve this problem, we consider the estimated rigid transformation as a function of input point clouds and derive its analytic gradients using the implicit function theorem. The analytic gradient that we introduce is independent of how the error minimizing function (i.e., the rigid transformation) is obtained, thus allowing us to calculate both the rigid transformation and its gradient efficiently. We implement the proposed point-to-plane registration module over several previous methods that minimize point-to-point distances and demonstrate that the extensions outperform the base methods even on noisy point clouds with low-quality point normals estimated from local point distributions.
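The iterative process mentioned in the abstract above can be sketched as follows. This is the classical linearized point-to-plane solve, not the paper's network or its implicit-function-theorem gradient. It assumes fixed, known correspondences (row i of `src` matches row i of `dst`) and NumPy; the function names are chosen for illustration.

```python
import numpy as np

def rodrigues(omega):
    """Rotation matrix for the axis-angle vector omega (Rodrigues' formula)."""
    theta = np.linalg.norm(omega)
    if theta < 1e-12:
        return np.eye(3)
    k = omega / theta
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def point_to_plane_step(src, dst, normals):
    """One small transformation from the linearized energy.

    With R ~ I + [omega]x, the residual ((R p + t - q) . n) becomes
    (p - q) . n + omega . (p x n) + t . n, which is linear in (omega, t).
    """
    A = np.hstack([np.cross(src, normals), normals])   # shape (N, 6)
    b = -np.einsum("ij,ij->i", src - dst, normals)     # shape (N,)
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return rodrigues(x[:3]), x[3:]

def register(src, dst, normals, iters=20):
    """Accumulate small transformations (correspondences assumed fixed)."""
    R_total, t_total = np.eye(3), np.zeros(3)
    cur = src.copy()
    for _ in range(iters):
        R, t = point_to_plane_step(cur, dst, normals)
        cur = cur @ R.T + t
        # Compose: x -> R (R_total x + t_total) + t
        R_total, t_total = R @ R_total, R @ t_total + t
    return R_total, t_total
```

Differentiating through this loop with automatic differentiation would unroll every iteration into the computation graph, which is the cost the abstract says the analytic gradient avoids.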
CV · Dec 2, 2025
Attention-guided reference point shifting for Gaussian-mixture-based partial point set registration
Mizuki Kikkawa, Tatsuya Yatagawa, Yutaka Ohtake et al.
This study investigates the impact of the invariance of feature vectors for partial-to-partial point set registration under translation and rotation of input point sets, particularly in the realm of techniques based on deep learning and Gaussian mixture models (GMMs). We reveal both theoretical and practical problems associated with such deep-learning-based registration methods using GMMs, with a particular focus on the limitations of DeepGMR, a pioneering study in this line, in partial-to-partial point set registration. Our primary goal is to uncover the causes of these problems and propose a comprehensible solution to them. To this end, we introduce an attention-based reference point shifting (ARPS) layer, which robustly identifies a common reference point of two partial point sets, thereby acquiring transformation-invariant features. The ARPS layer employs a well-studied attention module to find a common reference point rather than the overlap region. Owing to this, it significantly enhances the performance of DeepGMR and its recent variant, UGMMReg. Furthermore, these extended models outperform even prior deep learning methods that use attention blocks and Transformers to extract the overlap region or common reference points. We believe these findings provide deeper insights into registration methods using deep learning and GMMs.
CV · May 1, 2023 · Code
Learning Self-Prior for Mesh Inpainting Using Self-Supervised Graph Convolutional Networks
Shota Hattori, Tatsuya Yatagawa, Yutaka Ohtake et al.
In this paper, we present a self-prior-based mesh inpainting framework that requires only an incomplete mesh as input, without the need for any training datasets. Additionally, our method maintains the polygonal mesh format throughout the inpainting process without converting the shape format to an intermediate one, such as a voxel grid, a point cloud, or an implicit function, which are typically considered easier for deep neural networks to process. To achieve this goal, we introduce two graph convolutional networks (GCNs): single-resolution GCN (SGCN) and multi-resolution GCN (MGCN), both trained in a self-supervised manner. Our approach refines a watertight mesh obtained from the initial hole filling to generate a complete output mesh. Specifically, we train the GCNs to deform an oversmoothed version of the input mesh into the expected complete shape. The deformation is described by vertex displacements, and the GCNs are supervised to obtain accurate displacements at vertices in real holes. To this end, we specify several connected regions of the mesh as fake holes, thereby generating meshes with various sets of fake holes. The correct displacements of vertices are known in these fake holes, thus enabling training GCNs with loss functions that assess the accuracy of vertex displacements. We demonstrate that our method outperforms traditional dataset-independent approaches and exhibits greater robustness compared with other deep-learning-based methods for shapes that infrequently appear in shape datasets. Our code and test data are available at https://github.com/astaka-pe/SeMIGCN.
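The fake-hole idea in the abstract above can be illustrated with a small sketch that marks connected vertex regions of a mesh graph as fake holes. This is not taken from the SeMIGCN code: the breadth-first region growing, the function names, and the `adjacency` dictionary format (vertex index to list of neighbor indices) are all assumptions made for illustration.

```python
import random
from collections import deque

def grow_fake_hole(adjacency, seed, size, blocked):
    """Grow a connected region of up to `size` vertices from `seed` by BFS,
    never entering vertices in `blocked` (e.g. the real hole)."""
    region, queue = {seed}, deque([seed])
    while queue and len(region) < size:
        v = queue.popleft()
        for u in adjacency[v]:
            if u not in region and u not in blocked and len(region) < size:
                region.add(u)
                queue.append(u)
    return region

def sample_fake_holes(adjacency, real_hole, num_holes, size, rng):
    """Pick random seeds outside the real hole and grow one region per seed."""
    blocked = set(real_hole)
    candidates = sorted(set(adjacency) - blocked)
    seeds = rng.sample(candidates, num_holes)
    return [grow_fake_hole(adjacency, s, size, blocked) for s in seeds]
```

Because the original vertex positions inside each fake hole are known, a displacement loss can be evaluated there, while the real hole remains unsupervised; sampling different seeds per training step yields meshes with various sets of fake holes.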
CV · Aug 15, 2024
Monte Carlo Path Tracing and Statistical Event Detection for Event Camera Simulation
Yuichiro Manabe, Tatsuya Yatagawa, Shigeo Morishima et al.
This paper presents a novel event camera simulation system fully based on physically based Monte Carlo path tracing with adaptive path sampling. The adaptive sampling in the proposed method relies on statistical hypothesis testing of whether the difference in logarithmic luminance between two distant periods is significantly larger than a predefined event threshold. To this end, our rendering system collects logarithmic luminances rather than raw luminances, in contrast to conventional rendering systems that imitate conventional RGB cameras. Then, based on the central limit theorem, we reasonably assume that the distribution of the sample mean of logarithmic luminance can be modeled as a normal distribution, allowing us to model the distribution of the difference of logarithmic luminance as a normal distribution as well. Using Student's t-test, we can then test the hypothesis and determine whether to reject the null hypothesis of event non-occurrence. When we sample a sufficiently large number of paths to satisfy the central limit theorem and obtain a clean set of events, our method achieves a significant speedup compared to a simple approach of sampling paths uniformly at every pixel. To our knowledge, we are the first to simulate the behavior of event cameras in a physically accurate manner using an adaptive sampling technique in Monte Carlo path tracing, and we believe this study will contribute to the development of computer vision applications using event cameras.
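The statistical test described in the abstract above can be sketched for a single pixel as follows. This is a simplified stand-in, not the paper's procedure: it uses a Welch-style statistic with a normal critical value in place of the Student's t quantile, which is a reasonable approximation only when many path samples are available. The function name `event_detected` and the significance level `alpha` are assumptions.

```python
import math
from statistics import NormalDist, mean, variance

def event_detected(log_lum_a, log_lum_b, threshold, alpha=0.05):
    """Return +1 or -1 if the change in mean log luminance from period A to
    period B significantly exceeds `threshold`, else 0.

    Null hypothesis (no event): |mean_b - mean_a| <= threshold.
    Each input needs at least two samples for the variance estimate.
    """
    diff = mean(log_lum_b) - mean(log_lum_a)
    # Standard error of the difference of two sample means.
    se = math.sqrt(variance(log_lum_a) / len(log_lum_a)
                   + variance(log_lum_b) / len(log_lum_b))
    if se == 0.0:
        significant = abs(diff) > threshold
    else:
        z = (abs(diff) - threshold) / se
        # Normal approximation to the t critical value (large-sample assumption).
        significant = z > NormalDist().inv_cdf(1.0 - alpha)
    if not significant:
        return 0
    return 1 if diff > 0 else -1
```

In an adaptive sampler, a pixel whose test is inconclusive would receive more path samples, which shrinks the standard error until the decision stabilizes.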
CV · Jul 2, 2021
Deep Mesh Prior: Unsupervised Mesh Restoration using Graph Convolutional Networks
Shota Hattori, Tatsuya Yatagawa, Yutaka Ohtake et al.
This paper addresses mesh restoration problems, i.e., denoising and completion, by learning self-similarity in an unsupervised manner. For this purpose, the proposed method, which we refer to as Deep Mesh Prior, uses a graph convolutional network on meshes to learn the self-similarity. The network takes a single incomplete mesh as input data and directly outputs the reconstructed mesh without being trained on large-scale datasets. Our method does not use any intermediate representations such as an implicit field because the whole process works on a mesh. We demonstrate that our unsupervised method performs as well as or even better than state-of-the-art methods that use large-scale datasets.
CV · Nov 30, 2018
FSNet: An Identity-Aware Generative Model for Image-based Face Swapping
Ryota Natsume, Tatsuya Yatagawa, Shigeo Morishima
This paper presents FSNet, a deep generative model for image-based face swapping. Traditionally, face-swapping methods are based on three-dimensional morphable models (3DMMs), and facial textures are replaced between the estimated three-dimensional (3D) geometries in two images of different individuals. However, the estimation of 3D geometries along with different lighting conditions using 3DMMs is still a difficult task. We herein represent the face region with a latent variable that is assigned by the proposed deep neural network (DNN) instead of facial textures. The proposed DNN synthesizes a face-swapped image using the latent variable of the face region and another image of the non-face region. The proposed method does not require fitting a 3DMM; additionally, it performs face swapping only by feeding two face images to the proposed network. Consequently, our DNN-based face swapping performs better than previous approaches for challenging inputs with different face orientations and lighting conditions. Through several experiments, we demonstrate that the proposed method performs face swapping in a more stable manner than the state-of-the-art method, and that its results are comparable to those of that method.
CV · Apr 10, 2018
RSGAN: Face Swapping and Editing using Face and Hair Representation in Latent Spaces
Ryota Natsume, Tatsuya Yatagawa, Shigeo Morishima
In this paper, we present an integrated system for automatically generating and editing face images through face swapping, attribute-based editing, and random face parts synthesis. The proposed system is based on a deep neural network that variationally learns the face and hair regions with large-scale face image datasets. Different from conventional variational methods, the proposed network represents the latent spaces individually for faces and hair. We refer to the proposed network as the region-separative generative adversarial network (RSGAN). The proposed network independently handles face and hair appearances in the latent spaces; face swapping is then achieved by replacing the latent-space representations of the faces and reconstructing the entire face image from them. This latent-space approach robustly performs face swapping even for images on which previous methods fail due to inappropriate fitting of the 3D morphable models. In addition, the proposed system can further edit face-swapped images with the same network by manipulating visual attributes or by composing them with randomly generated face or hair parts.