Masanari Kimura

h-index43

9papers

50citations

Novelty30%

AI Score24

Ranked #171,130 of 194,257 authors (top 88%)#37,185 in LG (top 93%)

9 Papers

3.8MLJun 22, 2022

Information Geometry of Dropout Training

Masanari Kimura, Hideitsu Hino

Dropout is one of the most popular regularization techniques in neural network training. Because of its power and simplicity of idea, dropout has been analyzed extensively and many variants have been proposed. In this paper, several properties of dropout are discussed in a unified manner from the viewpoint of information geometry. We showed that dropout flattens the model manifold and that their regularization performance depends on the amount of the curvature. Then, we showed that dropout essentially corresponds to a regularization that depends on the Fisher information, and support this result from numerical experiments. Such a theoretical analysis of the technique from a different perspective is expected to greatly assist in the understanding of neural networks, which are still in their infancy.

10.2IRSep 4, 2024

A Fashion Item Recommendation Model in Hyperbolic Space

Ryotaro Shimizu, Yu Wang, Masanari Kimura et al.

In this work, we propose a fashion item recommendation model that incorporates hyperbolic geometry into user and item representations. Using hyperbolic space, our model aims to capture implicit hierarchies among items based on their visual data and users' purchase history. During training, we apply a multi-task learning framework that considers both hyperbolic and Euclidean distances in the loss function. Our experiments on three data sets show that our model performs better than previous models trained in Euclidean space only, confirming the effectiveness of our model. Our ablation studies show that multi-task learning plays a key role, and removing the Euclidean loss substantially deteriorates the model performance.

2.6CVOct 28, 2022

Fashion-Specific Attributes Interpretation via Dual Gaussian Visual-Semantic Embedding

Ryotaro Shimizu, Masanari Kimura, Masayuki Goto

Several techniques to map various types of components, such as words, attributes, and images, into the embedded space have been studied. Most of them estimate the embedded representation of target entity as a point in the projective space. Some models, such as Word2Gauss, assume a probability distribution behind the embedded representation, which enables the spread or variance of the meaning of embedded target components to be captured and considered in more detail. We examine the method of estimating embedded representations as probability distributions for the interpretation of fashion-specific abstract and difficult-to-understand terms. Terms, such as "casual," "adult-casual,'' "beauty-casual," and "formal," are extremely subjective and abstract and are difficult for both experts and non-experts to understand, which discourages users from trying new fashion. We propose an end-to-end model called dual Gaussian visual-semantic embedding, which maps images and attributes in the same projective space and enables the interpretation of the meaning of these terms by its broad applications. We demonstrate the effectiveness of the proposed method through multifaceted experiments involving image and attribute mapping, image retrieval and re-ordering techniques, and a detailed theoretical/analytical discussion of the distance measure included in the loss function.

1.4CVMar 2, 2022

GRASP EARTH: Intuitive Software for Discovering Changes on the Planet

Waku Hatakeyama, Shirou Kawakita, Ryohei Izawa et al.

Detecting changes on the Earth, such as urban development, deforestation, or natural disaster, is one of the research fields that is attracting a great deal of attention. One promising tool to solve these problems is satellite imagery. However, satellite images require huge amount of storage, therefore users are required to set Area of Interests first, which was not suitable for detecting potential areas for disaster or development. To tackle with this problem, we develop the novel tool, namely GRASP EARTH, which is the simple change detection application based on Google Earth Engine. GRASP EARTH allows us to handle satellite imagery easily and it has used for disaster monitoring and urban development monitoring.

5.9MLFeb 25, 2023

Generalization Bounds for Set-to-Set Matching with Negative Sampling

Masanari Kimura

The problem of matching two sets of multiple elements, namely set-to-set matching, has received a great deal of attention in recent years. In particular, it has been reported that good experimental results can be obtained by preparing a neural network as a matching function, especially in complex cases where, for example, each element of the set is an image. However, theoretical analysis of set-to-set matching with such black-box functions is lacking. This paper aims to perform a generalization error analysis in set-to-set matching to reveal the behavior of the model in that task.

17.6LGMar 26, 2024

On permutation-invariant neural networks

Masanari Kimura, Ryotaro Shimizu, Yuki Hirakawa et al.

Conventional machine learning algorithms have traditionally been designed under the assumption that input data follows a vector-based format, with an emphasis on vector-centric paradigms. However, as the demand for tasks involving set-based inputs has grown, there has been a paradigm shift in the research community towards addressing these challenges. In recent years, the emergence of neural network architectures such as Deep Sets and Transformers has presented a significant advancement in the treatment of set-based data. These architectures are specifically engineered to naturally accommodate sets as input, enabling more effective representation and processing of set structures. Consequently, there has been a surge of research endeavors dedicated to exploring and harnessing the capabilities of these architectures for various tasks involving the approximation of set functions. This comprehensive survey aims to provide an overview of the diverse problem settings and ongoing research efforts pertaining to neural networks that approximate set functions. By delving into the intricacies of these approaches and elucidating the associated challenges, the survey aims to equip readers with a comprehensive understanding of the field. Through this comprehensive perspective, we hope that researchers can gain valuable insights into the potential applications, inherent limitations, and future directions of set-based neural networks. Indeed, from this survey we gain two insights: i) Deep Sets and its variants can be generalized by differences in the aggregation function, and ii) the behavior of Deep Sets is sensitive to the choice of the aggregation function. From these observations, we show that Deep Sets, one of the well-known permutation-invariant neural networks, can be generalized in the sense of a quasi-arithmetic mean.

5.5MLMay 1, 2024

Geometric Insights into Focal Loss: Reducing Curvature for Enhanced Model Calibration

Masanari Kimura, Hiroki Naganuma

The key factor in implementing machine learning algorithms in decision-making situations is not only the accuracy of the model but also its confidence level. The confidence level of a model in a classification problem is often given by the output vector of a softmax function for convenience. However, these values are known to deviate significantly from the actual expected model confidence. This problem is called model calibration and has been studied extensively. One of the simplest techniques to tackle this task is focal loss, a generalization of cross-entropy by introducing one positive parameter. Although many related studies exist because of the simplicity of the idea and its formalization, the theoretical analysis of its behavior is still insufficient. In this study, our objective is to understand the behavior of focal loss by reinterpreting this function geometrically. Our analysis suggests that focal loss reduces the curvature of the loss surface in training the model. This indicates that curvature may be one of the essential factors in achieving model calibration. We design numerical experiments to support this conjecture to reveal the behavior of focal loss and the relationship between calibration performance and curvature.

2.6LGMay 23, 2024

Explaining Black-box Model Predictions via Two-level Nested Feature Attributions with Consistency Property

Yuya Yoshikawa, Masanari Kimura, Ryotaro Shimizu et al.

Techniques that explain the predictions of black-box machine learning models are crucial to make the models transparent, thereby increasing trust in AI systems. The input features to the models often have a nested structure that consists of high- and low-level features, and each high-level feature is decomposed into multiple low-level features. For such inputs, both high-level feature attributions (HiFAs) and low-level feature attributions (LoFAs) are important for better understanding the model's decision. In this paper, we propose a model-agnostic local explanation method that effectively exploits the nested structure of the input to estimate the two-level feature attributions simultaneously. A key idea of the proposed method is to introduce the consistency property that should exist between the HiFAs and LoFAs, thereby bridging the separate optimization problems for estimating them. Thanks to this consistency property, the proposed method can produce HiFAs and LoFAs that are both faithful to the black-box models and consistent with each other, using a smaller number of queries to the models. In experiments on image classification in multiple instance learning and text classification using language models, we demonstrate that the HiFAs and LoFAs estimated by the proposed method are accurate, faithful to the behaviors of the black-box models, and provide consistent explanations.

3.1LGAug 30, 2021Code

SHIFT15M: Fashion-specific dataset for set-to-set matching with several distribution shifts

Masanari Kimura, Takuma Nakamura, Yuki Saito

This paper addresses the problem of set-to-set matching, which involves matching two different sets of items based on some criteria, especially in the case of high-dimensional items like images. Although neural networks have been applied to solve this problem, most machine learning-based approaches assume that the training and test data follow the same distribution, which is not always true in real-world scenarios. To address this limitation, we introduce SHIFT15M, a dataset that can be used to evaluate set-to-set matching models when the distribution of data changes between training and testing. We conduct benchmark experiments that demonstrate the performance drop of naive methods due to distribution shift. Additionally, we provide software to handle the SHIFT15M dataset in a simple manner, with the URL for the software to be made available after publication of this manuscript. We believe proposed SHIFT15M dataset provide a valuable resource for evaluating set-to-set matching models under the distribution shift.