Hiroyuki Kasai

LG
h-index3
33papers
471citations
Novelty46%
AI Score41

33 Papers

LGJul 9, 2022
Wasserstein Graph Distance Based on $L_1$-Approximated Tree Edit Distance between Weisfeiler-Lehman Subtrees

Zhongxi Fang, Jianming Huang, Xun Su et al.

The Weisfeiler-Lehman (WL) test is a widely used algorithm in graph machine learning, including graph kernels, graph metrics, and graph neural networks. However, it focuses only on the consistency of the graph, which means that it is unable to detect slight structural differences. Consequently, this limits its ability to capture structural information, which also limits the performance of existing models that rely on the WL test. This limitation is particularly severe for traditional metrics defined by the WL test, which cannot precisely capture slight structural differences. In this paper, we propose a novel graph metric called the Wasserstein WL Subtree (WWLS) distance to address this problem. Our approach leverages the WL subtree as structural information for node neighborhoods and defines node metrics using the $L_1$-approximated tree edit distance ($L_1$-TED) between WL subtrees of nodes. Subsequently, we combine the Wasserstein distance and the $L_1$-TED to define the WWLS distance, which can capture slight structural differences that may be difficult to detect using conventional metrics. We demonstrate that the proposed WWLS distance outperforms baselines in both metric validation and graph classification experiments.

LGMay 27, 2022
Block-coordinate Frank-Wolfe algorithm and convergence analysis for semi-relaxed optimal transport problem

Takumi Fukunaga, Hiroyuki Kasai

The optimal transport (OT) problem has been used widely for machine learning. It is necessary for computation of an OT problem to solve linear programming with tight mass-conservation constraints. These constraints prevent its application to large-scale problems. To address this issue, loosening such constraints enables us to propose the relaxed-OT method using a faster algorithm. This approach has demonstrated its effectiveness for applications. However, it remains slow. As a superior alternative, we propose a fast block-coordinate Frank-Wolfe (BCFW) algorithm for a convex semi-relaxed OT. Specifically, we prove their upper bounds of the worst convergence iterations, and equivalence between the linearization duality gap and the Lagrangian duality gap. Additionally, we develop two fast variants of the proposed BCFW. Numerical experiments have demonstrated that our proposed algorithms are effective for color transfer and surpass state-of-the-art algorithms. This report presents a short version of arXiv:2103.05857.

LGMay 27, 2022
On the Convergence of Semi-Relaxed Sinkhorn with Marginal Constraint and OT Distance Gaps

Takumi Fukunaga, Hiroyuki Kasai

This paper presents consideration of the Semi-Relaxed Sinkhorn (SR-Sinkhorn) algorithm for the semi-relaxed optimal transport (SROT) problem, which relaxes one marginal constraint of the standard OT problem. For evaluation of how the constraint relaxation affects the algorithm behavior and solution, it is vitally necessary to present the theoretical convergence analysis in terms not only of the functional value gap, but also of the marginal constraint gap as well as the OT distance gap. However, no existing work has addressed all analyses simultaneously. To this end, this paper presents a comprehensive convergence analysis for SR-Sinkhorn. After presenting the $ε$-approximation of the functional value gap based on a new proof strategy and exploiting this proof strategy, we give the upper bound of the marginal constraint gap. We also provide its convergence to the $ε$-approximation when two distributions are in the probability simplex. Furthermore, the convergence analysis of the OT distance gap to the $ε$-approximation is given as assisted by the obtained marginal constraint gap. The latter two theoretical results are the first results presented in the literature related to the SROT problem.

OCJul 1, 2023
Safe Screening for Unbalanced Optimal Transport

Xun Su, Zhongxi Fang, Hiroyuki Kasai

This paper introduces a framework that utilizes the Safe Screening technique to accelerate the optimization process of the Unbalanced Optimal Transport (UOT) problem by proactively identifying and eliminating zero elements in the sparse solutions. We demonstrate the feasibility of applying Safe Screening to the UOT problem with $\ell_2$-penalty and KL-penalty by conducting an analysis of the solution's bounds and considering the local strong convexity of the dual problem. Considering the specific structural characteristics of the UOT in comparison to general Lasso problems on the index matrix, we specifically propose a novel approximate projection, an elliptical safe region construction, and a two-hyperplane relaxation method. These enhancements significantly improve the screening efficiency for the UOT's without altering the algorithm's complexity.

LGMar 1, 2021Code
Manifold optimization for non-linear optimal transport problems

Bamdev Mishra, N T V Satyadev, Hiroyuki Kasai et al.

Optimal transport (OT) has recently found widespread interest in machine learning. It allows to define novel distances between probability measures, which have shown promise in several applications. In this work, we discuss how to computationally approach general non-linear OT problems within the framework of Riemannian manifold optimization. The basis of this is the manifold of doubly stochastic matrices (and their generalization). Even though the manifold geometry is not new, surprisingly, its usefulness for solving general non-linear OT problems has not been popular. To this end, we specifically discuss optimization-related ingredients that allow modeling the OT problem on smooth Riemannian manifolds by exploiting the geometry of the search space. We also discuss extensions where we reuse the developed optimization ingredients. We make available the Manifold optimization-based Optimal Transport, or MOT, repository with codes useful in solving OT problems in Python and Matlab. The codes are available at \url{https://github.com/SatyadevNtv/MOT}.

LGOct 24, 2023
Anchor Space Optimal Transport as a Fast Solution to Multiple Optimal Transport Problems

Jianming Huang, Xun Su, Zhongxi Fang et al.

In machine learning, Optimal Transport (OT) theory is extensively utilized to compare probability distributions across various applications, such as graph data represented by node distributions and image data represented by pixel distributions. In practical scenarios, it is often necessary to solve multiple OT problems. Traditionally, these problems are treated independently, with each OT problem being solved sequentially. However, the computational complexity required to solve a single OT problem is already substantial, making the resolution of multiple OT problems even more challenging. Although many applications of fast solutions to OT are based on the premise of a single OT problem with arbitrary distributions, few efforts handle such multiple OT problems with multiple distributions. Therefore, we propose the anchor space optimal transport (ASOT) problem: an approximate OT problem designed for multiple OT problems. This proposal stems from our finding that in many tasks the mass transport tends to be concentrated in a reduced space from the original feature space. By restricting the mass transport to a learned anchor point space, ASOT avoids pairwise instantiations of cost matrices for multiple OT problems and simplifies the problems by canceling insignificant transports. This simplification greatly reduces its computational costs. We then prove the upper bounds of its $1$-Wasserstein distance error between the proposed ASOT and the original OT problem under different conditions. Building upon this accomplishment, we propose three methods to learn anchor spaces for reducing the approximation error. Furthermore, our proposed methods present great advantages for handling distributions of different sizes with GPU parallelization.

LGOct 24, 2025
Noise is All You Need: Solving Linear Inverse Problems by Noise Combination Sampling with Diffusion Models

Xun Su, Hiroyuki Kasai

Pretrained diffusion models have demonstrated strong capabilities in zero-shot inverse problem solving by incorporating observation information into the generation process of the diffusion models. However, this presents an inherent dilemma: excessive integration can disrupt the generative process, while insufficient integration fails to emphasize the constraints imposed by the inverse problem. To address this, we propose \emph{Noise Combination Sampling}, a novel method that synthesizes an optimal noise vector from a noise subspace to approximate the measurement score, replacing the noise term in the standard Denoising Diffusion Probabilistic Models process. This enables conditional information to be naturally embedded into the generation process without reliance on step-wise hyperparameter tuning. Our method can be applied to a wide range of inverse problem solvers, including image compression, and, particularly when the number of generation steps $T$ is small, achieves superior performance with negligible computational overhead, significantly improving robustness and stability.

LGAug 17, 2025
Navigating the Exploration-Exploitation Tradeoff in Inference-Time Scaling of Diffusion Models

Xun Su, Jianming Huang, Yang Yusen et al.

Inference-time scaling has achieved remarkable success in language models, yet its adaptation to diffusion models remains underexplored. We observe that the efficacy of recent Sequential Monte Carlo (SMC)-based methods largely stems from globally fitting the The reward-tilted distribution, which inherently preserves diversity during multi-modal search. However, current applications of SMC to diffusion models face a fundamental dilemma: early-stage noise samples offer high potential for improvement but are difficult to evaluate accurately, whereas late-stage samples can be reliably assessed but are largely irreversible. To address this exploration-exploitation trade-off, we approach the problem from the perspective of the search algorithm and propose two strategies: Funnel Schedule and Adaptive Temperature. These simple yet effective methods are tailored to the unique generation dynamics and phase-transition behavior of diffusion models. By progressively reducing the number of maintained particles and down-weighting the influence of early-stage rewards, our methods significantly enhance sample quality without increasing the total number of Noise Function Evaluations. Experimental results on multiple benchmarks and state-of-the-art text-to-image diffusion models demonstrate that our approach outperforms previous baselines.

LGFeb 3, 2025
Self-supervised Subgraph Neural Network With Deep Reinforcement Walk Exploration

Jianming Huang, Hiroyuki Kasai

Graph data, with its structurally variable nature, represents complex real-world phenomena like chemical compounds, protein structures, and social networks. Traditional Graph Neural Networks (GNNs) primarily utilize the message-passing mechanism, but their expressive power is limited and their prediction lacks explainability. To address these limitations, researchers have focused on graph substructures. Subgraph neural networks (SGNNs) and GNN explainers have emerged as potential solutions, but each has its limitations. SGNNs computes graph representations based on the bags of subgraphs to enhance the expressive power. However, they often rely on predefined algorithm-based sampling strategies, which is inefficient. GNN explainers adopt data-driven approaches to generate important subgraphs to provide explanation. Nevertheless, their explanation is difficult to be translated into practical improvements on GNNs. To overcome these issues, we propose a novel self-supervised framework that integrates SGNNs with the generation approach of GNN explainers, named the Reinforcement Walk Exploration SGNN (RWE-SGNN). Our approach features a sampling model trained in an explainer fashion, optimizing subgraphs to enhance model performance. To achieve a data-driven sampling approach, unlike traditional subgraph generation approaches, we propose a novel walk exploration process, which efficiently extracts important substructures, simplifying the embedding process and avoiding isomorphism problems. Moreover, we prove that our proposed walk exploration process has equivalent generation capability to the traditional subgraph generation process. Experimental results on various graph datasets validate the effectiveness of our proposed method, demonstrating significant improvements in performance and precision.

LGMar 10, 2021
Fast block-coordinate Frank-Wolfe algorithm for semi-relaxed optimal transport

Takumi Fukunaga, Hiroyuki Kasai

Optimal transport (OT), which provides a distance between two probability distributions by considering their spatial locations, has been applied to widely diverse applications. Computing an OT problem requires solution of linear programming with tight mass-conservation constraints. This requirement hinders its application to large-scale problems. To alleviate this issue, the recently proposed relaxed-OT approach uses a faster algorithm by relaxing such constraints. Its effectiveness for practical applications has been demonstrated. Nevertheless, it still exhibits slow convergence. To this end, addressing a convex semi-relaxed OT, we propose a fast block-coordinate Frank-Wolfe (BCFW) algorithm, which gives sparse solutions. Specifically, we provide their upper bounds of the worst convergence iterations, and equivalence between the linearization duality gap and the Lagrangian duality gap. Three fast variants of the proposed BCFW are also proposed. Numerical evaluations in color transfer problem demonstrate that the proposed algorithms outperform state-of-the-art algorithms across different settings.

LGDec 7, 2020
LCS Graph Kernel Based on Wasserstein Distance in Longest Common Subsequence Metric Space

Jianming Huang, Zhongxi Fang, Hiroyuki Kasai

For graph learning tasks, many existing methods utilize a message-passing mechanism where vertex features are updated iteratively by aggregation of neighbor information. This strategy provides an efficient means for graph features extraction, but obtained features after many iterations might contain too much information from other vertices, and tend to be similar to each other. This makes their representations less expressive. Learning graphs using paths, on the other hand, can be less adversely affected by this problem because it does not involve all vertex neighbors. However, most of them can only compare paths with the same length, which might engender information loss. To resolve this difficulty, we propose a new Graph Kernel based on a Longest Common Subsequence (LCS) similarity. Moreover, we found that the widely-used R-convolution framework is unsuitable for path-based Graph Kernel because a huge number of comparisons between dissimilar paths might deteriorate graph distances calculation. Therefore, we propose a novel metric space by exploiting the proposed LCS-based similarity, and compute a new Wasserstein-based graph distance in this metric space, which emphasizes more the comparison between similar paths. Furthermore, to reduce the computational cost, we propose an adjacent point merging operation to sparsify point clouds in the metric space.

LGNov 25, 2020
Wasserstein k-means with sparse simplex projection

Takumi Fukunaga, Hiroyuki Kasai

This paper presents a proposal of a faster Wasserstein $k$-means algorithm for histogram data by reducing Wasserstein distance computations and exploiting sparse simplex projection. We shrink data samples, centroids, and the ground cost matrix, which leads to considerable reduction of the computations used to solve optimal transport problems without loss of clustering quality. Furthermore, we dynamically reduced the computational complexity by removing lower-valued data samples and harnessing sparse simplex projection while keeping the degradation of clustering quality lower. We designate this proposed algorithm as sparse simplex projection based Wasserstein $k$-means, or SSPW $k$-means. Numerical evaluations conducted with comparison to results obtained using Wasserstein $k$-means algorithm demonstrate the effectiveness of the proposed SSPW $k$-means for real-world datasets

LGNov 25, 2020
Consistency-aware and Inconsistency-aware Graph-based Multi-view Clustering

Mitsuhiko Horie, Hiroyuki Kasai

Multi-view data analysis has gained increasing popularity because multi-view data are frequently encountered in machine learning applications. A simple but promising approach for clustering of multi-view data is multi-view clustering (MVC), which has been developed extensively to classify given subjects into some clustered groups by learning latent common features that are shared across multi-view data. Among existing approaches, graph-based multi-view clustering (GMVC) achieves state-of-the-art performance by leveraging a shared graph matrix called the unified matrix. However, existing methods including GMVC do not explicitly address inconsistent parts of input graph matrices. Consequently, they are adversely affected by unacceptable clustering performance. To this end, this paper proposes a new GMVC method that incorporates consistent and inconsistent parts lying across multiple views. This proposal is designated as CI-GMVC. Numerical evaluations of real-world datasets demonstrate the effectiveness of the proposed CI-GMVC.

LGOct 28, 2020
Graph embedding using multi-layer adjacent point merging model

Jianming Huang, Hiroyuki Kasai

For graph classification tasks, many traditional kernel methods focus on measuring the similarity between graphs. These methods have achieved great success on resolving graph isomorphism problems. However, in some classification problems, the graph class depends on not only the topological similarity of the whole graph, but also constituent subgraph patterns. To this end, we propose a novel graph embedding method using a multi-layer adjacent point merging model. This embedding method allows us to extract different subgraph patterns from train-data. Then we present a flexible loss function for feature selection which enhances the robustness of our method for different classification problems. Finally, numerical evaluations demonstrate that our proposed method outperforms many state-of-the-art methods.

OCJun 25, 2019
Riemannian optimization on the simplex of positive definite matrices

Bamdev Mishra, Hiroyuki Kasai, Pratik Jawanpuria

In this work, we generalize the probability simplex constraint to matrices, i.e., $\mathbf{X}_1 + \mathbf{X}_2 + \ldots + \mathbf{X}_K = \mathbf{I}$, where $\mathbf{X}_i \succeq 0$ is a symmetric positive semidefinite matrix of size $n\times n$ for all $i = \{1,\ldots,K \}$. By assuming positive definiteness of the matrices, we show that the constraint set arising from the matrix simplex has the structure of a smooth Riemannian submanifold. We discuss a novel Riemannian geometry for the matrix simplex manifold and show the derivation of first- and second-order optimization related ingredients.

CVFeb 11, 2019
Riemannian joint dimensionality reduction and dictionary learning on symmetric positive definite manifold

Hiroyuki Kasai, Bamdev Mishra

Dictionary leaning (DL) and dimensionality reduction (DR) are powerful tools to analyze high-dimensional noisy signals. This paper presents a proposal of a novel Riemannian joint dimensionality reduction and dictionary learning (R-JDRDL) on symmetric positive definite (SPD) manifolds for classification tasks. The joint learning considers the interaction between dimensionality reduction and dictionary learning procedures by connecting them into a unified framework. We exploit a Riemannian optimization framework for solving DL and DR problems jointly. Finally, we demonstrate that the proposed R-JDRDL outperforms existing state-of-the-arts algorithms when used for image classification tasks.

LGFeb 4, 2019
Riemannian adaptive stochastic gradient algorithms on matrix manifolds

Hiroyuki Kasai, Pratik Jawanpuria, Bamdev Mishra

Adaptive stochastic gradient algorithms in the Euclidean space have attracted much attention lately. Such explorations on Riemannian manifolds, on the other hand, are relatively new, limited, and challenging. This is because of the intrinsic non-linear structure of the underlying manifold and the absence of a canonical coordinate system. In machine learning applications, however, most manifolds of interest are represented as matrices with notions of row and column subspaces. In addition, the implicit manifold-related constraints may also lie on such subspaces. For example, the Grassmann manifold is the set of column subspaces. To this end, such a rich structure should not be lost by transforming matrices to just a stack of vectors while developing optimization algorithms on manifolds. We propose novel stochastic gradient algorithms for problems on Riemannian matrix manifolds by adapting the row and column subspaces of gradients. Our algorithms are provably convergent and they achieve the convergence rate of order $\mathcal{O}(\log (T)/\sqrt{T})$, where $T$ is the number of iterations. Our experiments illustrate the efficacy of the proposed algorithms on several applications.

MLOct 3, 2018
McTorch, a manifold optimization library for deep learning

Mayank Meghwanshi, Pratik Jawanpuria, Anoop Kunchukuttan et al.

In this paper, we introduce McTorch, a manifold optimization library for deep learning that extends PyTorch. It aims to lower the barrier for users wishing to use manifold constraints in deep learning applications, i.e., when the parameters are constrained to lie on a manifold. Such constraints include the popular orthogonality and rank constraints, and have been recently used in a number of applications in deep learning. McTorch follows PyTorch's architecture and decouples manifold definitions and optimizers, i.e., once a new manifold is added it can be used with any existing optimizer and vice-versa. McTorch is available at https://github.com/mctorch .

LGJun 14, 2018
Low-rank geometric mean metric learning

Mukul Bhutani, Pratik Jawanpuria, Hiroyuki Kasai et al.

We propose a low-rank approach to learning a Mahalanobis metric from data. Inspired by the recent geometric mean metric learning (GMML) algorithm, we propose a low-rank variant of the algorithm. This allows to jointly learn a low-dimensional subspace where the data reside and the Mahalanobis metric that appropriately fits the data. Our results show that we compete effectively with GMML at lower ranks.

NAOct 30, 2017
Stochastic variance reduced multiplicative update for nonnegative matrix factorization

Hiroyuki Kasai

Nonnegative matrix factorization (NMF), a dimensionality reduction and factor analysis method, is a special case in which factor matrices have low-rank nonnegative constraints. Considering the stochastic learning in NMF, we specifically address the multiplicative update (MU) rule, which is the most popular, but which has slow convergence property. This present paper introduces on the stochastic MU rule a variance-reduced technique of stochastic gradient. Numerical comparisons suggest that our proposed algorithms robustly outperform state-of-the-art algorithms across different synthetic and real-world datasets.

MSOct 27, 2017
SGDLibrary: A MATLAB library for stochastic gradient descent algorithms

Hiroyuki Kasai

We consider the problem of finding the minimizer of a function $f: \mathbb{R}^d \rightarrow \mathbb{R}$ of the finite-sum form $\min f(w) = 1/n\sum_{i}^n f_i(w)$. This problem has been studied intensively in recent years in the field of machine learning (ML). One promising approach for large-scale data is to use a stochastic optimization algorithm to solve the problem. SGDLibrary is a readable, flexible and extensible pure-MATLAB library of a collection of stochastic optimization algorithms. The purpose of the library is to provide researchers and implementers a comprehensive evaluation environment for the use of these algorithms on various ML problems.

NASep 29, 2017
Fast online low-rank tensor subspace tracking by CP decomposition using recursive least squares from incomplete observations

Hiroyuki Kasai

We consider the problem of online subspace tracking of a partially observed high-dimensional data stream corrupted by noise, where we assume that the data lie in a low-dimensional linear subspace. This problem is cast as an online low-rank tensor completion problem. We propose a novel online tensor subspace tracking algorithm based on the CANDECOMP/PARAFAC (CP) decomposition, dubbed OnLine Low-rank Subspace tracking by TEnsor CP Decomposition (OLSTEC). The proposed algorithm especially addresses the case in which the subspace of interest is dynamically time-varying. To this end, we build up our proposed algorithm exploiting the recursive least squares (RLS), which is the second-order gradient algorithm. Numerical evaluations on synthetic datasets and real-world datasets such as communication network traffic, environmental data, and surveillance videos, show that the proposed OLSTEC algorithm outperforms state-of-the-art online algorithms in terms of the convergence rate per iteration.

LGMay 1, 2017
A Riemannian gossip approach to subspace learning on Grassmann manifold

Bamdev Mishra, Hiroyuki Kasai, Pratik Jawanpuria et al.

In this paper, we focus on subspace learning problems on the Grassmann manifold. Interesting applications in this setting include low-rank matrix completion and low-dimensional multivariate regression, among others. Motivated by privacy concerns, we aim to solve such problems in a decentralized setting where multiple agents have access to (and solve) only a part of the whole optimization problem. The agents communicate with each other to arrive at a consensus, i.e., agree on a common quantity, via the gossip protocol. We propose a novel cost function for subspace learning on the Grassmann manifold, which is a weighted sum of several sub-problems (each solved by an agent) and the communication cost among the agents. The cost function has a finite sum structure. In the proposed modeling approach, different agents learn individual local subspace but they achieve asymptotic consensus on the global learned subspace. The approach is scalable and parallelizable. Numerical experiments show the efficacy of the proposed decentralized algorithms on various matrix completion and multivariate regression benchmarks.

LGMar 15, 2017
Riemannian stochastic quasi-Newton algorithm with variance reduction and its convergence analysis

Hiroyuki Kasai, Hiroyuki Sato, Bamdev Mishra

Stochastic variance reduction algorithms have recently become popular for minimizing the average of a large, but finite number of loss functions. The present paper proposes a Riemannian stochastic quasi-Newton algorithm with variance reduction (R-SQN-VR). The key challenges of averaging, adding, and subtracting multiple gradients are addressed with notions of retraction and vector transport. We present convergence analyses of R-SQN-VR on both non-convex and retraction-convex functions under retraction and vector transport operators. The proposed algorithm is evaluated on the Karcher mean computation on the symmetric positive-definite manifold and the low-rank matrix completion on the Grassmann manifold. In all cases, the proposed algorithm outperforms the state-of-the-art Riemannian batch and stochastic gradient algorithms.

LGFeb 18, 2017
Riemannian stochastic variance reduced gradient algorithm with retraction and vector transport

Hiroyuki Sato, Hiroyuki Kasai, Bamdev Mishra

In recent years, stochastic variance reduction algorithms have attracted considerable attention for minimizing the average of a large but finite number of loss functions. This paper proposes a novel Riemannian extension of the Euclidean stochastic variance reduced gradient (R-SVRG) algorithm to a manifold search space. The key challenges of averaging, adding, and subtracting multiple gradients are addressed with retraction and vector transport. For the proposed algorithm, we present a global convergence analysis with a decaying step size as well as a local convergence rate analysis with a fixed step size under some natural assumptions. In addition, the proposed algorithm is applied to the computation problem of the Riemannian centroid on the symmetric positive definite (SPD) manifold as well as the principal component analysis and low-rank matrix completion problems on the Grassmann manifold. The results show that the proposed algorithm outperforms the standard Riemannian stochastic gradient descent algorithm in each case.

AIAug 24, 2016
State Duration and Interval Modeling in Hidden Semi-Markov Model for Sequential Data Analysis

Hiromi Narimatsu, Hiroyuki Kasai

Sequential data modeling and analysis have become indispensable tools for analyzing sequential data, such as time-series data, because larger amounts of sensed event data have become available. These methods capture the sequential structure of data of interest, such as input-output relations and correlation among datasets. However, because most studies in this area are specialized or limited to their respective applications, rigorous requirement analysis of such models has not been undertaken from a general perspective. Therefore, we particularly examine the structure of sequential data, and extract the necessity of `state duration' and `state interval' of events for efficient and rich representation of sequential data. Specifically addressing the hidden semi-Markov model (HSMM) that represents such state duration inside a model, we attempt to add representational capability of a state interval of events onto HSMM. To this end, we propose two extended models: an interval state hidden semi-Markov model (IS-HSMM) to express the length of a state interval with a special state node designated as "interval state node"; and an interval length probability hidden semi-Markov model (ILP-HSMM) which represents the length of the state interval with a new probabilistic parameter "interval length probability." Exhaustive simulations have revealed superior performance of the proposed models in comparison with HSMM. These proposed models are the first reported extensions of HMM to support state interval representation as well as state duration representation.

NIAug 19, 2016
Network Volume Anomaly Detection and Identification in Large-scale Networks based on Online Time-structured Traffic Tensor Tracking

Hiroyuki Kasai, Wolfgang Kellerer, Martin Kleinsteuber

This paper addresses network anomography, that is, the problem of inferring network-level anomalies from indirect link measurements. This problem is cast as a low-rank subspace tracking problem for normal flows under incomplete observations, and an outlier detection problem for abnormal flows. Since traffic data is large-scale time-structured data accompanied with noise and outliers under partial observations, an efficient modeling method is essential. To this end, this paper proposes an online subspace tracking of a Hankelized time-structured traffic tensor for normal flows based on the Candecomp/PARAFAC decomposition exploiting the recursive least squares (RLS) algorithm. We estimate abnormal flows as outlier sparse flows via sparsity maximization in the underlying under-constrained linear-inverse problem. A major advantage is that our algorithm estimates normal flows by low-dimensional matrices with time-directional features as well as the spatial correlation of multiple links without using the past observed measurements and the past model parameters. Extensive numerical evaluations show that the proposed algorithm achieves faster convergence per iteration of model approximation, and better volume anomaly detection performance compared to state-of-the-art algorithms.

LGMay 26, 2016
Low-rank tensor completion: a Riemannian manifold preconditioning approach

Hiroyuki Kasai, Bamdev Mishra

We propose a novel Riemannian manifold preconditioning approach for the tensor completion problem with rank constraint. A novel Riemannian metric or inner product is proposed that exploits the least-squares structure of the cost function and takes into account the structured symmetry that exists in Tucker decomposition. The specific metric allows to use the versatile framework of Riemannian optimization on quotient manifolds to develop preconditioned nonlinear conjugate gradient and stochastic gradient descent algorithms for batch and online setups, respectively. Concrete matrix representations of various optimization-related ingredients are listed. Numerical comparisons suggest that our proposed algorithms robustly outperform state-of-the-art algorithms across different synthetic and real-world datasets.

LGMay 24, 2016
Riemannian stochastic variance reduced gradient on Grassmann manifold

Hiroyuki Kasai, Hiroyuki Sato, Bamdev Mishra

Stochastic variance reduction algorithms have recently become popular for minimizing the average of a large, but finite, number of loss functions. In this paper, we propose a novel Riemannian extension of the Euclidean stochastic variance reduced gradient algorithm (R-SVRG) to a compact manifold search space. To this end, we show the developments on the Grassmann manifold. The key challenges of averaging, addition, and subtraction of multiple gradients are addressed with notions like logarithm mapping and parallel translation of vectors on the Grassmann manifold. We present a global convergence analysis of the proposed algorithm with decay step-sizes and a local convergence rate analysis under fixed step-size with some natural assumptions. The proposed algorithm is applied on a number of problems on the Grassmann manifold like principal components analysis, low-rank matrix completion, and the Karcher mean computation. In all these cases, the proposed algorithm outperforms the standard Riemannian stochastic gradient descent algorithm.

NAAug 19, 2016
Online Low-Rank Tensor Subspace Tracking from Incomplete Data by CP Decomposition using Recursive Least Squares

Hiroyuki Kasai

We propose an online tensor subspace tracking algorithm based on the CP decomposition exploiting the recursive least squares (RLS), dubbed OnLine Low-rank Subspace tracking by TEnsor CP Decomposition (OLSTEC). Numerical evaluations show that the proposed OLSTEC algorithm gives faster convergence per iteration comparing with the state-of-the-art online algorithms.

NAMay 23, 2016
A Riemannian gossip approach to decentralized matrix completion

Bamdev Mishra, Hiroyuki Kasai, Atul Saroop

In this paper, we propose novel gossip algorithms for the low-rank decentralized matrix completion problem. The proposed approach is on the Riemannian Grassmann manifold that allows local matrix completion by different agents while achieving asymptotic consensus on the global low-rank factors. The resulting approach is scalable and parallelizable. Our numerical experiments show the good performance of the proposed algorithms on various benchmarks.

AIAug 20, 2015
Duration and Interval Hidden Markov Model for Sequential Data Analysis

Hiromi Narimatsu, Hiroyuki Kasai

Analysis of sequential event data has been recognized as one of the essential tools in data modeling and analysis field. In this paper, after the examination of its technical requirements and issues to model complex but practical situation, we propose a new sequential data model, dubbed Duration and Interval Hidden Markov Model (DI-HMM), that efficiently represents "state duration" and "state interval" of data events. This has significant implications to play an important role in representing practical time-series sequential data. This eventually provides an efficient and flexible sequential data retrieval. Numerical experiments on synthetic and real data demonstrate the efficiency and accuracy of the proposed DI-HMM.

NAJun 6, 2015
Riemannian preconditioning for tensor completion

Hiroyuki Kasai, Bamdev Mishra

We propose a novel Riemannian preconditioning approach for the tensor completion problem with rank constraint. A Riemannian metric or inner product is proposed that exploits the least-squares structure of the cost function and takes into account the structured symmetry in Tucker decomposition. The specific metric allows to use the versatile framework of Riemannian optimization on quotient manifolds to develop a preconditioned nonlinear conjugate gradient algorithm for the problem. To this end, concrete matrix representations of various optimization-related ingredients are listed. Numerical comparisons suggest that our proposed algorithm robustly outperforms state-of-the-art algorithms across different problem instances encompassing various synthetic and real-world datasets.