LG Apr 15, 2022
Universal approximation property of invertible neural networks
Isao Ishikawa, Takeshi Teshima, Koichi Tojo et al.
Invertible neural networks (INNs) are neural network architectures with invertibility by design. Thanks to their invertibility and the tractability of Jacobian, INNs have various machine learning applications such as probabilistic modeling, generative modeling, and representation learning. However, their attractive properties often come at the cost of restricting the layer designs, which poses a question on their representation power: can we use these models to approximate sufficiently diverse functions? To answer this question, we have developed a general theoretical framework to investigate the representation power of INNs, building on a structure theorem of differential geometry. The framework simplifies the approximation problem of diffeomorphisms, which enables us to show the universal approximation properties of INNs. We apply the framework to two representative classes of INNs, namely Coupling-Flow-based INNs (CF-INNs) and Neural Ordinary Differential Equations (NODEs), and elucidate their high representation power despite the restrictions on their architectures.
LG Mar 3, 2022
Fully-Connected Network on Noncompact Symmetric Space and Ridgelet Transform based on Helgason-Fourier Analysis
Sho Sonoda, Isao Ishikawa, Masahiro Ikeda
Neural network on Riemannian symmetric space such as hyperbolic space and the manifold of symmetric positive definite (SPD) matrices is an emerging subject of research in geometric deep learning. Based on the well-established framework of the Helgason-Fourier transform on the noncompact symmetric space, we present a fully-connected network and its associated ridgelet transform on the noncompact symmetric space, covering the hyperbolic neural network (HNN) and the SPDNet as special cases. The ridgelet transform is an analysis operator of a depth-2 continuous network spanned by neurons, namely, it maps an arbitrary given function to the weights of a network. Thanks to the coordinate-free reformulation, the role of nonlinear activation functions is revealed to be a wavelet function, and the reconstruction formula directly yields the universality of the proposed networks.
ML Oct 21, 2022
Learning in RKHM: a $C^*$-Algebraic Twist for Kernel Machines
Yuka Hashimoto, Masahiro Ikeda, Hachem Kadri
Supervised learning in reproducing kernel Hilbert space (RKHS) and vector-valued RKHS (vvRKHS) has been investigated for more than 30 years. In this paper, we provide a new twist to this rich literature by generalizing supervised learning in RKHS and vvRKHS to reproducing kernel Hilbert $C^*$-module (RKHM), and show how to construct effective positive-definite kernels by considering the perspective of $C^*$-algebra. Unlike the cases of RKHS and vvRKHS, we can use $C^*$-algebras to enlarge representation spaces. This enables us to construct RKHMs whose representation power goes beyond RKHSs, vvRKHSs, and existing methods such as convolutional neural networks. Our framework is suitable, for example, for effectively analyzing image data by allowing the interaction of Fourier components.
LG May 30, 2022
Universality of Group Convolutional Neural Networks Based on Ridgelet Analysis on Groups
Sho Sonoda, Isao Ishikawa, Masahiro Ikeda
We show the universality of depth-2 group convolutional neural networks (GCNNs) in a unified and constructive manner based on the ridgelet theory. Despite widespread use in applications, the approximation property of (G)CNNs has not been well investigated. The universality of (G)CNNs has been shown since the late 2010s. Yet, our understanding of how (G)CNNs represent functions is incomplete because the past universality theorems have been shown in a case-by-case manner by manually/carefully assigning the network parameters depending on the variety of convolution layers, and in an indirect manner by converting/modifying the (G)CNNs into other universal approximators such as invariant polynomials and fully-connected networks. In this study, we formulate a versatile depth-2 continuous GCNN $S[γ]$ as a nonlinear mapping between group representations, and directly obtain an analysis operator, called the ridgelet transform, that maps a given function $f$ to the network parameter $γ$ so that $S[γ]=f$. The proposed GCNN covers typical GCNNs such as the cyclic convolution on multi-channel images, networks on permutation-invariant inputs (Deep Sets), and $\mathrm{E}(n)$-equivariant networks. The closed-form expression of the ridgelet transform can describe how the network parameters are organized to represent a function. While it has been known only for fully-connected networks, this study is the first to obtain the ridgelet transform for GCNNs. By discretizing the closed-form expression, we can systematically generate a constructive proof of the $cc$-universality of finite GCNNs. In other words, our universality proofs are more unified and constructive than previous proofs.
LG Oct 5, 2023
Joint Group Invariant Functions on Data-Parameter Domain Induce Universal Neural Networks
Sho Sonoda, Hideyuki Ishi, Isao Ishikawa et al.
The symmetry and geometry of input data are considered to be encoded in the internal data representation inside the neural network, but the specific encoding rule has been less investigated. In this study, we present a systematic method to induce a generalized neural network and its right inverse operator, called the ridgelet transform, from a joint group invariant function on the data-parameter domain. Since the ridgelet transform is an inverse, (1) it can describe the arrangement of parameters for the network to represent a target function, which is understood as the encoding rule, and (2) it implies the universality of the network. Based on the group representation theory, we present a new simple proof of the universality by using Schur's lemma in a unified manner covering a wide class of networks, for example, the original ridgelet transform, formal deep networks, and the dual voice transform. Since traditional universality theorems were demonstrated based on functional analysis, this study sheds light on the group theoretic aspect of the approximation theory, connecting geometric deep learning to abstract harmonic analysis.
DM Jun 2, 2022
Dynamic Structure Estimation from Bandit Feedback using Nonvanishing Exponential Sums
Motoya Ohnishi, Isao Ishikawa, Yuko Kuroki et al.
This work tackles the dynamic structure estimation problems for periodically behaved discrete dynamical systems in Euclidean space. We assume the observations become sequentially available in the form of bandit feedback contaminated by sub-Gaussian noise. Under such fairly general assumptions on the noise distribution, we carefully identify a set of recoverable information of periodic structures. Our main results are the (computation and sample) efficient algorithms that exploit asymptotic behaviors of exponential sums to effectively average out the noise effect while preventing the information to be estimated from vanishing. In particular, the novel use of the Weyl sum, a variant of exponential sums, allows us to extract spectrum information for linear systems. We provide sample complexity bounds for our algorithms, and we experimentally validate our theoretical claims on simulations of toy examples, including Cellular Automata.
LG Oct 5, 2023
Deep Ridgelet Transform: Voice with Koopman Operator Proves Universality of Formal Deep Networks
Sho Sonoda, Yuka Hashimoto, Isao Ishikawa et al.
We identify hidden layers inside a deep neural network (DNN) with group actions on the data domain, and formulate a formal deep network as a dual voice transform with respect to the Koopman operator, a linear representation of the group action. Based on the group theoretic arguments, particularly by using Schur's lemma, we show a simple proof of the universality of DNNs.
DS Mar 4, 2024
Koopman operators with intrinsic observables in rigged reproducing kernel Hilbert spaces
Isao Ishikawa, Yuka Hashimoto, Masahiro Ikeda et al.
This paper presents a novel approach for estimating the Koopman operator defined on a reproducing kernel Hilbert space (RKHS) and its spectra. We propose an estimation method, what we call Jet Dynamic Mode Decomposition (JetDMD), leveraging the intrinsic structure of RKHS and the geometric notion known as jets to enhance the estimation of the Koopman operator. This method refines the traditional Extended Dynamic Mode Decomposition (EDMD) in accuracy, especially in the numerical estimation of eigenvalues. This paper proves JetDMD's superiority through explicit error bounds and convergence rate for special positive definite kernels, offering a solid theoretical foundation for its performance. We also delve into the spectral analysis of the Koopman operator, proposing the notion of extended Koopman operator within a framework of rigged Hilbert space. This notion leads to a deeper understanding of estimated Koopman eigenfunctions and capturing them outside the original function space. Through the theory of rigged Hilbert space, our study provides a principled methodology to analyze the estimated spectrum and eigenfunctions of Koopman operators, and enables eigendecomposition within a rigged RKHS. We also propose a new effective method for reconstructing the dynamical system from temporally-sampled trajectory data of the dynamical system with solid theoretical guarantee. We conduct several numerical simulations using the van der Pol oscillator, the Duffing oscillator, the Hénon map, and the Lorenz attractor, and illustrate the performance of JetDMD with clear numerical computations of eigenvalues and accurate predictions of the dynamical systems.
LG Feb 25, 2024
A unified Fourier slice method to derive ridgelet transform for a variety of depth-2 neural networks
Sho Sonoda, Isao Ishikawa, Masahiro Ikeda
To investigate neural network parameters, it is easier to study the distribution of parameters than to study the parameters in each neuron. The ridgelet transform is a pseudo-inverse operator that maps a given function $f$ to the parameter distribution $γ$ so that a network $\mathtt{NN}[γ]$ reproduces $f$, i.e. $\mathtt{NN}[γ]=f$. For depth-2 fully-connected networks on a Euclidean space, the ridgelet transform has been discovered up to the closed-form expression, thus we could describe how the parameters are distributed. However, for a variety of modern neural network architectures, the closed-form expression has not been known. In this paper, we explain a systematic method using Fourier expressions to derive ridgelet transforms for a variety of modern networks such as networks on finite fields $\mathbb{F}_p$, group convolutional networks on abstract Hilbert space $\mathcal{H}$, fully-connected networks on noncompact symmetric spaces $G/K$, and pooling layers, or the $d$-plane ridgelet transform.
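For orientation, the classical Euclidean case mentioned in this abstract can be written out as follows. This is a sketch under assumed notation that does not appear in the listing: $\sigma$ is the activation function, $\rho$ is an auxiliary ridgelet function, $R[f;\rho]$ is the ridgelet transform, and $c_{\sigma,\rho}$ stands for a scalar that depends only on $\sigma$ and $\rho$ (its exact form depends on normalization conventions and is left unspecified here).

```latex
% Depth-2 fully-connected network with parameter distribution \gamma on R^m x R
\mathtt{NN}[\gamma](x) = \int_{\mathbb{R}^m \times \mathbb{R}} \gamma(a,b)\, \sigma(a \cdot x - b)\, \mathrm{d}a\, \mathrm{d}b

% Ridgelet transform of f with respect to \rho
R[f;\rho](a,b) = \int_{\mathbb{R}^m} f(x)\, \overline{\rho(a \cdot x - b)}\, \mathrm{d}x

% Reconstruction: choosing \gamma = R[f;\rho] reproduces f up to a scalar
\mathtt{NN}\big[ R[f;\rho] \big] = c_{\sigma,\rho}\, f
```

When $c_{\sigma,\rho}$ is finite and nonzero, rescaling $\rho$ gives $\mathtt{NN}[\gamma]=f$ exactly, which is the sense in which the ridgelet transform acts as a pseudo-inverse.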
LG Feb 4, 2024
$C^*$-Algebraic Machine Learning: Moving in a New Direction
Yuka Hashimoto, Masahiro Ikeda, Hachem Kadri
Machine learning has a long collaborative tradition with several fields of mathematics, such as statistics, probability and linear algebra. We propose a new direction for machine learning research: $C^*$-algebraic ML $-$ a cross-fertilization between $C^*$-algebra and machine learning. The mathematical concept of $C^*$-algebra is a natural generalization of the space of complex numbers. It enables us to unify existing learning strategies, and construct a new framework for more diverse and information-rich data models. We explain why and how to use $C^*$-algebras in machine learning, and provide technical considerations that go into the design of $C^*$-algebraic learning models in the contexts of kernel methods and neural networks. Furthermore, we discuss open questions and challenges in $C^*$-algebraic ML and give our thoughts for future development and applications.
LG May 22, 2024
Deep Ridgelet Transform and Unified Universality Theorem for Deep and Shallow Joint-Group-Equivariant Machines
Sho Sonoda, Yuka Hashimoto, Isao Ishikawa et al.
We present a constructive universal approximation theorem for learning machines equipped with joint-group-equivariant feature maps, called the joint-equivariant machines, based on the group representation theory. ``Constructive'' here indicates that the distribution of parameters is given in a closed-form expression known as the ridgelet transform. Joint-group-equivariance encompasses a broad class of feature maps that generalize classical group-equivariance. Particularly, fully-connected networks are not group-equivariant but are joint-group-equivariant. Our main theorem also unifies the universal approximation theorems for both shallow and deep networks. Until this study, the universality of deep networks has been shown in a different manner from the universality of shallow networks, but our results discuss them on common ground. Now we can understand the approximation schemes of various learning machines in a unified manner. As applications, we show the constructive universal approximation properties of four examples: depth-$n$ joint-equivariant machine, depth-$n$ fully-connected network, depth-$n$ group-convolutional network, and a new depth-$2$ network with quadratic forms whose universality has not been known.
LG Sep 26, 2025
Why High-rank Neural Networks Generalize?: An Algebraic Framework with RKHSs
Yuka Hashimoto, Sho Sonoda, Isao Ishikawa et al.
We derive a new Rademacher complexity bound for deep neural networks using Koopman operators, group representations, and reproducing kernel Hilbert spaces (RKHSs). The proposed bound describes why the models with high-rank weight matrices generalize well. Although there are existing bounds that attempt to describe this phenomenon, these existing bounds can be applied to limited types of models. We introduce an algebraic representation of neural networks and a kernel function to construct an RKHS to derive a bound for a wider range of realistic models. This work paves the way for the Koopman-based theory for Rademacher complexity bounds to be valid for more practical situations.
LG May 21, 2025
Why and When Deep is Better than Shallow: An Implementation-Agnostic State-Transition View of Depth Supremacy
Sho Sonoda, Yuka Hashimoto, Isao Ishikawa et al.
Why and when is deep better than shallow? We answer this question in a framework that is agnostic to network implementation. We formulate a deep model as an abstract state-transition semigroup acting on a general metric space, and separate the implementation (e.g., ReLU nets, transformers, and chain-of-thought) from the abstract state transition. We prove a bias-variance decomposition in which the variance depends only on the abstract depth-$k$ network and not on the implementation (Theorem 1). We further split the bounds into output and hidden parts to tie the depth dependence of the variance to the metric entropy of the state-transition semigroup (Theorem 2). We then investigate implementation-free conditions under which the variance grows polynomially or logarithmically with depth (Section 4). Combining these with exponential or polynomial bias decay identifies four canonical bias-variance trade-off regimes (EL/EP/PL/PP) and produces explicit optimal depths $k^\ast$. Across regimes, $k^\ast>1$ typically holds, giving a rigorous form of depth supremacy. The lowest generalization error bound is achieved under the EL regime (exp-decay bias + log-growth variance), explaining why and when deep is better, especially for iterative or hierarchical concept classes such as neural ODEs, diffusion/score-matching models, and chain-of-thought reasoning.
ML May 23, 2023
Deep Learning with Kernels through RKHM and the Perron-Frobenius Operator
Yuka Hashimoto, Masahiro Ikeda, Hachem Kadri
Reproducing kernel Hilbert $C^*$-module (RKHM) is a generalization of reproducing kernel Hilbert space (RKHS) by means of $C^*$-algebra, and the Perron-Frobenius operator is a linear operator related to the composition of functions. Combining these two concepts, we present deep RKHM, a deep learning framework for kernel methods. We derive a new Rademacher generalization bound in this setting and provide a theoretical interpretation of benign overfitting by means of Perron-Frobenius operators. By virtue of $C^*$-algebra, the dependency of the bound on output dimension is milder than existing bounds. We show that $C^*$-algebra is a suitable tool for deep learning with kernels, enabling us to take advantage of the product structure of operators and to provide a clear connection with convolutional neural networks. Our theoretical analysis provides a new lens through which one can design and analyze deep kernel methods.
LG Jun 30, 2021
Koopman Spectrum Nonlinear Regulators and Efficient Online Learning
Motoya Ohnishi, Isao Ishikawa, Kendall Lowrey et al.
Most modern reinforcement learning algorithms optimize a cumulative single-step cost along a trajectory. The optimized motions are often 'unnatural', representing, for example, behaviors with sudden accelerations that waste energy and lack predictability. In this work, we present a novel paradigm of controlling nonlinear systems via the minimization of the Koopman spectrum cost: a cost over the Koopman operator of the controlled dynamics. This induces a broader class of dynamical behaviors that evolve over stable manifolds such as nonlinear oscillators, closed loops, and smooth movements. We demonstrate that some dynamics characterizations that are not possible with a cumulative cost are feasible in this paradigm, which generalizes the classical eigenstructure and pole assignments to nonlinear decision making. Moreover, we present a sample efficient online learning algorithm for our problem that enjoys a sub-linear regret bound under some structural assumptions.
LG Jun 9, 2021
Ghosts in Neural Networks: Existence, Structure and Role of Infinite-Dimensional Null Space
Sho Sonoda, Isao Ishikawa, Masahiro Ikeda
Overparametrization has been remarkably successful for deep learning studies. This study investigates an overlooked but important aspect of overparametrized neural networks, that is, the null components in the parameters of neural networks, or the ghosts. Since deep learning is not explicitly regularized, typical deep learning solutions contain null components. In this paper, we present a structure theorem of the null space for a general class of neural networks. Specifically, we show that any null element can be uniquely written by the linear combination of ridgelet transforms. In general, it is quite difficult to fully characterize the null space of an arbitrarily given operator. Therefore, the structure theorem is a great advantage for understanding a complicated landscape of neural network parameters. As applications, we discuss the roles of ghosts on the generalization performance of deep learning.
ML Jan 27, 2021
Reproducing kernel Hilbert C*-module and kernel mean embeddings
Yuka Hashimoto, Isao Ishikawa, Masahiro Ikeda et al.
Kernel methods have been among the most popular techniques in machine learning, where learning tasks are solved using the property of reproducing kernel Hilbert space (RKHS). In this paper, we propose a novel data analysis framework with reproducing kernel Hilbert $C^*$-module (RKHM) and kernel mean embedding (KME) in RKHM. Since RKHM contains richer information than RKHS or vector-valued RKHS (vvRKHS), analysis with RKHM enables us to capture and extract structural properties in data such as functional data. We present a body of theory for applying RKHM to data analysis, including the representer theorem, and the injectivity and universality of the proposed KME. We also show RKHM generalizes RKHS and vvRKHS. Then, we provide concrete procedures for employing RKHM and the proposed KME in data analysis.
LG Dec 4, 2020
Universal Approximation Property of Neural Ordinary Differential Equations
Takeshi Teshima, Koichi Tojo, Masahiro Ikeda et al.
Neural ordinary differential equations (NODEs) is an invertible neural network architecture promising for its free-form Jacobian and the availability of a tractable Jacobian determinant estimator. Recently, the representation power of NODEs has been partly uncovered: they form an $L^p$-universal approximator for continuous maps under certain conditions. However, the $L^p$-universality may fail to guarantee an approximation for the entire input domain as it may still hold even if the approximator largely differs from the target function on a small region of the input space. To further uncover the potential of NODEs, we show their stronger approximation property, namely the $\sup$-universality for approximating a large class of diffeomorphisms. It is shown by leveraging a structure theorem of the diffeomorphism group, and the result complements the existing literature by establishing a fairly large set of mappings that NODEs can approximate with a stronger guarantee.
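The gap between $L^p$- and $\sup$-universality described in this abstract can be seen in a one-dimensional toy calculation. The tent functions $h_n$ below are an illustrative assumption, not taken from the paper; they play the role of the error between an approximator and its target on $[-1,1]$.

```latex
% Error concentrated on a shrinking region around 0
h_n(x) = \max\bigl(0,\; 1 - n|x|\bigr), \qquad x \in [-1,1],\; n \ge 1

% The L^p error vanishes as n grows (1 \le p < \infty)
\|h_n\|_{L^p}^p = 2\int_0^{1/n} (1 - nx)^p \,\mathrm{d}x = \frac{2}{n(p+1)} \;\longrightarrow\; 0

% The sup error does not shrink at all
\|h_n\|_{\infty} = h_n(0) = 1 \quad \text{for every } n
```

So a sequence of approximators can converge in $L^p$ while staying a fixed distance away from the target at some point, which is the failure mode that a $\sup$-norm guarantee rules out.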
LG Nov 20, 2020
A global universality of two-layer neural networks with ReLU activations
Naoya Hatano, Masahiro Ikeda, Isao Ishikawa et al.
In the present study, we investigate the universality of neural networks, which concerns the density of the set of two-layer neural networks in a function space. There are many works that handle the convergence over compact sets. In the present paper, we consider a global convergence by introducing a suitable norm, so that our results will be uniform over any compact set.
ML Jul 29, 2020
Kernel Mean Embeddings of Von Neumann-Algebra-Valued Measures
Yuka Hashimoto, Isao Ishikawa, Masahiro Ikeda et al.
Kernel mean embedding (KME) is a powerful tool to analyze probability measures for data, where the measures are conventionally embedded into a reproducing kernel Hilbert space (RKHS). In this paper, we generalize KME to that of von Neumann-algebra-valued measures into reproducing kernel Hilbert modules (RKHMs), which provides an inner product and distance between von Neumann-algebra-valued measures. Von Neumann-algebra-valued measures can, for example, encode relations between arbitrary pairs of variables in a multivariate distribution or positive operator-valued measures for quantum mechanics. Thus, this allows us to perform probabilistic analyses explicitly reflected with higher-order interactions among variables, and provides a way of applying machine learning frameworks to problems in quantum mechanics. We also show that the injectivity of the existing KME and the universality of RKHS are generalized to RKHM, which confirms many useful features of the existing KME remain in our generalized KME. And, we investigate the empirical performance of our methods using synthetic and real-world data.
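As a point of reference for the conventional setting that this abstract generalizes, the distance between two kernel mean embeddings in an ordinary RKHS (the maximum mean discrepancy) can be estimated from samples. The sketch below covers only that classical scalar-valued case, not the von Neumann-algebra-valued construction of the paper; the function names, the Gaussian kernel, and the sample points are all illustrative assumptions.

```python
import math

def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two points given as tuples of floats."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))

def mean_kernel(xs, ys, kernel):
    """Average kernel value over all pairs; equals <mu_xs, mu_ys> in the RKHS."""
    total = sum(kernel(x, y) for x in xs for y in ys)
    return total / (len(xs) * len(ys))

def mmd_squared(xs, ys, kernel=gaussian_kernel):
    """Biased (V-statistic) estimate of ||mu_P - mu_Q||^2 from two samples."""
    return (mean_kernel(xs, xs, kernel)
            - 2.0 * mean_kernel(xs, ys, kernel)
            + mean_kernel(ys, ys, kernel))

# Hypothetical toy samples: one cloud near the origin, one near (2, 2).
sample_p = [(0.0, 0.1), (0.2, -0.1), (-0.1, 0.0)]
sample_q = [(2.0, 2.1), (1.9, 2.2), (2.1, 1.8)]

same = mmd_squared(sample_p, sample_p)       # embedding compared with itself
different = mmd_squared(sample_p, sample_q)  # well-separated samples
```

The estimate is zero (up to rounding) when a sample is compared with itself and grows as the two samples separate. The biased estimator is used here for brevity; an unbiased variant would drop the diagonal terms.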
LG Jul 7, 2020
Ridge Regression with Over-Parametrized Two-Layer Networks Converge to Ridgelet Spectrum
Sho Sonoda, Isao Ishikawa, Masahiro Ikeda
Characterization of local minima draws much attention in theoretical studies of deep learning. In this study, we investigate the distribution of parameters in an over-parametrized finite neural network trained by ridge regularized empirical square risk minimization (RERM). We develop a new theory of ridgelet transform, a wavelet-like integral transform that provides a powerful and general framework for the theoretical study of neural networks involving not only the ReLU but general activation functions. We show that the distribution of the parameters converges to a spectrum of the ridgelet transform. This result provides a new insight into the characterization of the local minima of neural networks, and the theoretical background of an inductive bias theory based on lazy regimes. We confirm the visual resemblance between the parameter distribution trained by SGD, and the ridgelet spectrum calculated by numerical integration through numerical experiments with finite models.
LG Jun 20, 2020
Coupling-based Invertible Neural Networks Are Universal Diffeomorphism Approximators
Takeshi Teshima, Isao Ishikawa, Koichi Tojo et al.
Invertible neural networks based on coupling flows (CF-INNs) have various machine learning applications such as image synthesis and representation learning. However, their desirable characteristics such as analytic invertibility come at the cost of restricting the functional forms. This poses a question on their representation power: are CF-INNs universal approximators for invertible functions? Without a universality, there could be a well-behaved invertible transformation that the CF-INN can never approximate, hence it would render the model class unreliable. We answer this question by showing a convenient criterion: a CF-INN is universal if its layers contain affine coupling and invertible linear functions as special cases. As its corollary, we can affirmatively resolve a previously unsolved problem: whether normalizing flow models based on affine coupling can be universal distributional approximators. In the course of proving the universality, we prove a general theorem to show the equivalence of the universality for certain diffeomorphism classes, a theoretical insight that is of interest by itself.
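The "analytic invertibility" of affine coupling mentioned in this abstract can be illustrated with a minimal sketch. It assumes the common form of an affine coupling layer, in which the first `d` coordinates pass through unchanged and the remaining ones are scaled and shifted by functions of the first `d`. The conditioner functions `toy_scale` and `toy_shift` are hypothetical stand-ins for neural networks, and nothing here is taken from the paper's own code.

```python
import math

def affine_coupling_forward(x, scale_fn, shift_fn, d):
    """Keep x[:d]; map x[d:] elementwise to x * exp(s(head)) + t(head)."""
    head, tail = x[:d], x[d:]
    s = scale_fn(head)  # log-scales, one per tail coordinate
    t = shift_fn(head)  # shifts, one per tail coordinate
    out_tail = [xi * math.exp(si) + ti for xi, si, ti in zip(tail, s, t)]
    return head + out_tail

def affine_coupling_inverse(y, scale_fn, shift_fn, d):
    """Exact inverse: the head is unchanged, so s and t can be recomputed."""
    head, tail = y[:d], y[d:]
    s = scale_fn(head)
    t = shift_fn(head)
    in_tail = [(yi - ti) * math.exp(-si) for yi, si, ti in zip(tail, s, t)]
    return head + in_tail

def log_det_jacobian(x, scale_fn, d):
    """The Jacobian is triangular, so its log-determinant is the sum of log-scales."""
    return sum(scale_fn(x[:d]))

# Hypothetical conditioners; they need not be invertible themselves.
def toy_scale(head):
    return [math.tanh(head[0] + head[1]), 0.5 * math.sin(head[0])]

def toy_shift(head):
    return [head[0] * head[1], head[1] ** 2]

x = [0.3, -1.2, 2.0, 0.7]
y = affine_coupling_forward(x, toy_scale, toy_shift, d=2)
x_rec = affine_coupling_inverse(y, toy_scale, toy_shift, d=2)
```

Because the conditioners only ever see the untouched coordinates, the inverse never has to invert them. That restriction on the functional form is what makes the layer cheap to invert, and it is also what raises the question of representation power that the abstract addresses.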
DS Jun 15, 2020
Hypergraph Clustering Based on PageRank
Yuuki Takai, Atsushi Miyauchi, Masahiro Ikeda et al.
A hypergraph is a useful combinatorial object to model ternary or higher-order relations among entities. Clustering hypergraphs is a fundamental task in network analysis. In this study, we develop two clustering algorithms based on personalized PageRank on hypergraphs. The first one is local in the sense that its goal is to find a tightly connected vertex set with a bounded volume including a specified vertex. The second one is global in the sense that its goal is to find a tightly connected vertex set. For both algorithms, we discuss theoretical guarantees on the conductance of the output vertex set. Also, we experimentally demonstrate that our clustering algorithms outperform existing methods in terms of both the solution quality and running time. To the best of our knowledge, ours are the first practical algorithms for hypergraphs with theoretical guarantees on the conductance of the output set.
ML Mar 2, 2020
Analysis via Orthonormal Systems in Reproducing Kernel Hilbert $C^*$-Modules and Applications
Yuka Hashimoto, Isao Ishikawa, Masahiro Ikeda et al.
Kernel methods have been among the most popular techniques in machine learning, where learning tasks are solved using the property of reproducing kernel Hilbert space (RKHS). In this paper, we propose a novel data analysis framework with reproducing kernel Hilbert $C^*$-module (RKHM), which is another generalization of RKHS than vector-valued RKHS (vv-RKHS). Analysis with RKHMs enables us to deal with structures among variables more explicitly than vv-RKHS. We show the theoretical validity for the construction of orthonormal systems in Hilbert $C^*$-modules, and derive concrete procedures for orthonormalization in RKHMs with those theoretical properties in numerical computations. Moreover, we apply those to generalize with RKHM kernel principal component analysis and the analysis of dynamical systems with Perron-Frobenius operators. The empirical performance of our methods is also investigated by using synthetic and real-world data.
FA Nov 27, 2019
Composition operators on reproducing kernel Hilbert spaces with analytic positive definite functions
Masahiro Ikeda, Isao Ishikawa, Yoshihiro Sawano
In this paper, we specify what functions induce the bounded composition operators on a reproducing kernel Hilbert space (RKHS) associated with an analytic positive definite function defined on $\mathbf{R}^d$. We prove that only affine transforms can do so in a pretty large class of RKHS. Our result covers not only the Paley-Wiener space on the real line, studied in previous works, but also much more general RKHSs corresponding to analytic positive definite functions where existing methods do not work. Our method only relies on intrinsic properties of the RKHSs, and we establish a connection between the behavior of composition operators and the asymptotic properties of the greatest zeros of orthogonal polynomials on a weighted $L^2$-space on the real line. We also investigate the compactness of the composition operators and show that no bounded composition operator can be compact in our situation.
LG Sep 9, 2019
Krylov Subspace Method for Nonlinear Dynamical Systems with Random Noise
Yuka Hashimoto, Isao Ishikawa, Masahiro Ikeda et al.
Operator-theoretic analysis of nonlinear dynamical systems has attracted much attention in a variety of engineering and scientific fields, endowed with practical estimation methods using data such as dynamic mode decomposition. In this paper, we address a lifted representation of nonlinear dynamical systems with random noise based on transfer operators, and develop a novel Krylov subspace method for estimating the operators using finite data, with consideration of the unboundedness of operators. For this purpose, we first consider Perron-Frobenius operators with kernel-mean embeddings for such systems. We then extend the Arnoldi method, which is the most classical type of Krylov subspace method, so that it can be applied to the current case. Meanwhile, the Arnoldi method requires the assumption that the operator is bounded, which is not necessarily satisfied for transfer operators on nonlinear systems. We accordingly develop the shift-invert Arnoldi method for Perron-Frobenius operators to avoid this problem. Also, we describe an approach of evaluating predictive accuracy by estimated operators on the basis of the maximum mean discrepancy, which is applicable, for example, to anomaly detection in complex systems. The empirical performance of our methods is investigated using synthetic and real-world healthcare data.
ML Jun 17, 2019
Metric on random dynamical systems with vector-valued reproducing kernel Hilbert spaces
Isao Ishikawa, Akinori Tanaka, Masahiro Ikeda et al.
Development of metrics for structural data-generating mechanisms is fundamental in machine learning and the related fields. In this paper, we give a general framework to construct metrics on random nonlinear dynamical systems, defined with the Perron-Frobenius operators in vector-valued reproducing kernel Hilbert spaces (vvRKHSs). We employ vvRKHSs to design mathematically manageable metrics and also to introduce operator-valued kernels, which enables us to handle randomness in systems. Our metric provides an extension of the existing metrics for deterministic systems, and gives a specification of the kernel maximal mean discrepancy of random processes. Moreover, by considering the time-wise independence of random processes, we clarify a connection between our metric and the independence criteria with kernels such as Hilbert-Schmidt independence criteria. We empirically illustrate our metric with synthetic data, and evaluate it in the context of the independence test for random processes. We also evaluate the performance with real time-series data via clustering tasks.
ML May 31, 2018
Metric on Nonlinear Dynamical Systems with Perron-Frobenius Operators
Isao Ishikawa, Keisuke Fujii, Masahiro Ikeda et al.
The development of a metric for structural data is a long-term problem in pattern recognition and machine learning. In this paper, we develop a general metric for comparing nonlinear dynamical systems that is defined with Perron-Frobenius operators in reproducing kernel Hilbert spaces. Our metric includes the existing fundamental metrics for dynamical systems, which are basically defined with principal angles between some appropriately-chosen subspaces, as its special cases. We also describe the estimation of our metric from finite data. We empirically illustrate our metric with an example of rotation dynamics in a unit disk in a complex plane, and evaluate the performance with real-world time-series data.
ML May 19, 2018
The global optimum of shallow neural network is attained by ridgelet transform
Sho Sonoda, Isao Ishikawa, Masahiro Ikeda et al.
We prove that the global minimum of the backpropagation (BP) training problem of neural networks with an arbitrary nonlinear activation is given by the ridgelet transform. A series of computational experiments show that there exists an interesting similarity between the scatter plot of hidden parameters in a shallow neural network after the BP training and the spectrum of the ridgelet transform. By introducing a continuous model of neural networks, we reduce the training problem to a convex optimization in an infinite dimensional Hilbert space, and obtain the explicit expression of the global optimizer via the ridgelet transform.
MM Feb 24, 2017
Software Defined Media: Virtualization of Audio-Visual Services
Manabu Tsukada, Keiko Ogawa, Masahiro Ikeda et al.
Internet-native audio-visual services are witnessing rapid development. Among these services, object-based audio-visual services are gaining importance. In 2014, we established the Software Defined Media (SDM) consortium to target new research areas and markets involving object-based digital media and Internet-by-design audio-visual environments. In this paper, we introduce the SDM architecture that virtualizes networked audio-visual services along with the development of smart buildings and smart cities using Internet of Things (IoT) devices and smart building facilities. Moreover, we design the SDM architecture as a layered architecture to promote the development of innovative applications on the basis of rapid advancements in software-defined networking (SDN). Then, we implement a prototype system based on the architecture, present the system at an exhibition, and provide it as an SDM API to application developers at hackathons. Various types of applications are developed using the API at these events. An evaluation of SDM API access shows that the prototype SDM platform effectively provides 3D audio reproducibility and interactiveness for SDM applications.