CVJul 20, 2022Code
On the Versatile Uses of Partial Distance Correlation in Deep LearningXingjian Zhen, Zihang Meng, Rudrasis Chakraborty et al.
Comparing the functional behavior of neural network models, whether it is a single network over time or two (or more networks) during or post-training, is an essential step in understanding what they are learning (and what they are not), and for identifying strategies for regularization or efficiency improvements. Despite recent progress, e.g., comparing vision transformers to CNNs, systematic comparison of function, especially across different networks, remains difficult and is often carried out layer by layer. Approaches such as canonical correlation analysis (CCA) are applicable in principle, but have been sparingly used so far. In this paper, we revisit a (less widely known) from statistics, called distance correlation (and its partial variant), designed to evaluate correlation between feature spaces of different dimensions. We describe the steps necessary to carry out its deployment for large scale models -- this opens the door to a surprising array of applications ranging from conditioning one deep model w.r.t. another, learning disentangled representations as well as optimizing diverse models that would directly be more robust to adversarial attacks. Our experiments suggest a versatile regularizer (or constraint) with many advantages, which avoids some of the common difficulties one faces in such analyses. Code is at https://github.com/zhenxingjian/Partial_Distance_Correlation.
LGFeb 6Code
Beyond Pooling: Matching for Robust Generalization under Data HeterogeneityAyush Roy, Rudrasis Chakraborty, Lav Varshney et al.
Pooling heterogeneous datasets across domains is a common strategy in representation learning, but naive pooling can amplify distributional asymmetries and yield biased estimators, especially in settings where zero-shot generalization is required. We propose a matching framework that selects samples relative to an adaptive centroid and iteratively refines the representation distribution. The double robustness and the propensity score matching for the inclusion of data domains make matching more robust than naive pooling and uniform subsampling by filtering out the confounding domains (the main cause of heterogeneity). Theoretical and empirical analyses show that, unlike naive pooling or uniform subsampling, matching achieves better results under asymmetric meta-distributions, which are also extended to non-Gaussian and multimodal real-world settings. Most importantly, we show that these improvements translate to zero-shot medical anomaly detection, one of the extreme forms of data heterogeneity and asymmetry. The code is available on https://github.com/AyushRoy2001/Beyond-Pooling.
LGMar 29, 2022
Equivariance Allows Handling Multiple Nuisance Variables When Analyzing Pooled Neuroimaging DatasetsVishnu Suresh Lokhande, Rudrasis Chakraborty, Sathya N. Ravi et al.
Pooling multiple neuroimaging datasets across institutions often enables improvements in statistical power when evaluating associations (e.g., between risk factors and disease outcomes) that may otherwise be too weak to detect. When there is only a {\em single} source of variability (e.g., different scanners), domain adaptation and matching the distributions of representations may suffice in many scenarios. But in the presence of {\em more than one} nuisance variable which concurrently influence the measurements, pooling datasets poses unique challenges, e.g., variations in the data can come from both the acquisition method as well as the demographics of participants (gender, age). Invariant representation learning, by itself, is ill-suited to fully model the data generation process. In this paper, we show how bringing recent results on equivariant representation learning (for studying symmetries in neural networks) instantiated on structured spaces together with simple use of classical results on causal inference provides an effective practical solution. In particular, we demonstrate how our model allows dealing with more than one nuisance variable under some assumptions and can enable analysis of pooled scientific datasets in scenarios that would otherwise entail removing a large portion of the samples.
CVFeb 26
ManifoldGD: Training-Free Hierarchical Manifold Guidance for Diffusion-Based Dataset DistillationAyush Roy, Wei-Yang Alex Lee, Rudrasis Chakraborty et al.
In recent times, large datasets hinder efficient model training while also containing redundant concepts. Dataset distillation aims to synthesize compact datasets that preserve the knowledge of large-scale training sets while drastically reducing storage and computation. Recent advances in diffusion models have enabled training-free distillation by leveraging pre-trained generative priors; however, existing guidance strategies remain limited. Current score-based methods either perform unguided denoising or rely on simple mode-based guidance toward instance prototype centroids (IPC centroids), which often are rudimentary and suboptimal. We propose Manifold-Guided Distillation (ManifoldGD), a training-free diffusion-based framework that integrates manifold consistent guidance at every denoising timestep. Our method employs IPCs computed via a hierarchical, divisive clustering of VAE latent features, yielding a multi-scale coreset of IPCs that captures both coarse semantic modes and fine intra-class variability. Using a local neighborhood of the extracted IPC centroids, we create the latent manifold for each diffusion denoising timestep. At each denoising step, we project the mode-alignment vector onto the local tangent space of the estimated latent manifold, thus constraining the generation trajectory to remain manifold-faithful while preserving semantic consistency. This formulation improves representativeness, diversity, and image fidelity without requiring any model retraining. Empirical results demonstrate consistent gains over existing training-free and training-based baselines in terms of FID, l2 distance among real and synthetic dataset embeddings, and classification accuracy, establishing ManifoldGD as the first geometry-aware training-free data distillation framework.
CLFeb 7, 2021Code
Nyströmformer: A Nyström-Based Algorithm for Approximating Self-AttentionYunyang Xiong, Zhanpeng Zeng, Rudrasis Chakraborty et al.
Transformers have emerged as a powerful tool for a broad range of natural language processing tasks. A key component that drives the impressive performance of Transformers is the self-attention mechanism that encodes the influence or dependence of other tokens on each specific token. While beneficial, the quadratic complexity of self-attention on the input sequence length has limited its application to longer sequences -- a topic being actively studied in the community. To address this limitation, we propose Nyströmformer -- a model that exhibits favorable scalability as a function of sequence length. Our idea is based on adapting the Nyström method to approximate standard self-attention with $O(n)$ complexity. The scalability of Nyströmformer enables application to longer sequences with thousands of tokens. We perform evaluations on multiple downstream tasks on the GLUE benchmark and IMDB reviews with standard sequence length, and find that our Nyströmformer performs comparably, or in a few cases, even slightly better, than standard self-attention. On longer sequence tasks in the Long Range Arena (LRA) benchmark, Nyströmformer performs favorably relative to other efficient self-attention methods. Our code is available at https://github.com/mlpen/Nystromformer.
CVNov 27, 2019Code
Orthogonal Convolutional Neural NetworksJiayun Wang, Yubei Chen, Rudrasis Chakraborty et al.
Deep convolutional neural networks are hindered by training instability and feature redundancy towards further performance improvement. A promising solution is to impose orthogonality on convolutional filters. We develop an efficient approach to impose filter orthogonality on a convolutional layer based on the doubly block-Toeplitz matrix representation of the convolutional kernel instead of using the common kernel orthogonality approach, which we show is only necessary but not sufficient for ensuring orthogonal convolutions. Our proposed orthogonal convolution requires no additional parameters and little computational overhead. This method consistently outperforms the kernel orthogonality alternative on a wide range of tasks such as image classification and inpainting under supervised, semi-supervised and unsupervised settings. Further, it learns more diverse and expressive features with better training stability, robustness, and generalization. Our code is publicly available at https://github.com/samaonline/Orthogonal-Convolutional-Neural-Networks.
CVJun 26, 2019Code
Spatial Transformer for 3D Point CloudsJiayun Wang, Rudrasis Chakraborty, Stella X. Yu
Deep neural networks are widely used for understanding 3D point clouds. At each point convolution layer, features are computed from local neighborhoods of 3D points and combined for subsequent processing in order to extract semantic information. Existing methods adopt the same individual point neighborhoods throughout the network layers, defined by the same metric on the fixed input point coordinates. This common practice is easy to implement but not necessarily optimal. Ideally, local neighborhoods should be different at different layers, as more latent information is extracted at deeper layers. We propose a novel end-to-end approach to learn different non-rigid transformations of the input point cloud so that optimal local neighborhoods can be adopted at each layer. We propose both linear (affine) and non-linear (projective and deformable) spatial transformers for 3D point clouds. With spatial transformers on the ShapeNet part segmentation dataset, the network achieves higher accuracy for all categories, with 8\% gain on earphones and rockets in particular. Our method also outperforms the state-of-the-art on other point cloud tasks such as classification, detection, and semantic segmentation. Visualizations show that spatial transformers can learn features more efficiently by dynamically altering local neighborhoods according to the geometry and semantics of 3D shapes in spite of their within-category variations. Our code is publicly available at https://github.com/samaonline/spatial-transformer-for-3d-point-clouds.
LGJun 26, 2025
FoGE: Fock Space inspired encoding for graph promptingSotirios Panagiotis Chytas, Rudrasis Chakraborty, Vikas Singh
Recent results show that modern Large Language Models (LLM) are indeed capable of understanding and answering questions about structured data such as graphs. This new paradigm can lead to solutions that require less supervision while, at the same time, providing a model that can generalize and answer questions beyond the training labels. Existing proposals often use some description of the graph to create an ``augmented'' prompt fed to the LLM. For a chosen class of graphs, if a well-tailored graph encoder is deployed to play together with a pre-trained LLM, the model can answer graph-related questions well. Existing solutions to graph-based prompts range from graph serialization to graph transformers. In this work, we show that the use of a parameter-free graph encoder based on Fock space representations, a concept borrowed from mathematical physics, is remarkably versatile in this problem setting. The simple construction, inherited directly from the theory with a few small adjustments, can provide rich and informative graph encodings, for a wide range of different graphs. We investigate the use of this idea for prefix-tuned prompts leveraging the capabilities of a pre-trained, frozen LLM. The modifications lead to a model that can answer graph-related questions -- from simple graphs to proteins to hypergraphs -- effectively and with minimal, if any, adjustments to the architecture. Our work significantly simplifies existing solutions and generalizes well to multiple different graph-based structures effortlessly.
LGMar 18, 2024
Variational Sampling of Temporal TrajectoriesJurijs Nazarovs, Zhichun Huang, Xingjian Zhen et al.
A deterministic temporal process can be determined by its trajectory, an element in the product space of (a) initial condition $z_0 \in \mathcal{Z}$ and (b) transition function $f: (\mathcal{Z}, \mathcal{T}) \to \mathcal{Z}$ often influenced by the control of the underlying dynamical system. Existing methods often model the transition function as a differential equation or as a recurrent neural network. Despite their effectiveness in predicting future measurements, few results have successfully established a method for sampling and statistical inference of trajectories using neural networks, partially due to constraints in the parameterization. In this work, we introduce a mechanism to learn the distribution of trajectories by parameterizing the transition function $f$ explicitly as an element in a function space. Our framework allows efficient synthesis of novel trajectories, while also directly providing a convenient tool for inference, i.e., uncertainty estimation, likelihood evaluations and out of distribution detection for abnormal trajectories. These capabilities can have implications for various downstream tasks, e.g., simulation and evaluation for reinforcement learning.
LGFeb 18, 2022
Mixed Effects Neural ODE: A Variational Approximation for Analyzing the Dynamics of Panel DataJurijs Nazarovs, Rudrasis Chakraborty, Songwong Tasneeyapant et al.
Panel data involving longitudinal measurements of the same set of participants taken over multiple time points is common in studies to understand childhood development and disease modeling. Deep hybrid models that marry the predictive power of neural networks with physical simulators such as differential equations, are starting to drive advances in such applications. The task of modeling not just the observations but the hidden dynamics that are captured by the measurements poses interesting statistical/computational questions. We propose a probabilistic model called ME-NODE to incorporate (fixed + random) mixed effects for analyzing such panel data. We show that our model can be derived using smooth approximations of SDEs provided by the Wong-Zakai theorem. We then derive Evidence Based Lower Bounds for ME-NODE, and develop (efficient) training algorithms using MC based sampling methods and numerical ODE solvers. We demonstrate ME-NODE's utility on tasks spanning the spectrum from simulations and toy data to real longitudinal 3D imaging data from an Alzheimer's disease (AD) study, and study its performance in terms of accuracy of reconstruction for interpolation, uncertainty estimates and personalized prediction.
LGDec 1, 2021
Forward Operator Estimation in Generative Models with Kernel Transfer OperatorsZhichun Huang, Rudrasis Chakraborty, Vikas Singh
Generative models which use explicit density modeling (e.g., variational autoencoders, flow-based generative models) involve finding a mapping from a known distribution, e.g. Gaussian, to the unknown input distribution. This often requires searching over a class of non-linear functions (e.g., representable by a deep neural network). While effective in practice, the associated runtime/memory costs can increase rapidly, usually as a function of the performance desired in an application. We propose a much cheaper (and simpler) strategy to estimate this mapping based on adapting known results in kernel transfer operators. We show that our formulation enables highly efficient distribution approximation and sampling, and offers surprisingly good empirical performance that compares favorably with powerful baselines, but with significant runtime savings. We show that the algorithm also performs well in small sample size settings (in brain imaging).
LGJun 8, 2021
An Online Riemannian PCA for Stochastic Canonical Correlation AnalysisZihang Meng, Rudrasis Chakraborty, Vikas Singh
We present an efficient stochastic algorithm (RSG+) for canonical correlation analysis (CCA) using a reparametrization of the projection matrices. We show how this reparametrization (into structured matrices), simple in hindsight, directly presents an opportunity to repurpose/adjust mature techniques for numerical optimization on Riemannian manifolds. Our developments nicely complement existing methods for this problem which either require $O(d^3)$ time complexity per iteration with $O(\frac{1}{\sqrt{t}})$ convergence rate (where $d$ is the dimensionality) or only extract the top $1$ component with $O(\frac{1}{t})$ convergence rate. In contrast, our algorithm offers a strict improvement for this classical problem: it achieves $O(d^2k)$ runtime complexity per iteration for extracting the top $k$ canonical components with $O(\frac{1}{t})$ convergence rate. While the paper primarily focuses on the formulation and technical analysis of its properties, our experiments show that the empirical behavior on common datasets is quite promising. We also explore a potential application in training fair models where the label of protected attribute is missing or otherwise unavailable.
CVJun 5, 2021
VolterraNet: A higher order convolutional network with group equivariance for homogeneous manifoldsMonami Banerjee, Rudrasis Chakraborty, Jose Bouza et al.
Convolutional neural networks have been highly successful in image-based learning tasks due to their translation equivariance property. Recent work has generalized the traditional convolutional layer of a convolutional neural network to non-Euclidean spaces and shown group equivariance of the generalized convolution operation. In this paper, we present a novel higher order Volterra convolutional neural network (VolterraNet) for data defined as samples of functions on Riemannian homogeneous spaces. Analagous to the result for traditional convolutions, we prove that the Volterra functional convolutions are equivariant to the action of the isometry group admitted by the Riemannian homogeneous spaces, and under some restrictions, any non-linear equivariant function can be expressed as our homogeneous space Volterra convolution, generalizing the non-linear shift equivariant characterization of Volterra expansions in Euclidean space. We also prove that second order functional convolution operations can be represented as cascaded convolutions which leads to an efficient implementation. Beyond this, we also propose a dilated VolterraNet model. These advances lead to large parameter reductions relative to baseline non-Euclidean CNNs. To demonstrate the efficacy of the VolterraNet performance, we present several real data experiments involving classification tasks on spherical-MNIST, atomic energy, Shrec17 data sets, and group testing on diffusion MRI data. Performance comparisons to the state-of-the-art are also presented.
LGApr 13, 2021
Simpler Certified Radius Maximization by Propagating CovariancesXingjian Zhen, Rudrasis Chakraborty, Vikas Singh
One strategy for adversarially training a robust model is to maximize its certified radius -- the neighborhood around a given training sample for which the model's prediction remains unchanged. The scheme typically involves analyzing a "smoothed" classifier where one estimates the prediction corresponding to Gaussian samples in the neighborhood of each sample in the mini-batch, accomplished in practice by Monte Carlo sampling. In this paper, we investigate the hypothesis that this sampling bottleneck can potentially be mitigated by identifying ways to directly propagate the covariance matrix of the smoothed distribution through the network. To this end, we find that other than certain adjustments to the network, propagating the covariances must also be accompanied by additional accounting that keeps track of how the distributional moments transform and interact at each stage in the network. We show how satisfying these criteria yields an algorithm for maximizing the certified radius on datasets including Cifar-10, ImageNet, and Places365 while offering runtime savings on networks with moderate depth, with a small compromise in overall accuracy. We describe the details of the key modifications that enable practical use. Via various experiments, we evaluate when our simplifications are sensible, and what the key benefits and limitations are.
CVDec 18, 2020
Flow-based Generative Models for Learning Manifold to Manifold MappingsXingjian Zhen, Rudrasis Chakraborty, Liu Yang et al.
Many measurements or observations in computer vision and machine learning manifest as non-Euclidean data. While recent proposals (like spherical CNN) have extended a number of deep neural network architectures to manifold-valued data, and this has often provided strong improvements in performance, the literature on generative models for manifold data is quite sparse. Partly due to this gap, there are also no modality transfer/translation models for manifold-valued data whereas numerous such methods based on generative models are available for natural images. This paper addresses this gap, motivated by a need in brain imaging -- in doing so, we expand the operating range of certain generative models (as well as generative models for modality transfer) from natural images to images with manifold-valued measurements. Our main result is the design of a two-stream version of GLOW (flow-based invertible generative models) that can synthesize information of a field of one type of manifold-valued measurements given another. On the theoretical side, we introduce three kinds of invertible layers for manifold-valued data, which are not only analogous to their functionality in flow-based generative models (e.g., GLOW) but also preserve the key benefits (determinants of the Jacobian are easy to calculate). For experiments, on a large dataset from the Human Connectome Project (HCP), we show promising results where we can reliably and accurately reconstruct brain images of a field of orientation distribution functions (ODF) from diffusion tensor images (DTI), where the latter has a $5\times$ faster acquisition time but at the expense of worse angular resolution.
LGJun 22, 2020
C-SURE: Shrinkage Estimator and Prototype Classifier for Complex-Valued Deep LearningYifei Xing, Rudrasis Chakraborty, Minxuan Duan et al.
The James-Stein (JS) shrinkage estimator is a biased estimator that captures the mean of Gaussian random vectors.While it has a desirable statistical property of dominance over the maximum likelihood estimator (MLE) in terms of mean squared error (MSE), not much progress has been made on extending the estimator onto manifold-valued data. We propose C-SURE, a novel Stein's unbiased risk estimate (SURE) of the JS estimator on the manifold of complex-valued data with a theoretically proven optimum over MLE. Adapting the architecture of the complex-valued SurReal classifier, we further incorporate C-SURE into a prototype convolutional neural network (CNN) classifier. We compare C-SURE with SurReal and a real-valued baseline on complex-valued MSTAR and RadioML datasets. C-SURE is more accurate and robust than SurReal, and the shrinkage estimator is always better than MLE for the same prototype classifier. Like SurReal, C-SURE is much smaller, outperforming the real-valued baseline on MSTAR (RadioML) with less than 1 percent (3 percent) of the baseline size
MLMar 30, 2020
ManifoldNorm: Extending normalizations on Riemannian ManifoldsRudrasis Chakraborty
Many measurements in computer vision and machine learning manifest as non-Euclidean data samples. Several researchers recently extended a number of deep neural network architectures for manifold valued data samples. Researchers have proposed models for manifold valued spatial data which are common in medical image processing including processing of diffusion tensor imaging (DTI) where images are fields of $3\times 3$ symmetric positive definite matrices or representation in terms of orientation distribution field (ODF) where the identification is in terms of field on hypersphere. There are other sequential models for manifold valued data that recently researchers have shown to be effective for group difference analysis in study for neuro-degenerative diseases. Although, several of these methods are effective to deal with manifold valued data, the bottleneck includes the instability in optimization for deeper networks. In order to deal with these instabilities, researchers have proposed residual connections for manifold valued data. One of the other remedies to deal with the instabilities including gradient explosion is to use normalization techniques including {\it batch norm} and {\it group norm} etc.. But, so far there is no normalization techniques applicable for manifold valued data. In this work, we propose a general normalization techniques for manifold valued data. We show that our proposed manifold normalization technique have special cases including popular batch norm and group norm techniques. On the experimental side, we focus on two types of manifold valued data including manifold of symmetric positive definite matrices and hypersphere. We show the performance gain in one synthetic experiment for moving MNIST dataset and one real brain image dataset where the representation is in terms of orientation distribution field (ODF).
LGNov 5, 2019
A GMM based algorithm to generate point-cloud and its application to neuroimagingLiu Yang, Rudrasis Chakraborty
Recent years have witnessed the emergence of 3D medical imaging techniques with the development of 3D sensors and technology. Due to the presence of noise in image acquisition, registration researchers focused on an alternative way to represent medical images. An alternative way to analyze medical imaging is by understanding the 3D shapes represented in terms of point-cloud. Though in the medical imaging community, 3D point-cloud processing is not a ``go-to'' choice, it is a ``natural'' way to capture 3D shapes. However, as the number of samples for medical images are small, researchers have used pre-trained models to fine-tune on medical images. Furthermore, due to different modality in medical images, standard generative models can not be used to generate new samples of medical images. In this work, we use the advantage of point-cloud representation of 3D structures of medical images and propose a Gaussian mixture model-based generation scheme. Our proposed method is robust to outliers. Experimental validation has been performed to show that the proposed scheme can generate new 3D structures using interpolation techniques, i.e., given two 3D structures represented as point-clouds, we can generate point-clouds in between. We have also generated new point-clouds for subjects with and without dementia and show that the generated samples are indeed closely matched to the respective training samples from the same class.
IVNov 5, 2019
An "augmentation-free" rotation invariant classification scheme on point-cloud and its application to neuroimagingLiu Yang, Rudrasis Chakraborty
Recent years have witnessed the emergence and increasing popularity of 3D medical imaging techniques with the development of 3D sensors and technology. However, achieving geometric invariance in the processing of 3D medical images is computationally expensive but nonetheless essential due to the presence of possible errors caused by rigid registration techniques. An alternative way to analyze medical imaging is by understanding the 3D shapes represented in terms of point-cloud. Though in the medical imaging community, 3D point-cloud processing is not a "go-to" choice, it is a canonical way to preserve rotation invariance. Unfortunately, due to the presence of discrete topology, one can not use the standard convolution operator on point-cloud. To the best of our knowledge, the existing ways to do "convolution" can not preserve the rotation invariance without explicit data augmentation. Therefore, we propose a rotation invariant convolution operator by inducing topology from hypersphere. Experimental validation has been performed on publicly available OASIS dataset in terms of classification accuracy between subjects with (without) dementia, demonstrating the usefulness of our proposed method in terms of model complexity, classification accuracy, and last but most important invariance to rotations.
CVOct 29, 2019
POIRot: A rotation invariant omni-directional pointnetLiu Yang, Rudrasis Chakraborty, Stella X. Yu
Point-cloud is an efficient way to represent 3D world. Analysis of point-cloud deals with understanding the underlying 3D geometric structure. But due to the lack of smooth topology, and hence the lack of neighborhood structure, standard correlation can not be directly applied on point-cloud. One of the popular approaches to do point correlation is to partition the point-cloud into voxels and extract features using standard 3D correlation. But this approach suffers from sparsity of point-cloud and hence results in multiple empty voxels. One possible solution to deal with this problem is to learn a MLP to map a point or its local neighborhood to a high dimensional feature space. All these methods suffer from a large number of parameters requirement and are susceptible to random rotations. A popular way to make the model "invariant" to rotations is to use data augmentation techniques with small rotations but the potential drawback includes \item more training samples \item susceptible to large rotations. In this work, we develop a rotation invariant point-cloud segmentation and classification scheme based on the omni-directional camera model (dubbed as {\bf POIRot$^1$}). Our proposed model is rotationally invariant and can preserve geometric shape of a 3D point-cloud. Because of the inherent rotation invariant property, our proposed framework requires fewer number of parameters (please see \cite{Iandola2017SqueezeNetAA} and the references therein for motivation of lean models). Several experiments have been performed to show that our proposed method can beat the state-of-the-art algorithms in classification and part segmentation applications.
CVOct 18, 2019
SurReal: Complex-Valued Learning as Principled Transformations on a Scaling and Rotation ManifoldRudrasis Chakraborty, Yifei Xing, Stella Yu
Complex-valued data is ubiquitous in signal and image processing applications, and complex-valued representations in deep learning have appealing theoretical properties. While these aspects have long been recognized, complex-valued deep learning continues to lag far behind its real-valued counterpart. We propose a principled geometric approach to complex-valued deep learning. Complex-valued data could often be subject to arbitrary complex-valued scaling; as a result, real and imaginary components could co-vary. Instead of treating complex values as two independent channels of real values, we recognize their underlying geometry: We model the space of complex numbers as a product manifold of non-zero scaling and planar rotations. Arbitrary complex-valued scaling naturally becomes a group of transitive actions on this manifold. We propose to extend the property instead of the form of real-valued functions to the complex domain. We define convolution as weighted Fréchet mean on the manifold that is equivariant to the group of scaling/rotation actions, and define distance transform on the manifold that is invariant to the action group. The manifold perspective also allows us to define nonlinear activation functions such as tangent ReLU and G-transport, as well as residual connections on the manifold-valued data. We dub our model SurReal, as our experiments on MSTAR and RadioML deliver high performance with only a fractional size of real-valued and complex-valued baseline models.
CVOct 5, 2019
Dilated Convolutional Neural Networks for Sequential Manifold-valued DataXingjian Zhen, Rudrasis Chakraborty, Nicholas Vogt et al.
Efforts are underway to study ways via which the power of deep neural networks can be extended to non-standard data types such as structured data (e.g., graphs) or manifold-valued data (e.g., unit vectors or special matrices). Often, sizable empirical improvements are possible when the geometry of such data spaces are incorporated into the design of the model, architecture, and the algorithms. Motivated by neuroimaging applications, we study formulations where the data are {\em sequential manifold-valued measurements}. This case is common in brain imaging, where the samples correspond to symmetric positive definite matrices or orientation distribution functions. Instead of a recurrent model which poses computational/technical issues, and inspired by recent results showing the viability of dilated convolutional models for sequence prediction, we develop a dilated convolutional neural network architecture for this task. On the technical side, we show how the modules needed in our network can be derived while explicitly taking the Riemannian manifold structure into account. We show how the operations needed can leverage known results for calculating the weighted Fréchet Mean (wFM). Finally, we present scientific results for group difference analysis in Alzheimer's disease (AD) where the groups are derived using AD pathology load: here the model finds several brain fiber bundles that are related to AD even when the subjects are all still cognitively healthy.
CVJun 24, 2019
SurReal: Fréchet Mean and Distance Transform for Complex-Valued Deep LearningRudrasis Chakraborty, Jiayun Wang, Stella X. Yu
We develop a novel deep learning architecture for naturally complex-valued data, which is often subject to complex scaling ambiguity. We treat each sample as a field in the space of complex numbers. With the polar form of a complex-valued number, the general group that acts in this space is the product of planar rotation and non-zero scaling. This perspective allows us to develop not only a novel convolution operator using weighted Fréchet mean (wFM) on a Riemannian manifold, but also a novel fully connected layer operator using the distance to the wFM, with natural equivariant properties to non-zero scaling and planar rotation for the former and invariance properties for the latter. Compared to the baseline approach of learning real-valued neural network models on the two-channel real-valued representation of complex-valued data, our method achieves surreal performance on two publicly available complex-valued datasets: MSTAR on SAR images and RadioML on radio frequency signals. On MSTAR, at 8% of the baseline model size and with fewer than 45,000 parameters, our model improves the target classification accuracy from 94% to 98% on this highly imbalanced dataset. On RadioML, our model achieves comparable RF modulation classification accuracy at 10% of the baseline model size.
CVSep 11, 2018
ManifoldNet: A Deep Network Framework for Manifold-valued DataRudrasis Chakraborty, Jose Bouza, Jonathan Manton et al.
Deep neural networks have become the main work horse for many tasks involving learning from data in a variety of applications in Science and Engineering. Traditionally, the input to these networks lie in a vector space and the operations employed within the network are well defined on vector-spaces. In the recent past, due to technological advances in sensing, it has become possible to acquire manifold-valued data sets either directly or indirectly. Examples include but are not limited to data from omnidirectional cameras on automobiles, drones etc., synthetic aperture radar imaging, diffusion magnetic resonance imaging, elastography and conductance imaging in the Medical Imaging domain and others. Thus, there is need to generalize the deep neural networks to cope with input data that reside on curved manifolds where vector space operations are not naturally admissible. In this paper, we present a novel theoretical framework to generalize the widely popular convolutional neural networks (CNNs) to high dimensional manifold-valued data inputs. We call these networks, ManifoldNets. In ManifoldNets, convolution operation on data residing on Riemannian manifolds is achieved via a provably convergent recursive computation of the weighted Fréchet Mean (wFM) of the given data, where the weights makeup the convolution mask, to be learned. Further, we prove that the proposed wFM layer achieves a contraction mapping and hence ManifoldNet does not need the non-linear ReLU unit used in standard CNNs. We present experiments, using the ManifoldNet framework, to achieve dimensionality reduction by computing the principal linear subspaces that naturally reside on a Grassmannian. The experimental results demonstrate the efficacy of ManifoldNets in the context of classification and reconstruction accuracy.
LGMay 31, 2018
A mixture model for aggregation of multiple pre-trained weak classifiersRudrasis Chakraborty, Chun-Hao Yang, Baba C. Vemuri
Deep networks have gained immense popularity in Computer Vision and other fields in the past few years due to their remarkable performance on recognition/classification tasks surpassing the state-of-the art. One of the keys to their success lies in the richness of the automatically learned features. In order to get very good accuracy, one popular option is to increase the depth of the network. Training such a deep network is however infeasible or impractical with moderate computational resources and budget. The other alternative to increase the performance is to learn multiple weak classifiers and boost their performance using a boosting algorithm or a variant thereof. But, one of the problems with boosting algorithms is that they require a re-training of the networks based on the misclassified samples. Motivated by these problems, in this work we propose an aggregation technique which combines the output of multiple weak classifiers. We formulate the aggregation problem using a mixture model fitted to the trained classifier outputs. Our model does not require any re-training of the `weak' networks and is computationally very fast (takes $<30$ seconds to run in our experiments). Thus, using a less expensive training stage and without doing any re-training of networks, we experimentally demonstrate that it is possible to boost the performance by $12\%$. Furthermore, we present experiments using hand-crafted features and improved the classification performance using the proposed aggregation technique. One of the major advantages of our framework is that our framework allows one to combine features that are very likely to be of distinct dimensions since they are extracted using different networks/algorithms. Our experimental results demonstrate a significant performance gain from the use of our aggregation technique at a very small computational cost.
LGMay 29, 2018
A Statistical Recurrent Model on the Manifold of Symmetric Positive Definite MatricesRudrasis Chakraborty, Chun-Hao Yang, Xingjian Zhen et al.
In a number of disciplines, the data (e.g., graphs, manifolds) to be analyzed are non-Euclidean in nature. Geometric deep learning corresponds to techniques that generalize deep neural network models to such non-Euclidean spaces. Several recent papers have shown how convolutional neural networks (CNNs) can be extended to learn with graph-based data. In this work, we study the setting where the data (or measurements) are ordered, longitudinal or temporal in nature and live on a Riemannian manifold -- this setting is common in a variety of problems in statistical machine learning, vision and medical imaging. We show how recurrent statistical recurrent network models can be defined in such spaces. We give an efficient algorithm and conduct a rigorous analysis of its statistical properties. We perform extensive numerical experiments demonstrating competitive performance with state of the art methods but with significantly less number of parameters. We also show applications to a statistical analysis task in brain imaging, a regime where deep neural network models have only been utilized in limited ways.
CVMay 14, 2018
A CNN for homogneous Riemannian manifolds with applications to NeuroimagingRudrasis Chakraborty, Monami Banerjee, Baba C. Vemuri
Convolutional neural networks are ubiquitous in Machine Learning applications for solving a variety of problems. They however can not be used in their native form when the domain of the data is commonly encountered manifolds such as the sphere, the special orthogonal group, the Grassmanian, the manifold of symmetric positive definite matrices and others. Most recently, generalization of CNNs to data domains such as the 2-sphere has been reported by some research groups, which is referred to as the spherical CNNs (SCNNs). The key property of SCNNs distinct from CNNs is that they exhibit the rotational equivariance property that allows for sharing learned weights within a layer. In this paper, we theoretically generalize the CNNs to Riemannian homogeneous manifolds, that include but are not limited to the aforementioned example manifolds. Our key contributions in this work are: (i) A theorem stating that linear group equivariance systems are fully characterized by correlation of functions on the domain manifold and vice-versa. This is fundamental to the characterization of all linear group equivariant systems and parallels the widely used result in linear system theory for vector spaces. (ii) As a corrolary, we prove the equivariance of the correlation operation to group actions admitted by the input domains which are Riemannian homogeneous manifolds. (iii) We present the first end-to-end deep network architecture for classification of diffusion magnetic resonance image (dMRI) scans acquired from a cohort of 44 Parkinson Disease patients and 50 control/normal subjects. (iv) A proof of concept experiment involving synthetic data generated on the manifold of symmetric positive definite matrices is presented to demonstrate the applicability of our network to other types of domains.
CVMay 3, 2018
Dictionary Learning and Sparse Coding on Statistical ManifoldsRudrasis Chakraborty, Monami Banerjee, Baba C. Vemuri
In this paper, we propose a novel information theoretic framework for dictionary learning (DL) and sparse coding (SC) on a statistical manifold (the manifold of probability distributions). Unlike the traditional DL and SC framework, our new formulation does not explicitly incorporate any sparsity inducing norm in the cost function being optimized but yet yields sparse codes. Our algorithm approximates the data points on the statistical manifold (which are probability distributions) by the weighted Kullback-Leibeler center/mean (KL-center) of the dictionary atoms. The KL-center is defined as the minimizer of the maximum KL-divergence between itself and members of the set whose center is being sought. Further, we prove that the weighted KL-center is a sparse combination of the dictionary atoms. This result also holds for the case when the KL-divergence is replaced by the well known Hellinger distance. From an applications perspective, we present an extension of the aforementioned framework to the manifold of symmetric positive definite matrices (which can be identified with the manifold of zero mean gaussian distributions), $\mathcal{P}_n$. We present experiments involving a variety of dictionary-based reconstruction and classification problems in Computer Vision. Performance of the proposed algorithm is demonstrated by comparing it to several state-of-the-art methods in terms of reconstruction and classification accuracy as well as sparsity of the chosen representation.
LGApr 15, 2018
Generative Adversarial Network based Autoencoder: Application to fault detection problem for closed loop dynamical systemsIndrasis Chakraborty, Rudrasis Chakraborty, Draguna Vrabie
Fault detection problem for closed loop uncertain dynamical systems, is investigated in this paper, using different deep learning based methods. Traditional classifier based method does not perform well, because of the inherent difficulty of detecting system level faults for closed loop dynamical system. Specifically, acting controller in any closed loop dynamical system, works to reduce the effect of system level faults. A novel Generative Adversarial based deep Autoencoder is designed to classify datasets under normal and faulty operating conditions. This proposed network performs significantly well when compared to any available classifier based methods, and moreover, does not require labeled fault incorporated datasets for training purpose. Finally, this aforementioned network's performance is tested on a high complexity building energy system dataset.
CVJul 31, 2017
Statistics on the (compact) Stiefel manifold: Theory and ApplicationsRudrasis Chakraborty, Baba Vemuri
A Stiefel manifold of the compact type is often encountered in many fields of Engineering including, signal and image processing, machine learning, numerical optimization and others. The Stiefel manifold is a Riemannian homogeneous space but not a symmetric space. In previous work, researchers have defined probability distributions on symmetric spaces and performed statistical analysis of data residing in these spaces. In this paper, we present original work involving definition of Gaussian distributions on a homogeneous space and show that the maximum-likelihood estimate of the location parameter of a Gaussian distribution on the homogeneous space yields the Fréchet mean (FM) of the samples drawn from this distribution. Further, we present an algorithm to sample from the Gaussian distribution on the Stiefel manifold and recursively compute the FM of these samples. We also prove the weak consistency of this recursive FM estimator. Several synthetic and real data experiments are then presented, demonstrating the superior computational performance of this estimator over the gradient descent based non-recursive counter part as well as the stochastic gradient descent based method prevalent in literature.
LGFeb 3, 2017
Intrinsic Grassmann Averages for Online Linear, Robust and Nonlinear Subspace LearningRudrasis Chakraborty, Søren Hauberg, Baba C. Vemuri
Principal Component Analysis (PCA) and Kernel Principal Component Analysis (KPCA) are fundamental methods in machine learning for dimensionality reduction. The former is a technique for finding this approximation in finite dimensions and the latter is often in an infinite dimensional Reproducing Kernel Hilbert-space (RKHS). In this paper, we present a geometric framework for computing the principal linear subspaces in both situations as well as for the robust PCA case, that amounts to computing the intrinsic average on the space of all subspaces: the Grassmann manifold. Points on this manifold are defined as the subspaces spanned by $K$-tuples of observations. The intrinsic Grassmann average of these subspaces are shown to coincide with the principal components of the observations when they are drawn from a Gaussian distribution. We show similar results in the RKHS case and provide an efficient algorithm for computing the projection onto the this average subspace. The result is a method akin to KPCA which is substantially faster. Further, we present a novel online version of the KPCA using our geometric framework. Competitive performance of all our algorithms are demonstrated on a variety of real and synthetic data sets.
CVApr 23, 2016
An information theoretic formulation of the Dictionary Learning and Sparse Coding Problems on Statistical ManifoldsRudrasis Chakraborty, Monami Banerjee, Victoria Crawford et al.
In this work, we propose a novel information theoretic framework for dictionary learning (DL) and sparse coding (SC) on a statistical manifold (the manifold of probability distributions). Unlike the traditional DL and SC framework, our new formulation {\it does not explicitly incorporate any sparsity inducing norm in the cost function but yet yields SCs}. Moreover, we extend this framework to the manifold of symmetric positive definite matrices, $\mathcal{P}_n$. Our algorithm approximates the data points, which are probability distributions, by the weighted Kullback-Leibeler center (KL-center) of the dictionary atoms. The KL-center is the minimizer of the maximum KL-divergence between the unknown center and members of the set whose center is being sought. Further, {\it we proved that this KL-center is a sparse combination of the dictionary atoms}. Since, the data reside on a statistical manifold, the data fidelity term can not be as simple as in the case of the vector-space data. We therefore employ the geodesic distance between the data and a sparse approximation of the data element. This cost function is minimized using an acceleterated gradient descent algorithm. An extensive set of experimental results show the effectiveness of our proposed framework. We present several experiments involving a variety of classification problems in Computer Vision applications. Further, we demonstrate the performance of our algorithm by comparing it to several state-of-the-art methods both in terms of classification accuracy and sparsity.
CVMar 13, 2016
An efficient Exact-PGA algorithm for constant curvature manifoldsRudrasis Chakraborty, Dohyung Seo, Baba C. Vemuri
Manifold-valued datasets are widely encountered in many computer vision tasks. A non-linear analog of the PCA, called the Principal Geodesic Analysis (PGA) suited for data lying on Riemannian manifolds was reported in literature a decade ago. Since the objective function in PGA is highly non-linear and hard to solve efficiently in general, researchers have proposed a linear approximation. Though this linear approximation is easy to compute, it lacks accuracy especially when the data exhibits a large variance. Recently, an alternative called exact PGA was proposed which tries to solve the optimization without any linearization. For general Riemannian manifolds, though it gives better accuracy than the original (linearized) PGA, for data that exhibit large variance, the optimization is not computationally efficient. In this paper, we propose an efficient exact PGA for constant curvature Riemannian manifolds (CCM-EPGA). CCM-EPGA differs significantly from existing PGA algorithms in two aspects, (i) the distance between a given manifold-valued data point and the principal submanifold is computed analytically and thus no optimization is required as in existing methods. (ii) Unlike the existing PGA algorithms, the descent into codimension-1 submanifolds does not require any optimization but is accomplished through the use of the Rimeannian inverse Exponential map and the parallel transport operations. We present theoretical and experimental results for constant curvature Riemannian manifolds depicting favorable performance of CCM-EPGA compared to existing PGA algorithms. We also present data reconstruction from principal components and directions which has not been presented in literature in this setting.