Angshul Majumdar

h-index40

59papers

1,122citations

Novelty53%

AI Score56

Ranked #20,167 of 201,326 authors (top 10%)#4,569 in LG (top 11%)

59 Papers

QMOct 19, 2022

Graph Regularized Probabilistic Matrix Factorization for Drug-Drug Interactions Prediction

Stuti Jain, Emilie Chouzenoux, Kriti Kumar et al.

Co-administration of two or more drugs simultaneously can result in adverse drug reactions. Identifying drug-drug interactions (DDIs) is necessary, especially for drug development and for repurposing old drugs. DDI prediction can be viewed as a matrix completion task, for which matrix factorization (MF) appears as a suitable solution. This paper presents a novel Graph Regularized Probabilistic Matrix Factorization (GRPMF) method, which incorporates expert knowledge through a novel graph-based regularization strategy within an MF framework. An efficient and sounded optimization algorithm is proposed to solve the resulting non-convex problem in an alternating fashion. The performance of the proposed method is evaluated through the DrugBank dataset, and comparisons are provided against state-of-the-art techniques. The results demonstrate the superior performance of GRPMF when compared to its counterparts.

CCApr 30

Tensor Spectral Threshold is $\exists\mathbb{R}$-Hard

Angshul Majumdar

We study the decision version of tensor spectral norm from the viewpoint of real algebraic complexity. For a rationally specified tensor, the tensor spectral threshold problem asks whether its spectral norm exceeds a prescribed rational threshold. Since the feasible domain is compact, attainment itself is trivial; the meaningful question is the threshold decision problem. We prove that this problem is $\exists\mathbb{R}$-hard by giving an explicit polynomial-time reduction from bounded quartic equality feasibility. The reduction first transforms bounded quartic feasibility into homogeneous quadratic sphere feasibility by homogenization, box encoding, and quadratic lifting. It then maps the resulting homogeneous quadratic system to a quartic form whose maximum over the unit sphere separates feasible from infeasible instances. Finally, the quartic form is represented as a symmetric order-four tensor, yielding the desired tensor spectral threshold instance. The result shows that the computational obstruction in tensor spectral norm is not merely non-convex optimization or combinatorial hardness, but real algebraic feasibility itself.

CVSep 21, 2023

Synthetic Image Detection: Highlights from the IEEE Video and Image Processing Cup 2022 Student Competition

Davide Cozzolino, Koki Nagano, Lucas Thomaz et al.

The Video and Image Processing (VIP) Cup is a student competition that takes place each year at the IEEE International Conference on Image Processing. The 2022 IEEE VIP Cup asked undergraduate students to develop a system capable of distinguishing pristine images from generated ones. The interest in this topic stems from the incredible advances in the AI-based generation of visual data, with tools that allows the synthesis of highly realistic images and videos. While this opens up a large number of new opportunities, it also undermines the trustworthiness of media content and fosters the spread of disinformation on the internet. Recently there was strong concern about the generation of extremely realistic images by means of editing software that includes the recent technology on diffusion models. In this context, there is a need to develop robust and automatic tools for synthetic image detection.

CCApr 18

$\exists\mathbb{R}$-Completeness of Tensor Degeneracy and a Derandomization Barrier for Hyperdeterminants

Angshul Majumdar

We study the computational complexity of singularity for multilinear maps. While the determinant characterizes singularity for matrices, its multilinear analogue -- the hyperdeterminant -- is defined only in boundary format and quickly becomes algebraically unwieldy. We show that the intrinsic notion of tensor singularity, namely degeneracy, is complete for the existential theory of the reals. The reduction is exact and entirely algebraic: homogeneous quadratic feasibility is reduced to projective bilinear feasibility, then to singular matrix-pencil feasibility, and finally encoded directly as tensor degeneracy. No combinatorial gadgets are used. In boundary format, degeneracy coincides with hyperdeterminant vanishing. We therefore isolate the exact gap between intrinsic tensor singularity and its classical polynomial certificate. We show that deterministic hardness transfer to the hyperdeterminant reduces to selecting a point outside the zero set of a completion polynomial, yielding a structured instance of polynomial identity testing. We further formalize the failure of several natural deterministic embedding strategies. This identifies a sharp frontier: real 3-tensor degeneracy is fully characterized at the level of $\ER$-completeness, while the deterministic complexity of hyperdeterminant vanishing remains tied to a derandomization problem in algebraic complexity.

AIApr 15

The Existential Theory of Research: Why Discovery Is Hard

Angshul Majumdar

Can scientific discovery be made arbitrarily easy by choosing the right representation, collecting enough data, and deploying sufficiently powerful algorithms? This paper argues that the answer is fundamentally negative. We introduce the Existential Theory of Research (ETR), a formal framework that models discovery as the recovery of structured explanations under constraints of representation, observation, and computation. Within this framework, we show that these three components cannot be simultaneously optimized: no method can guarantee universally simple explanations, arbitrarily compressed observations, and efficient exact inference. This limitation is not model-specific, but arises from a synthesis of uncertainty principles in sparse representation, sample complexity bounds in high-dimensional recovery, and the computational hardness of exact inference. We further show that representation mismatch alone can inflate intrinsic simplicity into apparent complexity, rendering otherwise tractable problems observationally and computationally prohibitive. To quantify these effects, we introduce an uncertainty functional that captures the joint difficulty of discovery. The results suggest that scientific difficulty is not accidental, but a structural consequence of the geometry and complexity of inference.

OCMar 20

Constrained Nonnegative Gram Feasibility is $\exists\mathbb{R}$-Complete

Angshul Majumdar

We study the computational complexity of constrained nonnegative Gram feasibility. Given a partially specified symmetric matrix together with affine relations among selected entries, the problem asks whether there exists a nonnegative matrix $H \in \mathbb{R}_+^{n\times r}$ such that $W = HH^\top$ satisfies all specified entries and affine constraints. Such factorizations arise naturally in structured low-rank matrix representations and geometric embedding problems. We prove that this feasibility problem is $\exists\mathbb{R}$-complete already for rank $r=2$. The hardness result is obtained via a polynomial-time reduction from the arithmetic feasibility problem \textsc{ETR-AMI}. The reduction exploits a geometric encoding of arithmetic constraints within rank-$2$ nonnegative Gram representations: by fixing anchor directions in $\mathbb{R}_+^2$ and representing variables through vectors of the form $(x,1)$, addition and multiplication constraints can be realized through inner-product relations. Combined with the semialgebraic formulation of the feasibility conditions, this establishes $\exists\mathbb{R}$-completeness. We further show that the hardness extends to every fixed rank $r\ge 2$. Our results place constrained symmetric nonnegative Gram factorization among the growing family of geometric feasibility problems that are complete for the existential theory of the reals. Finally, we discuss limitations of the result and highlight the open problem of determining the complexity of unconstrained symmetric nonnegative factorization feasibility.

LOMar 20

Diminishing Returns in Expanding Generative Models and Godel-Tarski-Lob Limits

Angshul Majumdar

Modern generative modelling systems are increasingly improved by expanding model capacity, training data, and computational resources. While empirical studies have documented such scaling behaviour across architectures including generative adversarial networks, variational autoencoders, transformer-based models, and diffusion models, the theoretical limits of capability growth in expanding generative systems remain poorly understood. In this paper we develop a general task-space framework for analysing expanding generative reasoning systems. Each system induces a subset of a global task space representing the tasks it can successfully solve, and system capability is measured by the probability mass of this solved-task set under a fixed task distribution. Within this framework we prove a structural result showing that, under mild assumptions, the marginal improvement in solved tasks must converge to zero as system capacity increases. Thus expanding generative systems may continue to gain capability, but the probability mass of newly solvable tasks necessarily diminishes asymptotically. We further provide a prediction-theoretic refinement based on complexity-weighted hypothesis classes inspired by algorithmic probability, yielding quantitative bounds on marginal improvement in prediction settings. Finally, we examine logical reasoning tasks and show that classical results from mathematical logic -- including Rosser incompleteness, Tarski's undefinability theorem, and LÃ¶b's theorem -- imply the persistence of unresolved logical tasks within sufficiently expressive reasoning systems. Together these results provide a mathematical perspective on the asymptotic behaviour of expanding generative systems, showing that long-run capability growth is constrained both by diminishing marginal improvements in task coverage and by fundamental logical limitations on internal reasoning.

AIJan 13

Greedy Is Enough: Sparse Action Discovery in Agentic LLMs

Angshul Majumdar

Modern agentic systems operate in environments with extremely large action spaces, such as tool-augmented language models with thousands of available APIs or retrieval operations. Despite this scale, empirical evidence suggests that only a small subset of actions meaningfully influences performance in a given deployment. Motivated by this observation, we study a contextual linear reward model in which action relevance is governed by a structured sparsity assumption: only a small number of actions have nonzero effects across latent states. We formulate action discovery as a block-sparse recovery problem and analyze a greedy algorithm inspired by Orthogonal Matching Pursuit. Under standard assumptions on incoherence, signal strength, and action coverage, we prove that the greedy procedure exactly recovers the relevant action set with high probability, using a number of samples that scales polynomially in the sparsity level and latent dimension, and only logarithmically in the total number of actions. We further provide estimation error guarantees for refitted parameters and show that the resulting decision rule is near-optimal for new latent states. Complementing these results, we establish information-theoretic lower bounds demonstrating that sparsity and sufficient coverage are necessary for tractability. Together, our results identify sparse action discovery as a fundamental principle underlying large-action decision-making and provide a theoretical foundation for action pruning in agentic systems.

AIJan 13

Sparsity Is Necessary: Polynomial-Time Stability for Agentic LLMs in Large Action Spaces

Angshul Majumdar

Tool-augmented LLM systems expose a control regime that learning theory has largely ignored: sequential decision-making with a massive discrete action universe (tools, APIs, documents) in which only a small, unknown subset is relevant for any fixed task distribution. We formalize this setting as Sparse Agentic Control (SAC), where policies admit block-sparse representations over M >> 1 actions and rewards depend on sparse main effects and (optionally) sparse synergies. We study ell_{1,2}-regularized policy learning through a convex surrogate and establish sharp, compressed-sensing-style results: (i) estimation and value suboptimality scale as k (log M / T)^{1/2} under a Policy-RSC condition; (ii) exact tool-support recovery holds via primal-dual witness arguments when T > k log M under incoherence and beta-min; and (iii) any dense policy class requires Omega(M) samples, explaining the instability of prompt-only controllers. We further show that under partial observability, LLMs matter only through a belief/representation error epsilon_b, yielding an additive O(epsilon_b) degradation while preserving logarithmic dependence on M. Extensions cover tuning-free, online, robust, group-sparse, and interaction-aware SAC.

LGDec 25, 2025

Dictionary-Transform Generative Adversarial Networks

Angshul Majumdar

Generative adversarial networks (GANs) are widely used for distribution learning, yet their classical formulations remain theoretically fragile, with ill-posed objectives, unstable training dynamics, and limited interpretability. In this work, we introduce \emph{Dictionary-Transform Generative Adversarial Networks} (DT-GAN), a fully model-based adversarial framework in which the generator is a sparse synthesis dictionary and the discriminator is an analysis transform acting as an energy model. By restricting both players to linear operators with explicit constraints, DT-GAN departs fundamentally from neural GAN architectures and admits rigorous theoretical analysis. We show that the DT-GAN adversarial game is well posed and admits at least one Nash equilibrium. Under a sparse generative model, equilibrium solutions are provably identifiable up to standard permutation and sign ambiguities and exhibit a precise geometric alignment between synthesis and analysis operators. We further establish finite-sample stability and consistency of empirical equilibria, demonstrating that DT-GAN training converges reliably under standard sampling assumptions and remains robust in heavy-tailed regimes. Experiments on mixture-structured synthetic data validate the theoretical predictions, showing that DT-GAN consistently recovers underlying structure and exhibits stable behavior under identical optimization budgets where a standard GAN degrades. DT-GAN is not proposed as a universal replacement for neural GANs, but as a principled adversarial alternative for data distributions that admit sparse synthesis structure. The results demonstrate that adversarial learning can be made interpretable, stable, and provably correct when grounded in classical sparse modeling.

GTDec 25, 2025

Bayesian Recovery for Probabilistic Coalition Structures

Angshul Majumdar

Probabilistic Coalition Structure Generation (PCSG) is NP-hard and can be recast as an $l_0$-type sparse recovery problem by representing coalition structures as sparse coefficient vectors over a coalition-incidence design. A natural question is whether standard sparse methods, such as $l_1$ relaxations and greedy pursuits, can reliably recover the optimal coalition structure in this setting. We show that the answer is negative in a PCSG-inspired regime where overlapping coalitions generate highly coherent, near-duplicate columns: the irrepresentable condition fails for the design, and $k$-step Orthogonal Matching Pursuit (OMP) exhibits a nonvanishing probability of irreversible mis-selection. In contrast, we prove that Sparse Bayesian Learning (SBL) with a Gaussian-Gamma hierarchy is support consistent under the same structural assumptions. The concave sparsity penalty induced by SBL suppresses spurious near-duplicates and recovers the true coalition support with probability tending to one. This establishes a rigorous separation between convex, greedy, and Bayesian sparse approaches for PCSG.

GTDec 25, 2025

Near-Optimal Coalition Structures in Polynomial Time

Angshul Majumdar

We study the classical coalition structure generation (CSG) problem and compare the anytime behavior of three algorithmic paradigms: dynamic programming (DP), MILP branch-and-bound, and sparse relaxations based on greedy or $l_1$-type methods. Under a simple random "sparse synergy" model for coalition values, we prove that sparse relaxations recover coalition structures whose welfare is arbitrarily close to optimal in polynomial time with high probability. In contrast, broad classes of DP and MILP algorithms require exponential time before attaining comparable solution quality. This establishes a rigorous probabilistic anytime separation in favor of sparse relaxations, even though exact methods remain ultimately optimal.

GTJan 1

Sparse Probabilistic Coalition Structure Generation: Bayesian Greedy Pursuit and $\ell_1$ Relaxations

Angshul Majumdar

We study coalition structure generation (CSG) when coalition values are not given but must be learned from episodic observations. We model each episode as a sparse linear regression problem, where the realised payoff $Y_t$ is a noisy linear combination of a small number of coalition contributions. This yields a probabilistic CSG framework in which the planner first estimates a sparse value function from $T$ episodes, then runs a CSG solver on the inferred coalition set. We analyse two estimation schemes. The first, Bayesian Greedy Coalition Pursuit (BGCP), is a greedy procedure that mimics orthogonal matching pursuit. Under a coherence condition and a minimum signal assumption, BGCP recovers the true set of profitable coalitions with high probability once $T \gtrsim K \log m$, and hence yields welfare-optimal structures. The second scheme uses an $\ell_1$-penalised estimator; under a restricted eigenvalue condition, we derive $\ell_1$ and prediction error bounds and translate them into welfare gap guarantees. We compare both methods to probabilistic baselines and identify regimes where sparse probabilistic CSG is superior, as well as dense regimes where classical least-squares approaches are competitive.

STNov 21, 2023

Deep State-Space Model for Predicting Cryptocurrency Price

Shalini Sharma, Angshul Majumdar, Emilie Chouzenoux et al.

Our work presents two fundamental contributions. On the application side, we tackle the challenging problem of predicting day-ahead crypto-currency prices. On the methodological side, a new dynamical modeling approach is proposed. Our approach keeps the probabilistic formulation of the state-space model, which provides uncertainty quantification on the estimates, and the function approximation ability of deep neural networks. We call the proposed approach the deep state-space model. The experiments are carried out on established cryptocurrencies (obtained from Yahoo Finance). The goal of the work has been to predict the price for the next day. Benchmarking has been done with both state-of-the-art and classical dynamical modeling techniques. Results show that the proposed approach yields the best overall results in terms of accuracy.

ITMar 18

Shannon meets GÃ¶del-Tarski-LÃ¶b: Undecidability of Shannon Feedback Capacity for Finite-State Channels

Angshul Majumdar

We study the exact decision problem for feedback capacity of finite-state channels (FSCs). Given an encoding $e$ of a binary-input binary-output rational unifilar FSC with specified rational initial distribution, and a rational threshold $q$, we ask whether the feedback capacity satisfies $C_{fb}(W_e, Ï_{1,e}) \ge q$. We prove that this exact threshold problem is undecidable, even when restricted to a severely constrained class of rational unifilar FSCs with bounded state space. The reduction is effective and preserves rationality of all channel parameters. As a structural consequence, the exact threshold predicate does not lie in the existential theory of the reals ($\exists\mathbb{R}$), and therefore cannot admit a universal reduction to finite systems of polynomial equalities and inequalities over the real numbers. In particular, there is no algorithm deciding all instances of the exact feedback-capacity threshold problem within this class. These results do not preclude approximation schemes or solvability for special subclasses; rather, they establish a fundamental limitation for exact feedback-capacity reasoning in general finite-state settings. At the metatheoretic level, the undecidability result entails corresponding GÃ¶del-Tarski-LÃ¶b incompleteness phenomena for sufficiently expressive formal theories capable of representing the threshold predicate.

CCApr 24

How Hard Is Continuous Clustering? Lower Bounds from the Existential Theory of the Reals

Angshul Majumdar

This paper studies the computational difficulty of clustering problems that are defined directly on a continuous probability density. Rather than working with finite samples, we assume the density is given as a polynomial and ask whether it contains certain cluster structures. Four natural questions are examined. First, do there exist several points with high density that are far apart from each other. Second, do two high density points have a midpoint with low density, creating a valley between them. Third, does the region where the density is above a threshold have at least a given number of separate connected pieces. Fourth, does that same region contain a hole, meaning a loop that cannot be shrunk to a point. We prove that the first two problems, separated points and valley detection, are exactly as hard as the existential theory of the reals, a complexity class that contains NP and is believed to be strictly larger. In contrast, the topological problems of counting connected pieces and detecting holes are at least as hard as the existential theory of the reals, but their exact complexity remains open. Placing them inside that class would need a major advance in real algebraic geometry. These results give the first rigorous classification of exact continuous clustering inside the real polynomial hierarchy. They also show that even basic clustering criteria are not NP complete unless unexpected collapses occur.

LGOct 24, 2025

A Unified Matrix Factorization Framework for Classical and Robust Clustering

Angshul Majumdar

This paper presents a unified matrix factorization framework for classical and robust clustering. We begin by revisiting the well-known equivalence between crisp k-means clustering and matrix factorization, following and rigorously rederiving an unpublished formulation by Bauckhage. Extending this framework, we derive an analogous matrix factorization interpretation for fuzzy c-means clustering, which to the best of our knowledge has not been previously formalized. These reformulations allow both clustering paradigms to be expressed as optimization problems over factor matrices, thereby enabling principled extensions to robust variants. To address sensitivity to outliers, we propose robust formulations for both crisp and fuzzy clustering by replacing the Frobenius norm with the l1,2-norm, which penalizes the sum of Euclidean norms across residual columns. We develop alternating minimization algorithms for the standard formulations and IRLS-based algorithms for the robust counterparts. All algorithms are theoretically proven to converge to a local minimum.

LGNov 27, 2021

Transformed K-means Clustering

Anurag Goel, Angshul Majumdar

In this work we propose a clustering framework based on the paradigm of transform learning. In simple terms the representation from transform learning is used for K-means clustering; however, the problem is not solved in such a naïve piecemeal fashion. The K-means clustering loss is embedded into the transform learning framework and the joint problem is solved using the alternating direction method of multipliers. Results on document clustering show that our proposed approach improves over the state-of-the-art.

CVNov 27, 2021

Sparse Subspace Clustering Friendly Deep Dictionary Learning for Hyperspectral Image Classification

Anurag Goel, Angshul Majumdar

Subspace clustering techniques have shown promise in hyperspectral image segmentation. The fundamental assumption in subspace clustering is that the samples belonging to different clusters/segments lie in separable subspaces. What if this condition does not hold? We surmise that even if the condition does not hold in the original space, the data may be nonlinearly transformed to a space where it will be separable into subspaces. In this work, we propose a transformation based on the tenets of deep dictionary learning (DDL). In particular, we incorporate the sparse subspace clustering (SSC) loss in the DDL formulation. Here DDL nonlinearly transforms the data such that the transformed representation (of the data) is separable into subspaces. We show that the proposed formulation improves over the state-of-the-art deep learning techniques in hyperspectral image clustering.

CPNov 9, 2020

SuperDeConFuse: A Supervised Deep Convolutional Transform based Fusion Framework for Financial Trading Systems

Pooja Gupta, Angshul Majumdar, Emilie Chouzenoux et al.

This work proposes a supervised multi-channel time-series learning framework for financial stock trading. Although many deep learning models have recently been proposed in this domain, most of them treat the stock trading time-series data as 2-D image data, whereas its true nature is 1-D time-series data. Since the stock trading systems are multi-channel data, many existing techniques treating them as 1-D time-series data are not suggestive of any technique to effectively fusion the information carried by the multiple channels. To contribute towards both of these shortcomings, we propose an end-to-end supervised learning framework inspired by the previously established (unsupervised) convolution transform learning framework. Our approach consists of processing the data channels through separate 1-D convolution layers, then fusing the outputs with a series of fully-connected layers, and finally applying a softmax classification layer. The peculiarity of our framework - SuperDeConFuse (SDCF), is that we remove the nonlinear activation located between the multi-channel convolution layers and the fully-connected layers, as well as the one located between the latter and the output layer. We compensate for this removal by introducing a suitable regularization on the aforementioned layer outputs and filters during the training phase. Specifically, we apply a logarithm determinant regularization on the layer filters to break symmetry and force diversity in the learnt transforms, whereas we enforce the non-negativity constraint on the layer outputs to mitigate the issue of dead neurons. This results in the effective learning of a richer set of features and filters with respect to a standard convolutional neural network. Numerical experiments confirm that the proposed model yields considerably better results than state-of-the-art deep learning techniques for real-world problem of stock trading.

LGNov 9, 2020

DeConFuse : A Deep Convolutional Transform based Unsupervised Fusion Framework

Pooja Gupta, Jyoti Maggu, Angshul Majumdar et al.

This work proposes an unsupervised fusion framework based on deep convolutional transform learning. The great learning ability of convolutional filters for data analysis is well acknowledged. The success of convolutive features owes to convolutional neural network (CNN). However, CNN cannot perform learning tasks in an unsupervised fashion. In a recent work, we show that such shortcoming can be addressed by adopting a convolutional transform learning (CTL) approach, where convolutional filters are learnt in an unsupervised fashion. The present paper aims at (i) proposing a deep version of CTL; (ii) proposing an unsupervised fusion formulation taking advantage of the proposed deep CTL representation; (iii) developing a mathematically sounded optimization strategy for performing the learning task. We apply the proposed technique, named DeConFuse, on the problem of stock forecasting and trading. Comparison with state-of-the-art methods (based on CNN and long short-term memory network) shows the superiority of our method for performing a reliable feature extraction.

LGNov 9, 2020

ConFuse: Convolutional Transform Learning Fusion Framework For Multi-Channel Data Analysis

Pooja Gupta, Jyoti Maggu, Angshul Majumdar et al.

This work addresses the problem of analyzing multi-channel time series data %. In this paper, we by proposing an unsupervised fusion framework based on %the recently proposed convolutional transform learning. Each channel is processed by a separate 1D convolutional transform; the output of all the channels are fused by a fully connected layer of transform learning. The training procedure takes advantage of the proximal interpretation of activation functions. We apply the developed framework to multi-channel financial data for stock forecasting and trading. We compare our proposed formulation with benchmark deep time series analysis networks. The results show that our method yields considerably better results than those compared against.

LGOct 2, 2020

Deep Convolutional Transform Learning -- Extended version

Jyoti Maggu, Angshul Majumdar, Emilie Chouzenoux et al.

This work introduces a new unsupervised representation learning technique called Deep Convolutional Transform Learning (DCTL). By stacking convolutional transforms, our approach is able to learn a set of independent kernels at different layers. The features extracted in an unsupervised manner can then be used to perform machine learning tasks, such as classification and clustering. The learning technique relies on a well-sounded alternating proximal minimization scheme with established convergence guarantees. Our experimental results show that the proposed DCTL technique outperforms its shallow version CTL, on several benchmark datasets.

SPDec 11, 2019

Semi-supervised Stacked Label Consistent Autoencoder for Reconstruction and Analysis of Biomedical Signals

Anupriya Gogna, Angshul Majumdar, Rabab Ward

In this work we propose an autoencoder based framework for simultaneous reconstruction and classification of biomedical signals. Previously these two tasks, reconstruction and classification were treated as separate problems. This is the first work to propose a combined framework to address the issue in a holistic fashion. Reconstruction techniques for biomedical signals for tele-monitoring are largely based on compressed sensing (CS) based method, these are designed techniques where the reconstruction formulation is based on some assumption regarding the signal. In this work, we propose a new paradigm for reconstruction we learn to reconstruct. An autoencoder can be trained for the same. But since the final goal is to analyze classify the signal we learn a linear classification map inside the autoencoder. The ensuing optimization problem is solved using the Split Bregman technique. Experiments have been carried out on reconstruction and classification of ECG arrhythmia classification and EEG seizure classification signals. Our proposed tool is capable of operating in a semi-supervised fashion. We show that our proposed method is better and more than an order magnitude faster in reconstruction than CS based methods; it is capable of real-time operation. Our method is also better than recently proposed classification methods. Significance: This is the first work offering an alternative to CS based reconstruction. It also shows that representation learning can yield better results than hand-crafted features for signal analysis.

IVDec 11, 2019

RODEO: Robust DE-aliasing autoencOder for Real-time Medical Image Reconstruction

Janki Mehta, Angshul Majumdar

In this work we address the problem of real-time dynamic medical MRI and X Ray CT image reconstruction from parsimonious samples Fourier frequency space for MRI and sinogram tomographic projections for CT. Today the de facto standard for such reconstruction is compressed sensing. CS produces high quality images (with minimal perceptual loss, but such reconstructions are time consuming, requiring solving a complex optimization problem. In this work we propose to learn the reconstruction from training samples using an autoencoder. Our work is based on the universal function approximation capacity of neural networks. The training time for the autoencoder is large, but is offline and hence does not affect performance during operation. During testing or operation, our method requires only a few matrix vector products and hence is significantly faster than CS based methods. In fact, it is fast enough for real-time reconstruction the images are reconstructed as fast as they are acquired with only slight degradation of image quality. However, in order to make the autoencoder suitable for our problem, we depart from the standard Euclidean norm cost function of autoencoders and use a robust l1-norm instead. The ensuing problem is solved using the Split Bregman method.

LGDec 11, 2019

Majorization Minimization Technique for Optimally Solving Deep Dictionary Learning

Vanika Singhal, Angshul Majumdar

The concept of deep dictionary learning has been recently proposed. Unlike shallow dictionary learning which learns single level of dictionary to represent the data, it uses multiple layers of dictionaries. So far, the problem could only be solved in a greedy fashion; this was achieved by learning a single layer of dictionary in each stage where the coefficients from the previous layer acted as inputs to the subsequent layer (only the first layer used the training samples as inputs). This was not optimal; there was feedback from shallower to deeper layers but not the other way. This work proposes an optimal solution to deep dictionary learning whereby all the layers of dictionaries are solved simultaneously. We employ the Majorization Minimization approach. Experiments have been carried out on benchmark datasets; it shows that optimal learning indeed improves over greedy piecemeal learning. Comparison with other unsupervised deep learning tools (stacked denoising autoencoder, deep belief network, contractive autoencoder and K-sparse autoencoder) show that our method supersedes their performance both in accuracy and speed.

SPDec 11, 2019

Deep Sparse Coding for Non-Intrusive Load Monitoring

Shikha Singh, Angshul Majumdar

Energy disaggregation is the task of segregating the aggregate energy of the entire building (as logged by the smartmeter) into the energy consumed by individual appliances. This is a single channel (the only channel being the smart-meter) blind source (different electrical appliances) separation problem. The traditional way to address this is via stochastic finite state machines (e.g. Factorial Hidden Markov Model). In recent times dictionary learning based approaches have shown promise in addressing the disaggregation problem. The usual technique is to learn a dictionary for every device and use the learnt dictionaries as basis for blind source separation during disaggregation. Prior studies in this area are shallow learning techniques, i.e. they learn a single layer of dictionary for every device. In this work, we propose a deep learning approach, instead of learning one level of dictionary, we learn multiple layers of dictionaries for each device. These multi-level dictionaries are used as a basis for source separation during disaggregation. Results on two benchmark datasets and one actual implementation show that our method outperforms state-of-the-art techniques.

IRDec 11, 2019

Balancing Accuracy and Diversity in Recommendations using Matrix Completion Framework

Anupriya Gogna, Angshul Majumdar

Design of recommender systems aimed at achieving high prediction accuracy is a widely researched area. However, several studies have suggested the need for diversified recommendations, with acceptable level of accuracy, to avoid monotony and improve customers experience. However, increasing diversity comes with an associated reduction in recommendation accuracy; thereby necessitating an optimum tradeoff between the two. In this work, we attempt to achieve accuracy vs diversity balance, by exploiting available ratings and item metadata, through a single, joint optimization model built over the matrix completion framework. Most existing works, unlike our formulation, propose a 2 stage model, a heuristic item ranking scheme on top of an existing collaborative filtering technique. Experimental evaluation on a movie recommender system indicates that our model achieves higher diversity for a given drop in accuracy as compared to existing state of the art techniques.

IVDec 11, 2019

Discriminative Robust Deep Dictionary Learning for Hyperspectral Image Classification

Vanika Singhal, Hemant K. Aggarwal, Snigdha Tariyal et al.

This work proposes a new framework for deep learning that has been particularly tailored for hyperspectral image classification. We learn multiple levels of dictionaries in a robust fashion. The last layer is discriminative that learns a linear classifier. The training proceeds greedily, at a time a single level of dictionary is learnt and the coefficients used to train the next level. The coefficients from the final level are used for classification. Robustness is incorporated by minimizing the absolute deviations instead of the more popular Euclidean norm. The inbuilt robustness helps combat mixed noise (Gaussian and sparse) present in hyperspectral images. Results show that our proposed techniques outperforms all other deep learning methods Deep Belief Network (DBN), Stacked Autoencoder (SAE) and Convolutional Neural Network (CNN). The experiments have been carried out on benchmark hyperspectral imaging datasets.

CVDec 11, 2019

Kernel Transform Learning

Jyoti Maggu, Angshul Majumdar

This work proposes kernel transform learning. The idea of dictionary learning is well known; it is a synthesis formulation where a basis is learnt along with the coefficients so as to generate or synthesize the data. Transform learning is its analysis equivalent; the transforms operates or analyses on the data to generate the coefficients. The concept of kernel dictionary learning has been introduced in the recent past, where the dictionary is represented as a linear combination of non-linear version of the data. Its success has been showcased in feature extraction. In this work we propose to kernelize transform learning on line similar to kernel dictionary learning. An efficient solution for kernel transform learning has been proposed especially for problems where the number of samples is much larger than the dimensionality of the input samples making the kernel matrix very high dimensional. Kernel transform learning has been compared with other representation learning tools like autoencoder, restricted Boltzmann machine as well as with dictionary learning (and its kernelized version). Our proposed kernel transform learning yields better results than all the aforesaid techniques; experiments have been carried out on benchmark databases.

IVDec 11, 2019

Multi-echo Reconstruction from Partial K-space Scans via Adaptively Learnt Basis

Jyoti Maggu, Prerna Singh, Angshul Majumdar

In multi echo imaging, multiple T1/T2 weighted images of the same cross section is acquired. Acquiring multiple scans is time consuming. In order to accelerate, compressed sensing based techniques have been proposed. In recent times, it has been observed in several areas of traditional compressed sensing, that instead of using fixed basis (wavelet, DCT etc.), considerably better results can be achieved by learning the basis adaptively from the data. Motivated by these studies, we propose to employ such adaptive learning techniques to improve reconstruction of multi-echo scans. This work will be based on two basis learning models synthesis (better known as dictionary learning) and analysis (known as transform learning). We modify these basic methods by incorporating structure of the multi echo scans. Our work shows that we can indeed significantly improve multi-echo imaging over compressed sensing based techniques and other unstructured adaptive sparse recovery methods.

SPDec 11, 2019

Analysis Co-Sparse Coding for Energy Disaggregation

Shikha Singh, Angshul Majumdar

Energy disaggregation is the task of segregating the aggregate energy of the entire building (as logged by the smartmeter) into the energy consumed by individual appliances. This is a single channel (the only channel being the smart-meter) blind source (different electrical appliances) separation problem. In recent times dictionary learning based approaches have shown promise in addressing the disaggregation problem. The usual technique is to learn a dictionary for every device and use the learnt dictionaries as basis for blind source separation during disaggregation. Dictionary learning is a synthesis formulation; in this work, we propose an analysis approach. The advantage of our proposed approach is that, the requirement of training volume drastically reduces compared to state-of-the-art techniques. This means that, we require fewer instrumented homes, or fewer days of instrumentation per home; in either case this drastically reduces the sensing cost. Results on two benchmark datasets show that our method produces the same level of disaggregation accuracy as state-of-the-art methods but with only a fraction of the training data.

SPDec 11, 2019

Simultaneous Detection of Multiple Appliances from Smart-meter Measurements via Multi-Label Consistent Deep Dictionary Learning and Deep Transform Learning

Vanika Singhal, Jyoti Maggu, Angshul Majumdar

Currently there are several well-known approaches to non-intrusive appliance load monitoring rule based, stochastic finite state machines, neural networks and sparse coding. Recently several studies have proposed a new approach based on multi label classification. Different appliances are treated as separate classes, and the task is to identify the classes given the aggregate smart-meter reading. Prior studies in this area have used off the shelf algorithms like MLKNN and RAKEL to address this problem. In this work, we propose a deep learning based technique. There are hardly any studies in deep learning based multi label classification; two new deep learning techniques to solve the said problem are fundamental contributions of this work. These are deep dictionary learning and deep transform learning. Thorough experimental results on benchmark datasets show marked improvement over existing studies.

SPDec 11, 2019

Blind Denoising Autoencoder

Angshul Majumdar

The term blind denoising refers to the fact that the basis used for denoising is learnt from the noisy sample itself during denoising. Dictionary learning and transform learning based formulations for blind denoising are well known. But there has been no autoencoder based solution for the said blind denoising approach. So far autoencoder based denoising formulations have learnt the model on a separate training data and have used the learnt model to denoise test samples. Such a methodology fails when the test image (to denoise) is not of the same kind as the models learnt with. This will be first work, where we learn the autoencoder from the noisy sample while denoising. Experimental results show that our proposed method performs better than dictionary learning (KSVD), transform learning, sparse stacked denoising autoencoder and the gold standard BM3D algorithm.

CVDec 11, 2019

Discriminative Autoencoder for Feature Extraction: Application to Character Recognition

Anupriya Gogna, Angshul Majumdar

Conventionally, autoencoders are unsupervised representation learning tools. In this work, we propose a novel discriminative autoencoder. Use of supervised discriminative learning ensures that the learned representation is robust to variations commonly encountered in image datasets. Using the basic discriminating autoencoder as a unit, we build a stacked architecture aimed at extracting relevant representation from the training data. The efficiency of our feature extraction algorithm ensures a high classification accuracy with even simple classification schemes like KNN (K-nearest neighbor). We demonstrate the superiority of our model for representation learning by conducting experiments on standard datasets for character/image recognition and subsequent comparison with existing supervised deep architectures like class sparse stacked autoencoder and discriminative deep belief network.

IVDec 11, 2019

Row-Sparse Discriminative Deep Dictionary Learning for Hyperspectral Image Classification

Vanika Singhal, Angshul Majumdar

In recent studies in hyperspectral imaging, biometrics and energy analytics, the framework of deep dictionary learning has shown promise. Deep dictionary learning outperforms other traditional deep learning tools when training data is limited; therefore hyperspectral imaging is one such example that benefits from this framework. Most of the prior studies were based on the unsupervised formulation; and in all cases, the training algorithm was greedy and hence sub-optimal. This is the first work that shows how to learn the deep dictionary learning problem in a joint fashion. Moreover, we propose a new discriminative penalty to the said framework. The third contribution of this work is showing how to incorporate stochastic regularization techniques into the deep dictionary learning framework. Experimental results on hyperspectral image classification shows that the proposed technique excels over all state-of-the-art deep and shallow (traditional) learning based methods published in recent times.

IVDec 11, 2019

Label Consistent Transform Learning for Hyperspectral Image Classification

Jyoti Maggu, Hemant K. Aggarwal, Angshul Majumdar

This work proposes a new image analysis tool called Label Consistent Transform Learning (LCTL). Transform learning is a recent unsupervised representation learning approach; we add supervision by incorporating a label consistency constraint. The proposed technique is especially suited for hyper-spectral image classification problems owing to its ability to learn from fewer samples. We have compared our proposed method on state-of-the-art techniques like label consistent KSVD, Stacked Autoencoder, Deep Belief Network and Convolutional Neural Network. Our method yields considerably better results (more than 0.1 improvement in Kappa coefficient) than all the aforesaid techniques.

LGDec 11, 2019

Recurrent Transform Learning

Megha Gupta, Angshul Majumdar

The objective of this work is to improve the accuracy of building demand forecasting. This is a more challenging task than grid level forecasting. For the said purpose, we develop a new technique called recurrent transform learning (RTL). Two versions are proposed. The first one (RTL) is unsupervised; this is used as a feature extraction tool that is further fed into a regression model. The second formulation embeds regression into the RTL framework leading to regressing recurrent transform learning (R2TL). Forecasting experiments have been carried out on three popular publicly available datasets. Both of our proposed techniques yield results superior to the state-of-the-art like long short term memory network, echo state network and sparse coding regression.

SPDec 11, 2019

Non-intrusive Load Monitoring via Multi-label Sparse Representation based Classification

Shikha Singh, Angshul Majumdar

This work follows the approach of multi-label classification for non-intrusive load monitoring (NILM). We modify the popular sparse representation based classification (SRC) approach (developed for single label classification) to solve multi-label classification problems. Results on benchmark REDD and Pecan Street dataset shows significant improvement over state-of-the-art techniques with small volume of training data.

LGDec 10, 2019

Deep Latent Factor Model for Collaborative Filtering

Aanchal Mongia, Neha Jhamb, Emilie Chouzenoux et al.

Latent factor models have been used widely in collaborative filtering based recommender systems. In recent years, deep learning has been successful in solving a wide variety of machine learning problems. Motivated by the success of deep learning, we propose a deeper version of latent factor model. Experiments on benchmark datasets shows that our proposed technique significantly outperforms all state-of-the-art collaborative filtering techniques.

LGDec 10, 2019

Transformed Subspace Clustering

Jyoti Maggu, Angshul Majumdar, Emilie Chouzenoux

Subspace clustering assumes that the data is sepa-rable into separate subspaces. Such a simple as-sumption, does not always hold. We assume that, even if the raw data is not separable into subspac-es, one can learn a representation (transform coef-ficients) such that the learnt representation is sep-arable into subspaces. To achieve the intended goal, we embed subspace clustering techniques (locally linear manifold clustering, sparse sub-space clustering and low rank representation) into transform learning. The entire formulation is jointly learnt; giving rise to a new class of meth-ods called transformed subspace clustering (TSC). In order to account for non-linearity, ker-nelized extensions of TSC are also proposed. To test the performance of the proposed techniques, benchmarking is performed on image clustering and document clustering datasets. Comparison with state-of-the-art clustering techniques shows that our formulation improves upon them.

LGDec 10, 2019

Reconstructing Multi-echo Magnetic Resonance Images via Structured Deep Dictionary Learning

Vanika Singhal, Angshul Majumdar

Multi-echo magnetic resonance (MR) images are acquired by changing the echo times (for T2 weighted) or relaxation times (for T1 weighted) of scans. The resulting (multi-echo) images are usually used for quantitative MR imaging. Acquiring MR images is a slow process and acquiring multi scans of the same cross section for multi-echo imaging is even slower. In order to accelerate the scan, compressed sensing (CS) based techniques have been advocating partial K-space (Fourier domain) scans; the resulting images are reconstructed via structured CS algorithms. In recent times, it has been shown that instead of using off-the-shelf CS, better results can be obtained by adaptive reconstruction algorithms based on structured dictionary learning. In this work, we show that the reconstruction results can be further improved by using structured deep dictionaries. Experimental results on real datasets show that by using our proposed technique the scan-time can be cut by half compared to the state-of-the-art.

LGOct 17, 2019

Multi Label Restricted Boltzmann Machine for Non-Intrusive Load Monitoring

Sagar Verma, Shikha Singh, Angshul Majumdar

Increasing population indicates that energy demands need to be managed in the residential sector. Prior studies have reflected that the customers tend to reduce a significant amount of energy consumption if they are provided with appliance-level feedback. This observation has increased the relevance of load monitoring in today's tech-savvy world. Most of the previously proposed solutions claim to perform load monitoring without intrusion, but they are not completely non-intrusive. These methods require historical appliance-level data for training the model for each of the devices. This data is gathered by putting a sensor on each of the appliances present in the home which causes intrusion in the building. Some recent studies have proposed that if we frame Non-Intrusive Load Monitoring (NILM) as a multi-label classification problem, the need for appliance-level data can be avoided. In this paper, we propose Multi-label Restricted Boltzmann Machine(ML-RBM) for NILM and report an experimental evaluation of proposed and state-of-the-art techniques.

LGOct 17, 2019

Collaborative Filtering with Label Consistent Restricted Boltzmann Machine

Sagar Verma, Prince Patel, Angshul Majumdar

The possibility of employing restricted Boltzmann machine (RBM) for collaborative filtering has been known for about a decade. However, there has been hardly any work on this topic since 2007. This work revisits the application of RBM in recommender systems. RBM based collaborative filtering only used the rating information; this is an unsupervised architecture. This work adds supervision by exploiting user demographic information and item metadata. A network is learned from the representation layer to the labels (metadata). The proposed label consistent RBM formulation improves significantly on the existing RBM based approach and yield results at par with the state-of-the-art latent factor based models.

CVDec 24, 2018

Motion Blur removal via Coupled Autoencoder

Kavya Gupta, Brojeshwar Bhowmick, Angshul Majumdar

In this paper a joint optimization technique has been proposed for coupled autoencoder which learns the autoencoder weights and coupling map (between source and target) simultaneously. The technique is applicable to any transfer learning problem. In this work, we propose a new formulation that recasts deblurring as a transfer learning problem, it is solved using the proposed coupled autoencoder. The proposed technique can operate on-the-fly, since it does not require solving any costly inverse problem. Experiments have been carried out on state-of-the-art techniques, our method yields better quality images in shorter operating times.

CVDec 24, 2018

Coupled Analysis Dictionary Learning to inductively learn inversion: Application to real-time reconstruction of Biomedical signals

Kavya Gupta, Brojeshwar Bhowmick, Angshul Majumdar

This work addresses the problem of reconstructing biomedical signals from their lower dimensional projections. Traditionally Compressed Sensing (CS) based techniques have been employed for this task. These are transductive inversion processes, the problem with these approaches is that the inversion is time-consuming and hence not suitable for real-time applications. With the recent advent of deep learning, Stacked Sparse Denoising Autoencoder (SSDAE) has been used for learning inversion in an inductive setup. The training period for inductive learning is large but is very fast during application -- capable of real-time speed. This work proposes a new approach for inductive learning of the inversion process. It is based on Coupled Analysis Dictionary Learning. Results on Biomedical signal reconstruction show that our proposed approach is very fast and yields result far better than CS and SSDAE.

CVMay 27, 2018

Hierarchical Representation Learning for Kinship Verification

Naman Kohli, Mayank Vatsa, Richa Singh et al.

Kinship verification has a number of applications such as organizing large collections of images and recognizing resemblances among humans. In this research, first, a human study is conducted to understand the capabilities of human mind and to identify the discriminatory areas of a face that facilitate kinship-cues. Utilizing the information obtained from the human study, a hierarchical Kinship Verification via Representation Learning (KVRL) framework is utilized to learn the representation of different face regions in an unsupervised manner. We propose a novel approach for feature representation termed as filtered contractive deep belief networks (fcDBN). The proposed feature representation encodes relational information present in images using filters and contractive regularization penalty. A compact representation of facial images of kin is extracted as an output from the learned model and a multi-layer neural network is utilized to verify the kin accurately. A new WVU Kinship Database is created which consists of multiple images per subject to facilitate kinship verification. The results show that the proposed deep learning framework (KVRL-fcDBN) yields stateof-the-art kinship verification accuracy on the WVU Kinship database and on four existing benchmark datasets. Further, kinship information is used as a soft biometric modality to boost the performance of face verification via product of likelihood ratio and support vector machine based approaches. Using the proposed KVRL-fcDBN framework, an improvement of over 20% is observed in the performance of face verification.

CVFeb 22, 2018

MagnifyMe: Aiding Cross Resolution Face Recognition via Identity Aware Synthesis

Maneet Singh, Shruti Nagpal, Richa Singh et al.

Enhancing low resolution images via super-resolution or image synthesis for cross-resolution face recognition has been well studied. Several image processing and machine learning paradigms have been explored for addressing the same. In this research, we propose Synthesis via Deep Sparse Representation algorithm for synthesizing a high resolution face image from a low resolution input image. The proposed algorithm learns multi-level sparse representation for both high and low resolution gallery images, along with an identity aware dictionary and a transformation function between the two representations for face identification scenarios. With low resolution test data as input, the high resolution test image is synthesized using the identity aware dictionary and transformation which is then used for face recognition. The performance of the proposed SDSR algorithm is evaluated on four databases, including one real world dataset. Experimental results and comparison with existing seven algorithms demonstrate the efficacy of the proposed algorithm in terms of both face identification and image quality measures.

IRJan 7, 2018

Indian Regional Movie Dataset for Recommender Systems

Prerna Agarwal, Richa Verma, Angshul Majumdar

Indian regional movie dataset is the first database of regional Indian movies, users and their ratings. It consists of movies belonging to 18 different Indian regional languages and metadata of users with varying demographics. Through this dataset, the diversity of Indian regional cinema and its huge viewership is captured. We analyze the dataset that contains roughly 10K ratings of 919 users and 2,851 movies using some supervised and unsupervised collaborative filtering techniques like Probabilistic Matrix Factorization, Matrix Completion, Blind Compressed Sensing etc. The dataset consists of metadata information of users like age, occupation, home state and known languages. It also consists of metadata of movies like genre, language, release year and cast. India has a wide base of viewers which is evident by the large number of movies released every year and the huge box-office revenue. This dataset can be used for designing recommendation systems for Indian users and regional movies, which do not, yet, exist. The dataset can be downloaded from \href{https://goo.gl/EmTPv6}{https://goo.gl/EmTPv6}.

CVOct 9, 2017

Face Sketch Matching via Coupled Deep Transform Learning

Shruti Nagpal, Maneet Singh, Richa Singh et al.

Face sketch to digital image matching is an important challenge of face recognition that involves matching across different domains. Current research efforts have primarily focused on extracting domain invariant representations or learning a mapping from one domain to the other. In this research, we propose a novel transform learning based approach termed as DeepTransformer, which learns a transformation and mapping function between the features of two domains. The proposed formulation is independent of the input information and can be applied with any existing learned or hand-crafted feature. Since the mapping function is directional in nature, we propose two variants of DeepTransformer: (i) semi-coupled and (ii) symmetrically-coupled deep transform learning. This research also uses a novel IIIT-D Composite Sketch with Age (CSA) variations database which contains sketch images of 150 subjects along with age-separated digital photos. The performance of the proposed models is evaluated on a novel application of sketch-to-sketch matching, along with sketch-to-digital photo matching. Experimental results demonstrate the robustness of the proposed models in comparison to existing state-of-the-art sketch matching algorithms and a commercial face recognition system.