Chiranjib Bhattacharyya

LG
h-index: 44
21 papers
2,425 citations
Novelty: 60%
AI Score: 50

21 Papers

AI | Dec 15, 2022 | Code
BNSynth: Bounded Boolean Functional Synthesis

Ravi Raja, Stanly Samuel, Chiranjib Bhattacharyya et al.

The automated synthesis of correct-by-construction Boolean functions from logical specifications is known as the Boolean Functional Synthesis (BFS) problem. BFS has many application areas that range from software engineering to circuit design. In this paper, we introduce a tool, BNSynth, that is the first to solve the BFS problem under a given bound on the solution space. Bounding the solution space induces the synthesis of smaller functions that benefit resource-constrained areas such as circuit design. BNSynth uses a counter-example guided, neural approach to solve the bounded BFS problem. Initial results show promise in synthesizing smaller solutions; we observe at least 3.2X (and up to 24X) improvement in the reduction of solution size on average, as compared to state-of-the-art tools on our benchmarks. BNSynth is available on GitHub under an open source license.

LG | Jul 21, 2023
Random Separating Hyperplane Theorem and Learning Polytopes

Chiranjib Bhattacharyya, Ravindran Kannan, Amit Kumar

The Separating Hyperplane theorem is a fundamental result in Convex Geometry with myriad applications. Our first result, the Random Separating Hyperplane Theorem (RSH), is a strengthening of this for polytopes. RSH asserts that if the distance between $a$ and a polytope $K$ with $k$ vertices and unit diameter in $\mathbb{R}^d$ is at least $δ$, where $δ$ is a fixed constant in $(0,1)$, then a randomly chosen hyperplane separates $a$ and $K$ with probability at least $1/\mathrm{poly}(k)$ and margin at least $Ω(δ/\sqrt{d})$. An immediate consequence of our result is the first near-optimal bound on the error increase in the reduction from a Separation oracle to an Optimization oracle over a polytope. RSH has algorithmic applications in learning polytopes. We consider a fundamental problem, denoted the "Hausdorff problem", of learning a unit diameter polytope $K$ within Hausdorff distance $δ$, given an optimization oracle for $K$. Using RSH, we show that with polynomially many random queries to the optimization oracle, $K$ can be approximated within error $O(δ)$. To our knowledge this is the first provable algorithm for the Hausdorff problem. Building on this result, we show that if the vertices of $K$ are well-separated, then an optimization oracle can be used to generate a list of points, each within Hausdorff distance $O(δ)$ of $K$, with the property that the list contains a point close to each vertex of $K$. Further, we show how to prune this list to generate a (unique) approximation to each vertex of the polytope. We prove that in many latent variable settings, e.g., topic modeling and LDA, optimization oracles do exist provided we project to a suitable SVD subspace. Thus, our work yields the first efficient algorithm for finding approximations to the vertices of the latent polytope under the well-separatedness assumption.
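
As a rough illustration of the separation statement (not the authors' code), the sketch below checks whether a hyperplane with a given normal separates a point from a polytope with a required margin, and estimates how often a random direction succeeds. Drawing directions from a standard Gaussian is an assumption made here; the abstract only says "randomly chosen". The names `separates` and `separation_rate` are hypothetical.

```python
import math
import random


def separates(direction, point, vertices, margin):
    """True if the hyperplane normal to `direction` separates `point`
    from conv(vertices) with at least the given margin."""
    norm = math.sqrt(sum(u * u for u in direction))
    unit = [u / norm for u in direction]

    def proj(x):
        return sum(u * xi for u, xi in zip(unit, x))

    # A linear function over a polytope attains its maximum at a vertex,
    # so checking the vertices is enough.
    return proj(point) - max(proj(v) for v in vertices) >= margin


def separation_rate(point, vertices, margin, trials=2000, seed=0):
    """Empirical fraction of Gaussian random directions that separate
    (Gaussian sampling is an assumption of this sketch)."""
    rng = random.Random(seed)
    d = len(point)
    hits = 0
    for _ in range(trials):
        direction = [rng.gauss(0.0, 1.0) for _ in range(d)]
        if separates(direction, point, vertices, margin):
            hits += 1
    return hits / trials
```

For a small triangle and a point well outside it, `separation_rate` gives a Monte Carlo estimate of the success probability that the theorem bounds from below.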

QUANT-PH | Oct 13, 2022
Shot-frugal and Robust quantum kernel classifiers

Abhay Shastry, Abhijith Jayakumar, Apoorva Patel et al.

Quantum kernel methods are a candidate for quantum speed-ups in supervised machine learning. The number of quantum measurements N required for a reasonable kernel estimate is a critical resource, both from complexity considerations and because of the constraints of near-term quantum hardware. We emphasize that for classification tasks, the aim is reliable classification and not precise kernel evaluation, and demonstrate that the former is far more resource efficient. Furthermore, it is shown that the accuracy of classification is not a suitable performance metric in the presence of noise and we motivate a new metric that characterizes the reliability of classification. We then obtain a bound for N which ensures, with high probability, that classification errors over a dataset are bounded by the margin errors of an idealized quantum kernel classifier. Using chance constraint programming and the subgaussian bounds of quantum kernel distributions, we derive several Shot-frugal and Robust (ShofaR) programs starting from the primal formulation of the Support Vector Machine. This significantly reduces the number of quantum measurements needed and is robust to noise by construction. Our strategy is applicable to uncertainty in quantum kernels arising from any source of unbiased noise.

CL | Dec 11, 2018 | Code
RESIDE: Improving Distantly-Supervised Neural Relation Extraction using Side Information

Shikhar Vashishth, Rishabh Joshi, Sai Suman Prayaga et al.

Distantly-supervised Relation Extraction (RE) methods train an extractor by automatically aligning relation instances in a Knowledge Base (KB) with unstructured text. In addition to relation instances, KBs often contain other relevant side information, such as aliases of relations (e.g., founded and co-founded are aliases for the relation founderOfCompany). RE models usually ignore such readily available side information. In this paper, we propose RESIDE, a distantly-supervised neural relation extraction method which utilizes additional side information from KBs for improved relation extraction. It uses entity type and relation alias information for imposing soft constraints while predicting relations. RESIDE employs Graph Convolution Networks (GCN) to encode syntactic information from text and improves performance even when limited side information is available. Through extensive experiments on benchmark datasets, we demonstrate RESIDE's effectiveness. We have made RESIDE's source code available to encourage reproducible research.

LG | Jul 30, 2025
DO-EM: Density Operator Expectation Maximization

Adit Vishnu, Abhay Shastry, Dhruva Kashyap et al.

Density operators, quantum generalizations of probability distributions, are gaining prominence in machine learning due to their foundational role in quantum computing. Generative modeling based on density operator models (DOMs) is an emerging field, but existing training algorithms, such as those for the Quantum Boltzmann Machine, do not scale to real-world data, such as the MNIST dataset. The Expectation-Maximization algorithm has played a fundamental role in enabling scalable training of probabilistic latent variable models on real-world datasets. In this paper, we develop an Expectation-Maximization framework to learn latent variable models defined through DOMs on classical hardware, with resources comparable to those used for probabilistic models, while scaling to real-world data. However, designing such an algorithm is nontrivial due to the absence of a well-defined quantum analogue to conditional probability, which complicates the Expectation step. To overcome this, we reformulate the Expectation step as a quantum information projection (QIP) problem and show that the Petz Recovery Map provides a solution under sufficient conditions. Using this formulation, we introduce the Density Operator Expectation Maximization (DO-EM) algorithm, an iterative Minorant-Maximization procedure that optimizes a quantum evidence lower bound. We show that the DO-EM algorithm ensures non-decreasing log-likelihood across iterations for a broad class of models. Finally, we present Quantum Interleaved Deep Boltzmann Machines (QiDBMs), a DOM that can be trained with the same resources as a DBM. When trained with DO-EM under Contrastive Divergence, a QiDBM outperforms larger classical DBMs in image generation on the MNIST dataset, achieving a 40-60% reduction in the Fréchet Inception Distance.

LG | Nov 24, 2025
ModHiFi: Identifying High Fidelity predictive components for Model Modification

Dhruva Kashyap, Chaitanya Murti, Pranav K Nayak et al.

Open weight models, which are ubiquitous, rarely provide access to their training data or loss function. This makes modifying such models for tasks such as pruning or unlearning, which are constrained by this unavailability, an active area of research. Existing techniques typically require gradients or ground-truth labels, rendering them infeasible in settings with limited computational resources. In this work, we investigate the fundamental question of identifying components that are critical to the model's predictive performance, without access to either gradients or the loss function, and with only distributional access such as synthetic data. We theoretically demonstrate that the global error is linearly bounded by local reconstruction errors for Lipschitz-continuous networks such as CNNs and well-trained Transformers (which, contrary to existing literature, we find exhibit Lipschitz continuity). This motivates using the locally reconstructive behavior of component subsets to quantify their global importance, via a metric that we term Subset Fidelity. In the uncorrelated features setting, selecting individual components based on their Subset Fidelity scores is optimal, which we utilize to propose ModHiFi, an algorithm for model modification that requires neither training data nor access to a loss function. ModHiFi-P, for structured pruning, achieves an 11% speedup over the current state of the art on ImageNet models and competitive performance on language models. ModHiFi-U, for classwise unlearning, achieves complete unlearning on CIFAR-10 without fine-tuning and demonstrates competitive performance on Swin Transformers.

LG | Sep 19, 2025
On Optimal Steering to Achieve Exact Fairness

Mohit Sharma, Amit Jayant Deshpande, Chiranjib Bhattacharyya et al.

To fix the 'bias in, bias out' problem in fair machine learning, it is important to steer feature distributions of data or internal representations of Large Language Models (LLMs) to ideal ones that guarantee group-fair outcomes. Previous work on fair generative models and representation steering could greatly benefit from provable fairness guarantees on the model output. We define a distribution as ideal if the minimizer of any cost-sensitive risk on it is guaranteed to have exact group-fair outcomes (e.g., demographic parity, equal opportunity); in other words, it has no fairness-utility trade-off. We formulate an optimization program for optimal steering by finding the nearest ideal distribution in KL-divergence, and provide efficient algorithms for it when the underlying distributions come from well-known parametric families (e.g., normal, log-normal). Empirically, our optimal steering techniques on both synthetic and real-world datasets improve fairness without diminishing utility (and sometimes even improve utility). We demonstrate affine steering of LLM representations to reduce bias in multi-class classification, e.g., occupation prediction from a short biography in the Bios dataset (De-Arteaga et al.). Furthermore, we steer internal representations of LLMs towards desired outputs so that they work equally well across different groups.

LG | May 31, 2021
Rawlsian Fair Adaptation of Deep Learning Classifiers

Kulin Shah, Pooja Gupta, Amit Deshpande et al.

Group-fairness in classification aims for equality of a predictive utility across different sensitive sub-populations, e.g., race or gender. Equality or near-equality constraints in group-fairness often worsen not only the aggregate utility but also the utility for the least advantaged sub-population. In this paper, we apply the principles of Pareto-efficiency and least-difference to the utility being accuracy, as an illustrative example, and arrive at the Rawls classifier that minimizes the error rate on the worst-off sensitive sub-population. Our mathematical characterization shows that the Rawls classifier uniformly applies a threshold to an ideal score of features, in the spirit of fair equality of opportunity. In practice, such a score or a feature representation is often computed by a black-box model that has been useful but unfair. Our second contribution is practical Rawlsian fair adaptation of any given black-box deep learning model, without changing the score or feature representation it computes. Given any score function or feature representation and only its second-order statistics on the sensitive sub-populations, we seek a threshold classifier on the given score or a linear threshold classifier on the given feature representation that achieves the Rawls error rate restricted to this hypothesis class. Our technical contribution is to formulate the above problems using ambiguous chance constraints, and to provide efficient algorithms for Rawlsian fair adaptation, along with provable upper bounds on the Rawls error rate. Our empirical results show significant improvement over state-of-the-art group-fair algorithms, even without retraining for fairness.
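
The Rawlsian objective (minimize the error rate of the worst-off group with a single threshold on a given score) can be sketched with a brute-force search over empirical data. This is a swapped-in illustration, not the paper's method: the abstract describes an approach based on second-order statistics and ambiguous chance constraints, which is not reproduced here. The function names are hypothetical.

```python
def group_error_rates(scores, labels, groups, threshold):
    """Per-group error rate of the classifier `score >= threshold`."""
    errors, counts = {}, {}
    for s, y, g in zip(scores, labels, groups):
        pred = 1 if s >= threshold else 0
        counts[g] = counts.get(g, 0) + 1
        errors[g] = errors.get(g, 0) + (pred != y)
    return {g: errors[g] / counts[g] for g in counts}


def rawls_threshold(scores, labels, groups):
    """Return (threshold, worst_group_error) minimizing the largest
    per-group error over all candidate thresholds (brute force)."""
    # Every distinct score, plus one value above all scores
    # (which predicts 0 for everyone).
    candidates = sorted(set(scores)) + [max(scores) + 1.0]
    best = None
    for t in candidates:
        worst = max(group_error_rates(scores, labels, groups, t).values())
        if best is None or worst < best[1]:
            best = (t, worst)
    return best
```

The same threshold is applied to every group, matching the abstract's description of a classifier that "uniformly applies a threshold"; only the selection criterion looks at groups.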

CV | May 26, 2021
DSLR: Dynamic to Static LiDAR Scan Reconstruction Using Adversarially Trained Autoencoder

Prashant Kumar, Sabyasachi Sahoo, Vanshil Shah et al.

Accurate reconstruction of static environments from LiDAR scans of scenes containing dynamic objects, which we refer to as Dynamic to Static Translation (DST), is an important area of research in Autonomous Navigation. This problem has been recently explored for visual SLAM, but to the best of our knowledge no work has attempted to address DST for LiDAR scans. The problem is of critical importance due to the widespread adoption of LiDAR in Autonomous Vehicles. We show that state-of-the-art methods developed for the visual domain perform poorly when adapted for LiDAR scans. We develop DSLR, a deep generative model which learns a mapping from a dynamic scan to its static counterpart through an adversarially trained autoencoder. Our model yields the first solution for DST on LiDAR that generates static scans without using explicit segmentation labels. DSLR cannot always be applied to real-world data due to the lack of paired dynamic-static scans. Using Unsupervised Domain Adaptation, we propose DSLR-UDA for transfer to real-world data and experimentally show that this performs well in real-world settings. Additionally, if segmentation information is available, we extend DSLR to DSLR-Seg to further improve the reconstruction quality. DSLR gives state-of-the-art performance on simulated and real-world datasets and also shows at least a 4x improvement. We show that DSLR, unlike the existing baselines, is a practically viable model with its reconstruction quality within the tolerable limits for tasks pertaining to autonomous navigation like SLAM in dynamic environments.

LG | May 17, 2021
Learning a Latent Simplex in Input-Sparsity Time

Ainesh Bakshi, Chiranjib Bhattacharyya, Ravi Kannan et al.

We consider the problem of learning a latent $k$-vertex simplex $K\subset\mathbb{R}^d$, given access to $A\in\mathbb{R}^{d\times n}$, which can be viewed as a data matrix with $n$ points that are obtained by randomly perturbing latent points in the simplex $K$ (potentially beyond $K$). A large class of latent variable models, such as adversarial clustering, mixed membership stochastic block models, and topic models can be cast as learning a latent simplex. Bhattacharyya and Kannan (SODA, 2020) give an algorithm for learning such a latent simplex in time roughly $O(k\cdot\textrm{nnz}(A))$, where $\textrm{nnz}(A)$ is the number of non-zeros in $A$. We show that the dependence on $k$ in the running time is unnecessary given a natural assumption about the mass of the top $k$ singular values of $A$, which holds in many of these applications. Further, we show this assumption is necessary, as otherwise an algorithm for learning a latent simplex would imply an algorithmic breakthrough for spectral low rank approximation. At a high level, Bhattacharyya and Kannan provide an adaptive algorithm that makes $k$ matrix-vector product queries to $A$ and each query is a function of all queries preceding it. Since each matrix-vector product requires $\textrm{nnz}(A)$ time, their overall running time appears unavoidable. Instead, we obtain a low-rank approximation to $A$ in input-sparsity time and show that the column space thus obtained has small $\sinΘ$ (angular) distance to the right top-$k$ singular space of $A$. Our algorithm then selects $k$ points in the low-rank subspace with the largest inner product with $k$ carefully chosen random vectors. By working in the low-rank subspace, we avoid reading the entire matrix in each iteration and thus circumvent the $Θ(k\cdot\textrm{nnz}(A))$ running time.
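
The final selection step (pick, for each random vector, the data point with the largest inner product) can be sketched as follows. This is only an illustration under simplifying assumptions: the random vectors are plain Gaussian rather than "carefully chosen", and the low-rank projection that the abstract describes is omitted. `extreme_points` is a hypothetical name.

```python
import random


def extreme_points(points, k, seed=0):
    """For each of k random directions, return the index of the point
    with the largest inner product with that direction."""
    rng = random.Random(seed)
    d = len(points[0])
    chosen = []
    for _ in range(k):
        # Assumption: i.i.d. standard Gaussian direction.
        u = [rng.gauss(0.0, 1.0) for _ in range(d)]
        best = max(
            range(len(points)),
            key=lambda i: sum(a * b for a, b in zip(u, points[i])),
        )
        chosen.append(best)
    return chosen
```

Because a linear function over a point set is maximized at an extreme point of its convex hull, every index returned corresponds to a point on the hull, never to a strictly interior one (ties aside).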

DS | Dec 8, 2020
Algorithms for finding $k$ in $k$-means

Chiranjib Bhattacharyya, Ravindran Kannan, Amit Kumar

$k$-means clustering requires as input the exact value of $k$, the number of clusters. Two challenges are open: (i) Is there a data-determined definition of $k$ which is provably correct? and (ii) Is there a polynomial time algorithm to find $k$ from data? This paper provides the first affirmative answers to both these questions. As is common in the literature, we assume that the data admits an unknown Ground Truth (GT) clustering with cluster centers separated. This assumption alone is not sufficient to answer Yes to (i). We assume a novel, but natural, second constraint called no tight sub-cluster (NTSC), which stipulates that no substantially large subset of a GT cluster can be "tighter" (in a sense we define) than the cluster. Our affirmative answers to (i) and (ii) hold under these two deterministic assumptions. We also give a polynomial time algorithm to identify $k$. Our algorithm relies on NTSC to peel off one cluster at a time by identifying points which are tightly packed. We are also able to show that our algorithm(s) apply to data generated by mixtures of Gaussians, and more generally to mixtures of sub-Gaussian pdf's, and hence are able to find the number of components of the mixture from data. To our knowledge, previous results, even for these specialized settings, generally assume that $k$ is given besides the data.

LG | Mar 30, 2020
Analysis of Knowledge Transfer in Kernel Regime

Arman Rahbar, Ashkan Panahi, Chiranjib Bhattacharyya et al.

Knowledge transfer has been shown to be a very successful technique for training neural classifiers: together with the ground truth data, it uses the "privileged information" (PI) obtained by a "teacher" network to train a "student" network. It has been observed that classifiers learn much faster and more reliably via knowledge transfer. However, there has been little or no theoretical analysis of this phenomenon. To bridge this gap, we propose to approach the problem of knowledge transfer by regularizing the fit between the teacher and the student with PI provided by the teacher. Using tools from dynamical systems theory, we show that when the student is an extremely wide two-layer network, we can analyze it in the kernel regime and show that it is able to interpolate between PI and the given data. This characterization sheds new light on the relation between the training error and the capacity of the student relative to the teacher. Another contribution of the paper is a quantitative statement on the convergence of the student network. We prove that the teacher reduces the number of required iterations for a student to learn, and consequently improves the generalization power of the student. We give a corresponding experimental analysis that validates the theoretical results and yields additional insights.

LG | Apr 14, 2019
Finding a latent k-simplex in O(k . nnz(data)) time via Subset Smoothing

Chiranjib Bhattacharyya, Ravindran Kannan

In this paper we show that a large class of latent variable models, such as Mixed Membership Stochastic Block (MMSB) Models, Topic Models, and Adversarial Clustering, can be unified through a geometric perspective, replacing model-specific assumptions and algorithms for individual models. The geometric perspective leads to the formulation: find a latent $k$-polytope $K$ in $\mathbf{R}^d$ given $n$ data points, each obtained by perturbing a latent point in $K$. This problem does not seem to have been considered in the literature. The most important contribution of this paper is to show that the latent $k$-polytope problem admits an efficient algorithm under deterministic assumptions which naturally hold in the latent variable models considered in this paper. Our algorithm runs in time $O^*(k \cdot \mathrm{nnz})$, matching the best running time of algorithms in the special cases considered here, and is better when the data is sparse, as is the case in applications. An important novelty of the algorithm is the introduction of the subset smoothed polytope $K'$, the convex hull of the ${n \choose δn}$ points obtained by averaging each subset of $δn$ data points, for a given $δ \in (0,1)$. We show that $K$ and $K'$ are close in Hausdorff distance. Among the consequences of our algorithm are the following: (a) MMSB Models and Topic Models: the first quasi-input-sparsity time algorithm for parameter estimation for $k \in O^*(1)$; (b) Adversarial Clustering: in $k$-means, if an adversary is allowed to move many data points from each cluster an arbitrary amount towards the convex hull of the centers of other clusters, our algorithm still estimates cluster centers well.
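
To make the definition of the subset smoothed polytope concrete, the sketch below enumerates the subset averages whose convex hull forms $K'$. It is a toy illustration of the definition only: exhaustive enumeration is feasible just for very small $n$, and rounding $δn$ to an integer is an assumption made here. `subset_averages` is a hypothetical name.

```python
from itertools import combinations


def subset_averages(points, delta):
    """Average every subset of size delta*n of the data points.
    The convex hull of the returned points is the subset smoothed polytope."""
    n = len(points)
    m = max(1, round(delta * n))  # subset size; rounding is an assumption
    d = len(points[0])
    return [
        tuple(sum(p[j] for p in subset) / m for j in range(d))
        for subset in combinations(points, m)
    ]
```

For the four corners of a square and $δ = 1/2$, this yields the six pairwise midpoints, so the smoothed polytope sits inside the hull of the raw data, which is the smoothing effect the abstract refers to.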

LG | Nov 6, 2018
How Many Pairwise Preferences Do We Need to Rank A Graph Consistently?

Aadirupa Saha, Rakesh Shivanna, Chiranjib Bhattacharyya

We consider the problem of optimal recovery of the true ranking of $n$ items from a randomly chosen subset of their pairwise preferences. It is well known that without any further assumption, one requires a sample size of $Ω(n^2)$ for the purpose. We analyze the problem with the additional structure of a relational graph $G([n],E)$ over the $n$ items, together with an assumption of locality: neighboring items are similar in their rankings. Noting the preferential nature of the data, we choose to embed not the graph but its strong product, to capture the pairwise node relationships. Furthermore, unlike existing literature that uses Laplacian embedding for graph-based learning problems, we use a richer class of graph embeddings, orthonormal representations, that includes the (normalized) Laplacian as a special case. Our proposed algorithm, Pref-Rank, predicts the underlying ranking using an SVM-based approach over the chosen embedding of the product graph, and is the first to provide statistical consistency on two ranking losses, Kendall's tau and Spearman's footrule, with a required sample complexity of $O(n^2 χ(\bar{G}))^{\frac{2}{3}}$ pairs, $χ(\bar{G})$ being the chromatic number of the complement graph $\bar{G}$. Clearly, our sample complexity is smaller for dense graphs, with $χ(\bar{G})$ characterizing the degree of node connectivity, which is also intuitive given the locality assumption: e.g., $O(n^{\frac{4}{3}})$ for a union of $k$-cliques, or $O(n^{\frac{5}{3}})$ for random and power law graphs, a quantity much smaller than the fundamental limit of $Ω(n^2)$ for large $n$. This, for the first time, relates ranking complexity to structural properties of the graph. We also report experimental evaluations on different synthetic and real datasets, where our algorithm is shown to outperform the state-of-the-art methods.
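
For readers unfamiliar with the two ranking losses named above, here is a minimal sketch of their standard unnormalized forms. The paper may use normalized variants, and the function names are hypothetical. Each ranking is given as a list where entry `i` is the rank position of item `i`.

```python
from itertools import combinations


def kendall_tau_distance(rank_a, rank_b):
    """Number of item pairs that the two rankings order differently."""
    return sum(
        1
        for i, j in combinations(range(len(rank_a)), 2)
        if (rank_a[i] - rank_a[j]) * (rank_b[i] - rank_b[j]) < 0
    )


def spearman_footrule(rank_a, rank_b):
    """Sum over items of the absolute difference in rank position."""
    return sum(abs(a - b) for a, b in zip(rank_a, rank_b))
```

Comparing `[1, 2, 3, 4]` with `[2, 1, 4, 3]` gives a Kendall tau distance of 2 (two swapped pairs) and a footrule of 4 (each item moved by one position).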

CL | Sep 12, 2018
Incorporating Syntactic and Semantic Information in Word Embeddings using Graph Convolutional Networks

Shikhar Vashishth, Manik Bhandari, Prateek Yadav et al.

Word embeddings have been widely adopted across several NLP applications. Most existing word embedding methods utilize sequential context of a word to learn its embedding. While there have been some attempts at utilizing syntactic context of a word, such methods result in an explosion of the vocabulary size. In this paper, we overcome this problem by proposing SynGCN, a flexible Graph Convolution based method for learning word embeddings. SynGCN utilizes the dependency context of a word without increasing the vocabulary size. Word embeddings learned by SynGCN outperform existing methods on various intrinsic and extrinsic tasks and provide an advantage when used with ELMo. We also propose SemGCN, an effective framework for incorporating diverse semantic knowledge for further enhancing learned word representations. We make the source code of both models available to encourage reproducible research.

ML | Jun 12, 2018
Using Inherent Structures to design Lean 2-layer RBMs

Abhishek Bansal, Abhinav Anand, Chiranjib Bhattacharyya

The representational power of Restricted Boltzmann Machines (RBMs) with multiple layers is poorly understood and is an area of active research. Motivated by the Inherent Structure formalism (Stillinger & Weber, 1982), extensively used in analysing Spin Glasses, we propose a novel measure called Inherent Structure Capacity (ISC), which characterizes the representation capacity of a fixed-architecture RBM by the expected number of modes of distributions emanating from the RBM with parameters drawn from a prior distribution. Though ISC is intractable, we show that for a single-layer RBM architecture ISC approaches a finite constant as the number of hidden units is increased, and that to further improve the ISC, one needs to add a second layer. Furthermore, we introduce Lean RBMs, which are multi-layer RBMs where each layer can have at most $O(n)$ units, with $n$ being the number of visible units. We show that for every single-layer RBM with $Ω(n^{2+r})$, $r \ge 0$, hidden units there exists a two-layered lean RBM with $Θ(n^2)$ parameters with the same ISC, establishing that 2-layer RBMs can achieve the same representational power as single-layer RBMs while using far fewer parameters. To the best of our knowledge, this is the first result which quantitatively establishes the need for layering.

ML | Oct 26, 2014
A provable SVD-based algorithm for learning topics in dominant admixture corpus

Trapit Bansal, Chiranjib Bhattacharyya, Ravindran Kannan

Topic models, such as Latent Dirichlet Allocation (LDA), posit that documents are drawn from admixtures of distributions over words, known as topics. The inference problem of recovering topics from admixtures is NP-hard. Assuming separability, a strong assumption, [4] gave the first provable algorithm for inference. For the LDA model, [6] gave a provable algorithm using tensor methods. But [4,6] do not learn topic vectors with bounded $l_1$ error (a natural measure for probability vectors). Our aim is to develop a model which makes intuitive and empirically supported assumptions and to design an algorithm with natural, simple components such as SVD, which provably solves the inference problem for the model with bounded $l_1$ error. A topic in LDA and other models is essentially characterized by a group of co-occurring words. Motivated by this, we introduce topic-specific Catchwords, groups of words which occur with strictly greater frequency in a topic than in any other topic individually and are required to have high frequency together rather than individually. A major contribution of the paper is to show that under this more realistic assumption, which is empirically verified on real corpora, a singular value decomposition (SVD) based algorithm with a crucial pre-processing step of thresholding can provably recover the topics from a collection of documents drawn from Dominant admixtures. Dominant admixtures are convex combinations of distributions in which one distribution has a significantly higher contribution than the others. Apart from the simplicity of the algorithm, the sample complexity has near-optimal dependence on $w_0$, the lowest probability that a topic is dominant, and is better than [4]. Empirical evidence shows that on several real-world corpora, both the Catchwords and Dominant admixture assumptions hold and the proposed algorithm substantially outperforms the state of the art [5].

OS | Oct 13, 2014
Mining Block I/O Traces for Cache Preloading with Sparse Temporal Non-parametric Mixture of Multivariate Poisson

Lavanya Sita Tekumalla, Chiranjib Bhattacharyya

Existing caching strategies in the storage domain, though well suited to exploit short-range spatio-temporal patterns, are unable to leverage long-range motifs for improving hitrates. Motivated by this, we investigate novel Bayesian non-parametric (BNP) modeling techniques for count vectors, to capture long-range correlations for cache preloading, by mining Block I/O traces. Such traces consist of a sequence of memory accesses that can be aggregated into high-dimensional sparse correlated count vector sequences. While there are several state-of-the-art BNP algorithms for clustering and their temporal extensions for prediction, there has been no work on exploring these for correlated count vectors. Our first contribution addresses this gap by proposing a DP-based mixture model of Multivariate Poisson (DP-MMVP) and its temporal extension (HMM-DP-MMVP) that captures the full covariance structure of multivariate count data. However, modeling the full covariance structure for count vectors is computationally expensive, particularly for high-dimensional data. Hence, we exploit sparsity in our count vectors and, as our main contribution, introduce the Sparse DP mixture of multivariate Poisson (Sparse-DP-MMVP), generalizing our DP-MMVP mixture model and also leading to more efficient inference. We then discuss a temporal extension to our model for cache preloading. We take the first step towards mining historical data to capture long-range patterns in storage traces for cache preloading. Experimentally, we show a dramatic improvement in hitrates on benchmark traces and lay the groundwork for further research in the storage domain to reduce latencies using data mining techniques to capture long-range motifs.

CV | Sep 22, 2014
Temporally Coherent Bayesian Models for Entity Discovery in Videos by Tracklet Clustering

Adway Mitra, Soma Biswas, Chiranjib Bhattacharyya

A video can be represented as a sequence of tracklets, each spanning 10-20 frames and associated with one entity (e.g., a person). The task of Entity Discovery in videos can be naturally posed as tracklet clustering. We approach this task by leveraging Temporal Coherence (TC): the fundamental property of videos that each tracklet is likely to be associated with the same entity as its temporal neighbors. Our major contributions are the first Bayesian nonparametric models for TC at the tracklet level. We extend the Chinese Restaurant Process (CRP) to propose TC-CRP, and further to the Temporally Coherent Chinese Restaurant Franchise (TC-CRF) to jointly model short temporal segments. On the task of discovering persons in TV serial videos without meta-data like scripts, these methods show considerable improvement in cluster purity and person coverage compared to state-of-the-art approaches to tracklet clustering. We represent entities with mixture components, and tracklets with vectors of very generic features, which can work for any type of entity (not necessarily a person). The proposed methods can perform online tracklet clustering on streaming videos with little performance deterioration, unlike existing approaches, and can automatically reject tracklets resulting from false detections. Finally, we discuss entity-driven video summarization, where some temporal segments of the video are selected automatically based on the discovered entities.
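
As background for the models above, here is a minimal sketch of sampling from the plain Chinese Restaurant Process, the base process that the abstract says TC-CRP extends. The temporal-coherence mechanism itself is not described in enough detail here to reproduce, so it is deliberately left out. `sample_crp` and the concentration parameter name `alpha` are choices made for this sketch.

```python
import random


def sample_crp(num_customers, alpha, seed=0):
    """Sample table assignments from a Chinese Restaurant Process
    with concentration parameter alpha > 0."""
    rng = random.Random(seed)
    assignments = []
    table_sizes = []
    for n in range(num_customers):
        # Customer n joins existing table t with probability
        # size_t / (n + alpha), or opens a new table with
        # probability alpha / (n + alpha).
        r = rng.uniform(0.0, n + alpha)
        cumulative = 0.0
        choice = len(table_sizes)  # default: open a new table
        for t, size in enumerate(table_sizes):
            cumulative += size
            if r < cumulative:
                choice = t
                break
        if choice == len(table_sizes):
            table_sizes.append(0)
        table_sizes[choice] += 1
        assignments.append(choice)
    return assignments, table_sizes
```

In the tracklet setting, customers would correspond to tracklets and tables to entities; the number of tables is not fixed in advance, which is what makes the model nonparametric.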

LG | Dec 31, 2013
Controlled Sparsity Kernel Learning

Dinesh Govindaraj, Raman Sankaran, Sreedal Menon et al.

Multiple Kernel Learning (MKL) on Support Vector Machines (SVMs) has been a popular front of research in recent times due to its success in application problems like Object Categorization. This success is due to the fact that MKL has the ability to choose from a variety of feature kernels to identify the optimal kernel combination. But the initial formulation of MKL was only able to select the best of the features and missed many other informative kernels. To overcome this, the Lp-norm based formulation was proposed by Kloft et al. This formulation is capable of choosing a non-sparse set of kernels through a control parameter p. Unfortunately, the parameter p does not map directly to the number of kernels selected. We have observed that stricter control over the number of kernels selected gives us an edge over these techniques in terms of classification accuracy and also helps us to fine-tune the algorithms to the time requirements at hand. In this work, we propose a Controlled Sparsity Kernel Learning (CSKL) formulation that can strictly control the number of kernels which we wish to select. The CSKL formulation introduces a parameter t which directly corresponds to the number of kernels selected. It is important to note that a search in t space is finite and fast as compared to p. We also provide an efficient Reduced Gradient Descent based algorithm to solve the CSKL formulation, which is proven to converge. Through our experiments on the Caltech101 Object Categorization dataset, we also show that one can achieve better accuracies than the previous formulations through the right choice of t.

SI | May 7, 2012
Dynamic Multi-Relational Chinese Restaurant Process for Analyzing Influences on Users in Social Media

Himabindu Lakkaraju, Indrajit Bhattacharya, Chiranjib Bhattacharyya

We study the problem of analyzing the influence of various factors affecting individual messages posted in social media. The problem is challenging because various types of influences propagating through the social media network act simultaneously on any user. Additionally, the topic composition of the influencing factors and the susceptibility of users to these influences evolve over time. This problem has not been studied before, and off-the-shelf models are unsuitable for this purpose. To capture the complex interplay of these various factors, we propose a new non-parametric model called the Dynamic Multi-Relational Chinese Restaurant Process. This accounts for the user network for data generation and also allows the parameters to evolve over time. Designing inference algorithms for this model suited for large-scale social-media data is another challenge. To this end, we propose a scalable and multi-threaded inference algorithm based on online Gibbs Sampling. Extensive evaluations on large-scale Twitter and Facebook data show that the extracted topics, when applied to authorship and commenting prediction, outperform state-of-the-art baselines. More importantly, our model produces valuable insights on topic trends and user personality trends, beyond the capability of existing approaches.