Jithin K. Sreedharan

2papers

2 Papers

SIMay 2, 2019
Temporal Ordered Clustering in Dynamic Networks: Unsupervised and Semi-supervised Learning Algorithms

Krzysztof Turowski, Jithin K. Sreedharan, Wojciech Szpankowski

In temporal ordered clustering, given a single snapshot of a dynamic network in which nodes arrive at distinct time instants, we aim at partitioning its nodes into $K$ ordered clusters $\mathcal{C}_1 \prec \cdots \prec \mathcal{C}_K$ such that for $i<j$, nodes in cluster $\mathcal{C}_i$ arrived before nodes in cluster $\mathcal{C}_j$, with $K$ being a data-driven parameter and not known upfront. Such a problem is of considerable significance in many applications ranging from tracking the expansion of fake news to mapping the spread of information. We first formulate our problem for a general dynamic graph, and propose an integer programming framework that finds the optimal clustering, represented as a strict partial order set, achieving the best precision (i.e., fraction of successfully ordered node pairs) for a fixed density (i.e., fraction of comparable node pairs). We then develop a sequential importance procedure and design unsupervised and semi-supervised algorithms to find temporal ordered clusters that efficiently approximate the optimal solution. To illustrate the techniques, we apply our methods to the vertex copying (duplication-divergence) model which exhibits some edge-case challenges in inferring the clusters as compared to other network models. Finally, we validate the performance of the proposed algorithms on synthetic and real-world networks.

SIOct 19, 2015
Bayesian Inference of Online Social Network Statistics via Lightweight Random Walk Crawls

Konstantin Avrachenkov, Bruno Ribeiro, Jithin K. Sreedharan

Online social networks (OSN) contain extensive amount of information about the underlying society that is yet to be explored. One of the most feasible technique to fetch information from OSN, crawling through Application Programming Interface (API) requests, poses serious concerns over the the guarantees of the estimates. In this work, we focus on making reliable statistical inference with limited API crawls. Based on regenerative properties of the random walks, we propose an unbiased estimator for the aggregated sum of functions over edges and proved the connection between variance of the estimator and spectral gap. In order to facilitate Bayesian inference on the true value of the estimator, we derive the approximate posterior distribution of the estimate. Later the proposed ideas are validated with numerical experiments on inference problems in real-world networks.