Ravi Sundaram

7papers

54citations

Novelty56%

AI Score40

Ranked #95,695 of 205,806 authors (top 46%)#21,147 in LG (top 50%)

7 Papers

22.2CRMar 15

Generation of Human Comprehensible Access Control Policies from Audit Logs

Gautam Kumar, Ravi Sundaram, Shamik Sural

Over the years, access control systems have become increasingly more complex, often causing a disconnect between what is envisaged by the stakeholders in decision-making positions and the actual permissions granted as evidenced from access logs. For instance, Attribute-based Access Control (ABAC), which is a flexible yet complex model typically configured by system security officers, can be made understandable to others only when presented at a high level in natural language. Although several algorithms have been proposed in the literature for automatic extraction of ABAC rules from access logs, there is no attempt yet to bridge the semantic gap between the machine-enforceable formal logic and human-centric policy intent. Our work addresses this problem by developing a framework that generates human understandable natural language access control policies from logs. We investigate to what extent the power of Large Language Models (LLMs) can be harnessed to achieve both accuracy and scalability in the process. Named LANTERN (LLM-based ABAC Natural Translation and Explanation for Rule Navigation), we have instantiated the framework as a publicly accessible web based application for reproducibility of our results.

GTNov 4, 2023

Sample Complexity of Linear Regression Models for Opinion Formation in Networks

Haolin Liu, Rajmohan Rajaraman, Ravi Sundaram et al.

Consider public health officials aiming to spread awareness about a new vaccine in a community interconnected by a social network. How can they distribute information with minimal resources, so as to avoid polarization and ensure community-wide convergence of opinion? To tackle such challenges, we initiate the study of sample complexity of opinion convergence in networks. Our framework is built on the recognized opinion formation game, where we regard the opinion of each agent as a data-derived model, unlike previous works that treat opinions as data-independent scalars. The opinion model for every agent is initially learned from its local samples and evolves game-theoretically as all agents communicate with neighbors and revise their models towards an equilibrium. Our focus is on the sample complexity needed to ensure that the opinions converge to an equilibrium such that the final model of every agent has low generalization error. Our paper has two main technical results. First, we present a novel polynomial time optimization framework to quantify the total sample complexity for arbitrary networks, when the underlying learning problem is (generalized) linear regression. Second, we leverage this optimization to study the network gain which measures the improvement of sample complexity when learning over a network compared to that in isolation. Towards this end, we derive network gain bounds for various network classes including cliques, star graphs, and random regular graphs. Additionally, our framework provides a method to study sample distribution within the network, suggesting that it is sufficient to allocate samples inversely to the degree. Empirical results on both synthetic and real-world networks strongly support our theoretical findings.

LGDec 6, 2020

PAC-Learning for Strategic Classification

Ravi Sundaram, Anil Vullikanti, Haifeng Xu et al.

The study of strategic or adversarial manipulation of testing data to fool a classifier has attracted much recent attention. Most previous works have focused on two extreme situations where any testing data point either is completely adversarial or always equally prefers the positive label. In this paper, we generalize both of these through a unified framework for strategic classification, and introduce the notion of strategic VC-dimension (SVC) to capture the PAC-learnability in our general strategic setup. SVC provably generalizes the recent concept of adversarial VC-dimension (AVC) introduced by Cullina et al. arXiv:1806.01471. We instantiate our framework for the fundamental strategic linear classification problem. We fully characterize: (1) the statistical learnability of linear classifiers by pinning down its SVC; (2) its computational tractability by pinning down the complexity of the empirical risk minimization problem. Interestingly, the SVC of linear classifiers is always upper bounded by its standard VC-dimension. This characterization also strictly generalizes the AVC bound for linear classifiers in arXiv:1806.01471.

LGJun 11, 2020

Attention improves concentration when learning node embeddings

Matthew Dippel, Adam Kiezun, Tanay Mehta et al.

We consider the problem of predicting edges in a graph from node attributes in an e-commerce setting. Specifically, given nodes labelled with search query text, we want to predict links to related queries that share products. Experiments with a range of deep neural architectures show that simple feedforward networks with an attention mechanism perform best for learning embeddings. The simplicity of these models allows us to explain the performance of attention. We propose an analytically tractable model of query generation, AttEST, that views both products and the query text as vectors embedded in a latent space. We prove (and empirically validate) that the point-wise mutual information (PMI) matrix of the AttEST query text embeddings displays a low-rank behavior analogous to that observed in word embeddings. This low-rank property allows us to derive a loss function that maximizes the mutual information between related queries which is used to train an attention network to learn query embeddings. This AttEST network beats traditional memory-based LSTM architectures by over 20% on F-1 score. We justify this out-performance by showing that the weights from the attention mechanism correlate strongly with the weights of the best linear unbiased estimator (BLUE) for the product vectors, and conclude that attention plays an important role in variance reduction.

LGNov 12, 2017

Skyline Identification in Multi-Armed Bandits

Albert Cheu, Ravi Sundaram, Jonathan Ullman

We introduce a variant of the classical PAC multi-armed bandit problem. There is an ordered set of $n$ arms $A[1],\dots,A[n]$, each with some stochastic reward drawn from some unknown bounded distribution. The goal is to identify the $skyline$ of the set $A$, consisting of all arms $A[i]$ such that $A[i]$ has larger expected reward than all lower-numbered arms $A[1],\dots,A[i-1]$. We define a natural notion of an $\varepsilon$-approximate skyline and prove matching upper and lower bounds for identifying an $\varepsilon$-skyline. Specifically, we show that in order to identify an $\varepsilon$-skyline from among $n$ arms with probability $1-δ$, $$ Θ\bigg(\frac{n}{\varepsilon^2} \cdot \min\bigg\{ \log\bigg(\frac{1}{\varepsilon δ}\bigg), \log\bigg(\frac{n}δ\bigg) \bigg\} \bigg) $$ samples are necessary and sufficient. When $\varepsilon \gg 1/n$, our results improve over the naive algorithm, which draws enough samples to approximate the expected reward of every arm; the algorithm of (Auer et al., AISTATS'16) for Pareto-optimal arm identification is likewise superseded. Our results show that the sample complexity of the skyline problem lies strictly in between that of best arm identification (Even-Dar et al., COLT'02) and that of approximating the expected reward of every arm.

IRApr 13, 2016

Chiron: A Robust Recommendation System with Graph Regularizer

Saber Shokat Fadaee, Mohammad Sajjad Ghaemi, Ravi Sundaram et al.

Recommendation systems have been widely used by commercial service providers for giving suggestions to users. Collaborative filtering (CF) systems, one of the most popular recommendation systems, utilize the history of behaviors of the aggregate user-base to provide individual recommendations and are effective when almost all users faithfully express their opinions. However, they are vulnerable to malicious users biasing their inputs in order to change the overall ratings of a specific group of items. CF systems largely fall into two categories - neighborhood-based and (matrix) factorization-based - and the presence of adversarial input can influence recommendations in both categories, leading to instabilities in estimation and prediction. Although the robustness of different collaborative filtering algorithms has been extensively studied, designing an efficient system that is immune to manipulation remains a significant challenge. In this work we propose a novel "hybrid" recommendation system with an adaptive graph-based user/item similarity-regularization - "Chiron". Chiron ties the performance benefits of dimensionality reduction (through factorization) with the advantage of neighborhood clustering (through regularization). We demonstrate, using extensive comparative experiments, that Chiron is resistant to manipulation by large and lethal attacks.

CRMar 31, 2014

Secure and scalable match: overcoming the universal circuit bottleneck using group programs

Rajesh Krishnan, Ravi Sundaram

Confidential Content-Based Publish/Subscribe (C-CBPS) is an interaction (pub/sub) model that allows parties to exchange data while still protecting their security and privacy interests. In this paper we advance the state of the art in C-CBPS by showing how all predicate circuits in NC1 (logarithmic-depth, bounded fan-in) can be securely computed by a broker while guaranteeing perfect information-theoretic security. Previous work could handle only strictly shallower circuits (e.g. those with depth O(\sqrt{\lg n}) [SYY99, V76]. We present three protocols -- UGP-Match, FSGP-Match and OFSGP-Match -- all three are based on (2-decomposable randomized encodings of) group programs and handle circuits in NC1. UGP-Match is conceptually simple and has a clean proof of correctness but it is inefficient and impractical. FSGP-Match uses a "fixed structure" trick to achieve efficiency and scalability. And, finally, OFSGP-Match uses hand-optimized group programs to wring greater efficiencies. We complete our investigation with an experimental evaluation of a prototype implementation.