Yash Deshpande

LG
h-index6
13papers
1,309citations
Novelty58%
AI Score53

13 Papers

57.7NIMar 20Code
Vulnerability Analysis of eBPF-enabled Containerized Deployments of 5G Core Networks

Yash Deshpande, Samaresh Bera

The extended Berkeley Packet Filter (eBPF) is useful for faster packet processing and network monitoring in softwarized deployments. Similarly, softwarized deployments of 5G core network services adopted eBPF to meet the stringent latency and bandwidth requirements of underlying applications. While the existing studies focused on network performance, security concerns over eBPF-enabled platforms are overlooked. In this paper, we study the vulnerability analysis of 5G core network deployments that use eBPF for packet processing and traffic monitoring. In particular, we consider the following aspects: a) tracing, b) denial-of-service (DoS), c) stealing information, and d) bash injection. We present the detailed attack scenarios with step-by-step implementation of containerized and eBPF-enabled 5G network functions using Open5GS. The experiment results show that the aforementioned vulnerabilities are present in eBPF-enabled 5G deployments and can be exploited by attackers. Finally, we present some mitigation techniques useful for addressing the vulnerabilities. The source code and implementation details are made available at https://github.com/chimms1/5G-eBPF-exploits.

48.6NIApr 14
Improving Network Clock Synchronization by Marking Congestion

Yash Deshpande, Quirin Vogel, Laura Becker et al.

Achieving consistent time across devices in distributed systems often involves exchanging timestamped messages over a network. Precise time synchronization is crucial for applications such as cellular networks, industrial automation, and transactional databases. However, delay variation in synchronization packets-often caused by congestion from competing traffic-degrades synchronization accuracy. Detecting whether a packet experienced congestion can help improve synchronization through filtering and statistical methods. We propose an in-network congestion indication and filtering mechanism for synchronization messages used in protocols such as the Network Time Protocol (NTP) and Precision Time Protocol (PTP). Network devices mark packets that experienced queuing, allowing clocks to correct errors caused by varying delays. Our approach requires only simple changes at switches or routers, avoiding deep packet inspection or protocol modifications. The method is backward compatible, using standard but currently unused fields in IP, PTP, or NTP headers. We implement our method on a Tofino P4 target and demonstrate an improvement of over 80% in synchronization performance over a single hop. Moreover, we show that the performance of traditional statistical filters, such as min-RTT and median-delay, is improved by 90% over the one-hop hardware setup. We further demonstrate the effectiveness of our proposed method across multiple hops, both analytically and through simulation. Congestion marking improves the root-mean-squared clock offset estimation error by 30% to 80%, depending on network conditions and filtering techniques.

CLAug 20, 2020Code
VisualSem: A High-quality Knowledge Graph for Vision and Language

Houda Alberts, Teresa Huang, Yash Deshpande et al.

An exciting frontier in natural language understanding (NLU) and generation (NLG) calls for (vision-and-) language models that can efficiently access external structured knowledge repositories. However, many existing knowledge bases only cover limited domains, or suffer from noisy data, and most of all are typically hard to integrate into neural language pipelines. To fill this gap, we release VisualSem: a high-quality knowledge graph (KG) which includes nodes with multilingual glosses, multiple illustrative images, and visually relevant relations. We also release a neural multi-modal retrieval model that can use images or sentences as inputs and retrieves entities in the KG. This multi-modal retrieval model can be integrated into any (neural network) model pipeline. We encourage the research community to use VisualSem for data augmentation and/or as a source of grounding, among other possible uses. VisualSem as well as the multi-modal retrieval models are publicly available and can be downloaded in this URL: https://github.com/iacercalixto/visualsem

41.1LGMay 3
Pandora's Regret: A Proper Scoring Rule for Evaluating Sequential Search

Gerardo A. Flores, Yash Deshpande, Jannis R. Brea et al.

In sequential search, alternatives are tested until the true class is found. Standard proper scoring rules like log loss are local, ignoring the ranking of competitors and misaligning model evaluation with search utility. We show that sequential search induces a pairwise structure that overcomes this. By analyzing the expected cost of optimal search under varying testing costs, we derive Pandora's Regret: a closed-form, pairwise-additive, and strictly proper scoring rule. Pandora's Regret both elicits true probabilities and penalizes rank-reversing miscalibrations where distractors outrank the true class. Our construction yields a one-parameter Beta family that balances penalties for rank-swapping versus probability magnitude, while retaining a grounded interpretation as expected search cost. We prove that log loss, accuracy, and macro-F1 rely on implicit decision models misaligned with sequential search. Across 597 MedMNIST models, Pandora-based metrics better predict clinical diagnostic costs than standard alternatives, extending decision-theoretic scoring rule construction to the multiclass setting.

IVApr 13, 2025
Predicting ulcer in H&E images of inflammatory bowel disease using domain-knowledge-driven graph neural network

Ruiwen Ding, Lin Li, Rajath Soans et al.

Inflammatory bowel disease (IBD) involves chronic inflammation of the digestive tract, with treatment options often burdened by adverse effects. Identifying biomarkers for personalized treatment is crucial. While immune cells play a key role in IBD, accurately identifying ulcer regions in whole slide images (WSIs) is essential for characterizing these cells and exploring potential therapeutics. Multiple instance learning (MIL) approaches have advanced WSI analysis but they lack spatial context awareness. In this work, we propose a weakly-supervised model called DomainGCN that employs a graph convolution neural network (GCN) and incorporates domain-specific knowledge of ulcer features, specifically, the presence of epithelium, lymphocytes, and debris for WSI-level ulcer prediction in IBD. We demonstrate that DomainGCN outperforms various state-of-the-art (SOTA) MIL methods and show the added value of domain knowledge.

STJul 5, 2021
Near-optimal inference in adaptive linear regression

Koulik Khamaru, Yash Deshpande, Tor Lattimore et al.

When data is collected in an adaptive manner, even simple methods like ordinary least squares can exhibit non-normal asymptotic behavior. As an undesirable consequence, hypothesis tests and confidence intervals based on asymptotic normality can lead to erroneous results. We propose a family of online debiasing estimators to correct these distributional anomalies in least squares estimation. Our proposed methods take advantage of the covariance structure present in the dataset and provide sharper estimates in directions for which more information has accrued. We establish an asymptotic normality property for our proposed online debiasing estimators under mild conditions on the data collection process and provide asymptotically exact confidence intervals. We additionally prove a minimax lower bound for the adaptive linear regression problem, thereby providing a baseline by which to compare estimators. There are various conditions under which our proposed estimators achieve the minimax lower bound. We demonstrate the usefulness of our theory via applications to multi-armed bandit, autoregressive time series estimation, and active learning with exploration.

MENov 4, 2019
Online Debiasing for Adaptively Collected High-dimensional Data with Applications to Time Series Analysis

Yash Deshpande, Adel Javanmard, Mohammad Mehrabi

Adaptive collection of data is commonplace in applications throughout science and engineering. From the point of view of statistical inference however, adaptive data collection induces memory and correlation in the samples, and poses significant challenge. We consider the high-dimensional linear regression, where the samples are collected adaptively, and the sample size $n$ can be smaller than $p$, the number of covariates. In this setting, there are two distinct sources of bias: the first due to regularization imposed for consistent estimation, e.g. using the LASSO, and the second due to adaptivity in collecting the samples. We propose "online debiasing", a general procedure for estimators such as the LASSO, which addresses both sources of bias. In two concrete contexts $(i)$ time series analysis and $(ii)$ batched data collection, we demonstrate that online debiasing optimally debiases the LASSO estimate when the underlying parameter $θ_0$ has sparsity of order $o(\sqrt{n}/\log p)$. In this regime, the debiased estimator can be used to compute $p$-values and confidence intervals of optimal size.

SIJul 23, 2018
Contextual Stochastic Block Models

Yash Deshpande, Andrea Montanari, Elchanan Mossel et al.

We provide the first information theoretic tight analysis for inference of latent community structure given a sparse graph along with high dimensional node covariates, correlated with the same latent communities. Our work bridges recent theoretical breakthroughs in the detection of latent community structure without nodes covariates and a large body of empirical work using diverse heuristics for combining node covariates with graphs for inference. The tightness of our analysis implies in particular, the information theoretical necessity of combining the different sources of information. Our analysis holds for networks of large degrees as well as for a Gaussian version of the model.

MLDec 18, 2017
Accurate Inference for Adaptive Linear Models

Yash Deshpande, Lester Mackey, Vasilis Syrgkanis et al.

Estimators computed from adaptively collected data do not behave like their non-adaptive brethren. Rather, the sequential dependence of the collection policy can lead to severe distributional biases that persist even in the infinite data limit. We develop a general method -- $\mathbf{W}$-decorrelation -- for transforming the bias of adaptive linear regression estimators into variance. The method uses only coarse-grained information about the data collection policy and does not need access to propensity scores or exact knowledge of the policy. We bound the finite-sample bias and variance of the $\mathbf{W}$-estimator and develop asymptotically correct confidence intervals based on a novel martingale central limit theorem. We then demonstrate the empirical benefits of the generic $\mathbf{W}$-decorrelation procedure in two different adaptive data settings: the multi-armed bandit and the autoregressive time series.

MLSep 19, 2017
Inference in Graphical Models via Semidefinite Programming Hierarchies

Murat A. Erdogdu, Yash Deshpande, Andrea Montanari

Maximum A posteriori Probability (MAP) inference in graphical models amounts to solving a graph-structured combinatorial optimization problem. Popular inference algorithms such as belief propagation (BP) and generalized belief propagation (GBP) are intimately related to linear programming (LP) relaxation within the Sherali-Adams hierarchy. Despite the popularity of these algorithms, it is well understood that the Sum-of-Squares (SOS) hierarchy based on semidefinite programming (SDP) can provide superior guarantees. Unfortunately, SOS relaxations for a graph with $n$ vertices require solving an SDP with $n^{Θ(d)}$ variables where $d$ is the degree in the hierarchy. In practice, for $d\ge 4$, this approach does not scale beyond a few tens of variables. In this paper, we propose binary SDP relaxations for MAP inference using the SOS hierarchy with two innovations focused on computational efficiency. Firstly, in analogy to BP and its variants, we only introduce decision variables corresponding to contiguous regions in the graphical model. Secondly, we solve the resulting SDP using a non-convex Burer-Monteiro style method, and develop a sequential rounding procedure. We demonstrate that the resulting algorithm can solve problems with tens of thousands of variables within minutes, and outperforms BP and GBP on practical problems such as image denoising and Ising spin glasses. Finally, for specific graph types, we establish a sufficient condition for the tightness of the proposed partial SOS relaxation.

CCFeb 23, 2015
Improved Sum-of-Squares Lower Bounds for Hidden Clique and Hidden Submatrix Problems

Yash Deshpande, Andrea Montanari

Given a large data matrix $A\in\mathbb{R}^{n\times n}$, we consider the problem of determining whether its entries are i.i.d. with some known marginal distribution $A_{ij}\sim P_0$, or instead $A$ contains a principal submatrix $A_{{\sf Q},{\sf Q}}$ whose entries have marginal distribution $A_{ij}\sim P_1\neq P_0$. As a special case, the hidden (or planted) clique problem requires to find a planted clique in an otherwise uniformly random graph. Assuming unbounded computational resources, this hypothesis testing problem is statistically solvable provided $|{\sf Q}|\ge C \log n$ for a suitable constant $C$. However, despite substantial effort, no polynomial time algorithm is known that succeeds with high probability when $|{\sf Q}| = o(\sqrt{n})$. Recently Meka and Wigderson \cite{meka2013association}, proposed a method to establish lower bounds within the Sum of Squares (SOS) semidefinite hierarchy. Here we consider the degree-$4$ SOS relaxation, and study the construction of \cite{meka2013association} to prove that SOS fails unless $k\ge C\, n^{1/3}/\log n$. An argument presented by Barak implies that this lower bound cannot be substantially improved unless the witness construction is changed in the proof. Our proof uses the moments method to bound the spectrum of a certain random association scheme, i.e. a symmetric random matrix whose rows and columns are indexed by the edges of an Erdös-Renyi random graph.

STNov 20, 2013
Sparse PCA via Covariance Thresholding

Yash Deshpande, Andrea Montanari

In sparse principal component analysis we are given noisy observations of a low-rank matrix of dimension $n\times p$ and seek to reconstruct it under additional sparsity assumptions. In particular, we assume here each of the principal components $\mathbf{v}_1,\dots,\mathbf{v}_r$ has at most $s_0$ non-zero entries. We are particularly interested in the high dimensional regime wherein $p$ is comparable to, or even much larger than $n$. In an influential paper, \cite{johnstone2004sparse} introduced a simple algorithm that estimates the support of the principal vectors $\mathbf{v}_1,\dots,\mathbf{v}_r$ by the largest entries in the diagonal of the empirical covariance. This method can be shown to identify the correct support with high probability if $s_0\le K_1\sqrt{n/\log p}$, and to fail with high probability if $s_0\ge K_2 \sqrt{n/\log p}$ for two constants $0<K_1,K_2<\infty$. Despite a considerable amount of work over the last ten years, no practical algorithm exists with provably better support recovery guarantees. Here we analyze a covariance thresholding algorithm that was recently proposed by \cite{KrauthgamerSPCA}. On the basis of numerical simulations (for the rank-one case), these authors conjectured that covariance thresholding correctly recover the support with high probability for $s_0\le K\sqrt{n}$ (assuming $n$ of the same order as $p$). We prove this conjecture, and in fact establish a more general guarantee including higher-rank as well as $n$ much smaller than $p$. Recent lower bounds \cite{berthet2013computational, ma2015sum} suggest that no polynomial time algorithm can do significantly better. The key technical component of our analysis develops new bounds on the norm of kernel random matrices, in regimes that were not considered before.

LGJan 8, 2013
Linear Bandits in High Dimension and Recommendation Systems

Yash Deshpande, Andrea Montanari

A large number of online services provide automated recommendations to help users to navigate through a large collection of items. New items (products, videos, songs, advertisements) are suggested on the basis of the user's past history and --when available-- her demographic profile. Recommendations have to satisfy the dual goal of helping the user to explore the space of available items, while allowing the system to probe the user's preferences. We model this trade-off using linearly parametrized multi-armed bandits, propose a policy and prove upper and lower bounds on the cumulative "reward" that coincide up to constants in the data poor (high-dimensional) regime. Prior work on linear bandits has focused on the data rich (low-dimensional) regime and used cumulative "risk" as the figure of merit. For this data rich regime, we provide a simple modification for our policy that achieves near-optimal risk performance under more restrictive assumptions on the geometry of the problem. We test (a variation of) the scheme used for establishing achievability on the Netflix and MovieLens datasets and obtain good agreement with the qualitative predictions of the theory we develop.