Rohit Singh

LG
h-index 27
16 papers
1,455 citations
Novelty 44%
AI Score 49

16 Papers

LG Oct 18, 2022
Granger causal inference on DAGs identifies genomic loci regulating transcription

Rohit Singh, Alexander P. Wu, Bonnie Berger

When a dynamical system can be modeled as a sequence of observations, Granger causality is a powerful approach for detecting predictive interactions between its variables. However, traditional Granger causal inference has limited utility in domains where the dynamics need to be represented as directed acyclic graphs (DAGs) rather than as a linear sequence, such as with cell differentiation trajectories. Here, we present GrID-Net, a framework based on graph neural networks with lagged message passing for Granger causal inference on DAG-structured systems. Our motivating application is the analysis of single-cell multimodal data to identify genomic loci that mediate the regulation of specific genes. To our knowledge, GrID-Net is the first single-cell analysis tool that accounts for the temporal lag between a genomic locus becoming accessible and its downstream effect on a target gene's expression. We applied GrID-Net on multimodal single-cell assays that profile chromatin accessibility (ATAC-seq) and gene expression (RNA-seq) in the same cell and show that it dramatically outperforms existing methods for inferring regulatory locus-gene links, achieving up to 71% greater agreement with independent population genetics-based estimates. By extending Granger causality to DAG-structured dynamical systems, our work unlocks new domains for causal analyses and, more specifically, opens a path towards elucidating gene regulatory interactions relevant to cellular differentiation and complex human diseases at unprecedented scale and resolution.
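For context, the classical sequence-based setting that the abstract contrasts with can be sketched as a linear Granger test. This is not GrID-Net, which uses graph neural networks with lagged message passing on DAGs. The function name `granger_stat`, the single-lag default, and the synthetic series are all illustrative assumptions.

```python
import numpy as np

def granger_stat(x, y, lag=1):
    """Log ratio of residual sums of squares: how much the lags of x improve
    the prediction of y beyond y's own lags (larger means stronger evidence)."""
    n = len(y)
    target = y[lag:]
    # Column k of each lag matrix holds the series shifted back by k+1 steps.
    y_lags = np.column_stack([y[lag - k - 1 : n - k - 1] for k in range(lag)])
    x_lags = np.column_stack([x[lag - k - 1 : n - k - 1] for k in range(lag)])
    ones = np.ones((n - lag, 1))
    reduced = np.hstack([ones, y_lags])        # y's own history only
    full = np.hstack([ones, y_lags, x_lags])   # y's history plus x's history

    def rss(design):
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        return np.sum((target - design @ coef) ** 2)

    return float(np.log(rss(reduced) / rss(full)))

# Synthetic example (assumed for illustration): x drives y with a one-step lag.
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + 0.1 * rng.normal()

x_to_y = granger_stat(x, y)   # lags of x help predict y: large value
y_to_x = granger_stat(y, x)   # lags of y do not help predict x: near zero
```

The statistic is asymmetric by construction, which is what makes it usable as a directed, predictive notion of interaction along a linear ordering of observations.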

LG Oct 20, 2022
Causally-guided Regularization of Graph Attention Improves Generalizability

Alexander P. Wu, Thomas Markovich, Bonnie Berger et al.

Graph attention networks estimate the relational importance of node neighbors to aggregate relevant information over local neighborhoods for a prediction task. However, the inferred attentions are vulnerable to spurious correlations and connectivity in the training data, hampering the generalizability of the model. We introduce CAR, a general-purpose regularization framework for graph attention networks. Embodying a causal inference approach, CAR aligns the attention mechanism with the causal effects of active interventions on graph connectivity in a scalable manner. CAR is compatible with a variety of graph attention architectures, and we show that it systematically improves generalizability on various node classification tasks. Our ablation studies indicate that CAR homes in on the aspects of graph structure most pertinent to the prediction (e.g., homophily), and does so more effectively than alternative approaches. Finally, we also show that CAR enhances interpretability of attention weights by accentuating node-neighbor relations that point to causal hypotheses. For social media network-sized graphs, a CAR-guided graph rewiring approach could allow us to combine the scalability of graph convolutional methods with the higher performance of graph attention.

18.7 IT Apr 13
ISAC-Enabled Non-Terrestrial Networks for 6G: Design Principles, Standardization, Performance Tradeoffs, and Use Cases

Muhammad Ali Jamshed, Rohit Singh, Malik Muhammad Saad et al.

Non-Terrestrial Networks (NTN) have emerged as a key enabler to fully realize the vision of integrated, intelligent, and ubiquitous connectivity in 6G systems. However, several operational challenges, including severe Doppler effects, interference, and latency, hinder the seamless integration of NTN and Terrestrial Networks (TN). In this context, Integrated Sensing and Communication (ISAC), which unifies sensing and communication functionalities within a common framework, offers great potential to address these challenges while enabling new network capabilities. Due to its complementary functionalities, ISAC can play a pivotal role in enhancing NTN performance, although its practical adoption requires a fundamental rethinking of existing architectural and standardization frameworks. Motivated by this need, this article examines key aspects of ISAC-enabled NTN, including architectural design principles, application scenarios, standardization challenges, and key performance tradeoffs. Finally, a representative case study is presented to illustrate major technical challenges and highlight promising future research directions for ISAC-enabled NTN.

61.1 SP May 12
Enabling AI-Native Mobility in 6G: A Real-World Dataset for Handover, Beam Management, and Timing Advance

Mannam Veera Narayana, Rohit Singh, Deepa M. R et al.

To address the high interruption time and measurement-report overhead that arise under user equipment (UE) mobility, especially in high-speed 5G use cases, AI/ML techniques for beam management and mobility procedures have been proposed. These techniques rely heavily on data that are most often simulated for various scenarios and do not accurately reflect real deployment behavior or user traffic patterns. Therefore, there is a pressing need for realistic datasets collected under various conditions. This work presents a dataset collected from a commercially deployed network across various modes of mobility (pedestrian, bike, car, bus, and train) and at multiple speeds to capture real-time UE mobility. When collecting the dataset, we focused primarily on handover (HO) scenarios, with the aim of reducing the HO interruption time and maintaining continuous throughput during and immediately after HO execution. To support this research, the dataset includes timing advance (TA) measurements at various signaling events, such as RACH trigger, MAC CE, and PDCCH grant, which are typically missing in existing works. We describe the creation of the dataset in detail: experimental setup, data acquisition, and extraction. We also present an exploratory analysis of the data, with a primary focus on mobility, beam management, and TA. We discuss multiple use cases in which the proposed dataset can facilitate understanding of AI/ML model inference. One such use case is to train and evaluate various AI/ML models for TA prediction.

LG Mar 8 Code
Reverse Distillation: Consistently Scaling Protein Language Model Representations

Darius Catrina, Christian Bepler, Samuel Sledzieski et al.

Unlike the predictable scaling laws in natural language processing and computer vision, protein language models (PLMs) scale poorly: for many tasks, models within the same family plateau or even decrease in performance, with mid-sized models often outperforming the largest in the family. We introduce Reverse Distillation, a principled framework that decomposes large PLM representations into orthogonal subspaces guided by smaller models of the same family. The resulting embeddings have a nested, Matryoshka-style structure: the first k dimensions of a larger model's embedding are exactly the representation from the smaller model. This ensures that larger reverse-distilled models consistently outperform smaller ones. A motivating intuition is that smaller models, constrained by capacity, preferentially encode broadly-shared protein features. Reverse distillation isolates these shared features and orthogonally extracts additional contributions from larger models, preventing interference between the two. On ProteinGym benchmarks, reverse-distilled ESM-2 variants outperform their respective baselines at the same embedding dimensionality, with the reverse-distilled 15 billion parameter model achieving the strongest performance. Our framework is generalizable to any model family where scaling challenges persist. Code and trained models are available at https://github.com/rohitsinghlab/plm_reverse_distillation.
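The nested, Matryoshka-style property described above can be illustrated with a small linear sketch. It regresses the larger embedding on the smaller one, keeps the principal directions of the residual, and appends them to the smaller embedding. This is one plausible construction consistent with the abstract, not the authors' implementation (see their repository for that). The function name `nested_embedding` and the random stand-in embeddings are assumptions.

```python
import numpy as np

def nested_embedding(small, large, out_dim):
    """small: (n, k) embeddings from the smaller model.
    large: (n, d) embeddings from the larger model.
    Returns (n, out_dim) whose first k columns are exactly `small`."""
    n, k = small.shape
    # Least-squares prediction of the large embedding from the small one.
    design = np.hstack([np.ones((n, 1)), small])
    coef, *_ = np.linalg.lstsq(design, large, rcond=None)
    # Residual: the part of `large` that `small` cannot explain linearly.
    residual = large - design @ coef
    # PCA of the residual via SVD; keep the top (out_dim - k) components.
    u, s, _ = np.linalg.svd(residual, full_matrices=False)
    extra = u[:, : out_dim - k] * s[: out_dim - k]
    return np.hstack([small, extra])

# Random stand-ins for per-protein embeddings (assumed for illustration).
rng = np.random.default_rng(0)
small = rng.normal(size=(200, 4))
large = np.hstack([small @ rng.normal(size=(4, 6)), rng.normal(size=(200, 4))])
emb = nested_embedding(small, large, out_dim=8)
```

Because least-squares residuals are orthogonal to the regressors, the appended columns are orthogonal to the smaller model's features, so the added dimensions cannot interfere with the shared ones.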

IT Jan 12, 2024
Enhancements for 5G NR PRACH Reception: An AI/ML Approach

Rohit Singh, Anil Kumar Yerrapragada, Jeeva Keshav S et al.

Random Access is an important step in enabling the initial attachment of a User Equipment (UE) to a Base Station (gNB). The UE identifies itself by embedding a Preamble Index (RAPID) in the phase rotation of a known base sequence, which it transmits on the Physical Random Access Channel (PRACH). The signal on the PRACH also enables the estimation of propagation delay, often known as Timing Advance (TA), which is induced by virtue of the UE's position. Traditional receivers estimate the RAPID and TA using correlation-based techniques. This paper presents an alternative receiver approach that uses AI/ML models, wherein two neural networks are proposed, one for the RAPID and one for the TA. Different from other works, these two models can run in parallel as opposed to sequentially. Experiments with both simulated data and over-the-air hardware captures highlight the improved performance of the proposed AI/ML-based techniques compared to conventional correlation methods.
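The conventional correlation-based receiver that the abstract uses as its baseline can be sketched as follows. This is a simplified, single-antenna, time-domain illustration and not the paper's AI/ML receiver. The sequence length, root index, cyclic-shift spacing `N_CS`, noise level, and function names are illustrative assumptions rather than values taken from the paper.

```python
import numpy as np

N_ZC = 139   # Zadoff-Chu sequence length (prime); illustrative choice
ROOT = 1     # root index; illustrative choice
N_CS = 13    # cyclic-shift spacing between preamble indices; illustrative

def zadoff_chu(root, length):
    """Zadoff-Chu base sequence with ideal cyclic autocorrelation."""
    n = np.arange(length)
    return np.exp(-1j * np.pi * root * n * (n + 1) / length)

def detect(rx, base):
    """Return (preamble_index, delay_in_samples) from the correlation peak."""
    # Circular cross-correlation computed in the frequency domain.
    corr = np.fft.ifft(np.fft.fft(rx) * np.conj(np.fft.fft(base)))
    peak = int(np.argmax(np.abs(corr)))
    # The window the peak falls in gives the preamble; the offset gives the delay.
    return peak // N_CS, peak % N_CS

base = zadoff_chu(ROOT, N_ZC)
rng = np.random.default_rng(0)
preamble_index, delay = 5, 3
# Received signal: base sequence cyclically shifted by preamble offset plus delay.
rx = np.roll(base, preamble_index * N_CS + delay)
rx = rx + 0.1 * (rng.normal(size=N_ZC) + 1j * rng.normal(size=N_ZC))
detected = detect(rx, base)
```

In this sketch the preamble index and the delay are both read from a single correlation peak. The sketch only works while the delay stays below `N_CS`.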

SP Nov 3, 2024
A Machine Learning based Hybrid Receiver for 5G NR PRACH

Rohit Singh, Anil Kumar Yerrapragada, Radha Krishna Ganti

Random Access is a critical procedure by which a User Equipment (UE) identifies itself to a Base Station (BS). Random Access starts with the UE transmitting a random preamble on the Physical Random Access Channel (PRACH). In a conventional BS receiver, the UE's specific preamble is identified by correlation with all the possible preambles. The PRACH signal is also used to estimate the timing advance, which is induced by propagation delay. Correlation-based receivers suffer from false peaks and missed detections in scenarios dominated by high fading and low signal-to-noise ratio. This paper describes the design of a hybrid receiver that consists of an AI/ML model for preamble detection followed by conventional peak detection for the Timing Advance estimation. The proposed receiver combines the Power Delay Profiles of correlation windows across multiple antennas and uses the combination as input to a Neural Network model. The model predicts the presence or absence of a user in a particular preamble window, after which the timing advance is estimated by peak detection. Results show superior performance of the hybrid receiver compared to conventional receivers for both simulated and real hardware-captured datasets.

CR Feb 18, 2021
Consenting to Internet of Things Across Different Social Settings

Yasasvi Hari, Rohit Singh, Kizito Nyuytiymbiy et al.

Devices connected to the Internet of Things (IoT) are rapidly becoming ubiquitous across modern homes, workplaces, and other social environments. While these devices provide users with extensive functionality, they pose significant privacy concerns because it is difficult for users to consent to them. In this work, we present the results of a pilot study that shows how users consent to devices in common locations at a friend's house in which the user is a guest attending a party. We use this pilot study to indicate a direction for a larger study, which will capture a more granular understanding of how users will consent to a variety of devices placed in different social settings (e.g., a party house owned by a friend, an office space for the user and some 40 other employees, the bathroom of a department store). The final contribution of this work will be a probability distribution that indicates how likely a given user is to consent to a device given what sensors it has, where it is, and the awareness and preferences of that user.

NI Sep 22, 2020
Ultra-dense Low Data Rate (UDLD) Communication in the THz

Rohit Singh, Doug Sicker

In the future, with the advent of Internet of Things (IoT), wireless sensors, and multiple 5G killer applications, an indoor room might be filled with thousands of devices demanding low data rates. Such high-level densification and mobility of these devices will overwhelm the system and result in higher interference, frequent outages, and lower coverage. The THz band has a massive amount of greenfield spectrum to cater to this dense indoor deployment. However, the limited coverage range of THz will require networks to have more infrastructure and depend on non-line-of-sight (NLOS) communication. This form of communication might not be profitable for network operators and can even result in inefficient resource utilization for devices demanding low data rates. Using distributed device-to-device (D2D) communication in the THz band, we can cater to these Ultra-dense Low Data Rate (UDLD) applications. D2D in THz can be challenging, but with opportunistic allocation and smart learning algorithms, these challenges can be mitigated. We propose a 2-Layered distributed D2D model, where devices use coordinated multi-agent reinforcement learning (MARL) to maximize efficiency and user coverage for dense indoor deployment. We show that densification and mobility in a network can be used to extend the limited coverage range of THz devices, without the need for extra infrastructure or resources.

NI Nov 13, 2019
MOTH- Mobility-induced Outages in THz: A Beyond 5G (B5G) application

Rohit Singh, Douglas Sicker, Kazi Mohammed Saidul Huq

5G will serve the growing demand for Internet of Things (IoT), high-resolution video streaming, and low latency wireless services. Demand for such services is expected to grow rapidly, which will require a search for Beyond 5G technological advancements in wireless communications. Part of these advancements is the need for additional spectrum, namely moving toward the terahertz (THz) range. To compensate for the high path loss in THz, narrow beamwidths are used to improve antenna gains. However, with narrow beamwidths, even minor fluctuations in device location (such as through body movement) can cause frequent link failures due to beam misalignment. In this paper, we provide a solution to these small-scale indoor movements that result in mobility-induced outages. Like a moth that randomly flutters about, Mobility-induced Outages in THz (MOTH) can be ephemeral in nature and hard to avoid. To deal with MOTH, we propose two methods to predict these outage scenarios: (i) Align-After-Failure (AAF), which predicts based on fixed time margins, and (ii) Align-Before-Failure (ABF), which learns the time margins through user mobility patterns. Two different online classifiers were used to train the ABF model to predict whether a mobility-induced outage is going to occur, thereby significantly reducing the time spent in outage scenarios. Simulation results demonstrate a relationship between optimal beamwidth and human mobility patterns. Additionally, to cater to a future with dense deployment of Wireless Personal Area Networks (WPANs), it is necessary that we have efficient deployment of resources (e.g., THz-APs). One solution is to maximize the user coverage for a single AP, which might be dependent on multiple parameters. We identify these parameters and observe their tradeoffs for improving user coverage through a single THz-AP.

AP Mar 8, 2019
Stay Ahead of Poachers: Illegal Wildlife Poaching Prediction and Patrol Planning Under Uncertainty with Field Test Evaluations

Lily Xu, Shahrzad Gholami, Sara Mc Carthy et al.

Illegal wildlife poaching threatens ecosystems and drives endangered species toward extinction. However, efforts for wildlife protection are constrained by the limited resources of law enforcement agencies. To help combat poaching, the Protection Assistant for Wildlife Security (PAWS) is a machine learning pipeline that has been developed as a data-driven approach to identify areas at high risk of poaching throughout protected areas and compute optimal patrol routes. In this paper, we take an end-to-end approach to the data-to-deployment pipeline for anti-poaching. In doing so, we address challenges including extreme class imbalance (up to 1:200), bias, and uncertainty in wildlife poaching data to enhance PAWS, and we apply our methodology to three national parks with diverse characteristics. (i) We use Gaussian processes to quantify predictive uncertainty, which we exploit to improve robustness of our prescribed patrols and increase detection of snares by an average of 30%. We evaluate our approach on real-world historical poaching data from Murchison Falls and Queen Elizabeth National Parks in Uganda and, for the first time, Srepok Wildlife Sanctuary in Cambodia. (ii) We present the results of large-scale field tests conducted in Murchison Falls and Srepok Wildlife Sanctuary which confirm that the predictive power of PAWS extends promisingly to multiple parks. This paper is part of an effort to expand PAWS to 800 parks around the world through integration with SMART conservation software.

LG Jan 17, 2019
Applying SVGD to Bayesian Neural Networks for Cyclical Time-Series Prediction and Inference

Xinyu Hu, Paul Szerlip, Theofanis Karaletsos et al.

A regression-based BNN model is proposed to predict spatiotemporal quantities like hourly rider demand with calibrated uncertainties. The main contributions of this paper are (i) a feed-forward deterministic neural network (DetNN) architecture that predicts cyclical time series data with sensitivity to anomalous forecasting events; and (ii) a Bayesian framework applying SVGD to train large neural networks for such tasks, capable of producing time series predictions as well as measures of uncertainty surrounding the predictions. Experiments show that the proposed BNN reduces average estimation error by 10% across 8 U.S. cities compared to a fine-tuned multilayer perceptron (MLP), and by 4% compared to the same network architecture trained without SVGD.
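For readers unfamiliar with SVGD (Stein variational gradient descent), the particle update it relies on can be sketched in one dimension. This toy targets a Gaussian rather than neural-network weights and is not the paper's training setup. The fixed kernel bandwidth, step size, particle count, and function name `svgd_step` are illustrative assumptions.

```python
import numpy as np

def svgd_step(particles, grad_log_p, step=0.1, bandwidth=1.0):
    """One SVGD update for 1-D particles using an RBF kernel."""
    diff = particles[:, None] - particles[None, :]   # diff[i, j] = x_i - x_j
    k = np.exp(-diff ** 2 / bandwidth)               # k[i, j] = k(x_i, x_j)
    # Driving term: kernel-weighted average of the score at every particle.
    attract = k @ grad_log_p(particles)
    # Repulsive term: sum over j of d/dx_j k(x_j, x_i); keeps particles spread out.
    repel = np.sum(2.0 * diff / bandwidth * k, axis=1)
    return particles + step * (attract + repel) / len(particles)

# Toy target (assumed for illustration): a Gaussian with mean 2 and unit variance.
def grad_log_p(x):
    return -(x - 2.0)

rng = np.random.default_rng(0)
particles = rng.normal(size=50)   # start around 0
for _ in range(500):
    particles = svgd_step(particles, grad_log_p)

final_mean = float(particles.mean())
final_std = float(particles.std())
```

The repulsive term is what distinguishes SVGD from running gradient ascent on each particle independently. Without it, every particle would collapse to the mode and the ensemble would carry no uncertainty information.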

MA Nov 6, 2018
Deep Reinforcement Learning for Green Security Games with Real-Time Information

Yufei Wang, Zheyuan Ryan Shi, Lantao Yu et al.

Green Security Games (GSGs) have been proposed and applied to optimize patrols conducted by law enforcement agencies in green security domains such as combating poaching, illegal logging and overfishing. However, real-time information such as footprints and agents' subsequent actions upon receiving the information, e.g., rangers following the footprints to chase the poacher, have been neglected in previous work. To fill the gap, we first propose a new game model GSG-I which augments GSGs with sequential movement and the vital element of real-time information. Second, we design a novel deep reinforcement learning-based algorithm, DeDOL, to compute a patrolling strategy that adapts to the real-time information against a best-responding attacker. DeDOL is built upon the double oracle framework and the policy-space response oracle, solving a restricted game and iteratively adding best response strategies to it through training deep Q-networks. Exploring the game structure, DeDOL uses domain-specific heuristic strategies as initial strategies and constructs several local modes for efficient and parallelized training. To our knowledge, this is the first attempt to use Deep Q-Learning for security games.

LG Oct 18, 2018
Pyro: Deep Universal Probabilistic Programming

Eli Bingham, Jonathan P. Chen, Martin Jankowiak et al.

Pyro is a probabilistic programming language built on Python as a platform for developing advanced probabilistic models in AI research. To scale to large datasets and high-dimensional models, Pyro uses stochastic variational inference algorithms and probability distributions built on top of PyTorch, a modern GPU-accelerated deep learning framework. To accommodate complex or model-specific algorithmic behavior, Pyro leverages Poutine, a library of composable building blocks for modifying the behavior of probabilistic programs.

GT May 5, 2018
Designing the Game to Play: Optimizing Payoff Structure in Security Games

Zheyuan Ryan Shi, Ziye Tang, Long Tran-Thanh et al.

Effective game-theoretic modeling of defender-attacker behavior is becoming increasingly important. In many domains, the defender functions not only as a player but also as the designer of the game's payoff structure. We study Stackelberg Security Games where the defender, in addition to allocating defensive resources to protect targets from the attacker, can strategically manipulate the attacker's payoff under budget constraints in weighted L^p-norm form regarding the amount of change. Focusing on problems with a weighted L^1-norm form constraint, we present (i) a mixed integer linear program-based algorithm with an approximation guarantee; (ii) a branch-and-bound based algorithm with improved efficiency achieved by effective pruning; and (iii) a polynomial time approximation scheme for a special but practical class of problems. In addition, we show that problems under budget constraints in L^0-norm form and weighted L^\infty-norm form can be solved in polynomial time. We provide an extensive experimental evaluation of our proposed algorithms.

CR Apr 5, 2013
Data Hiding in Binary Image using Block Parity

Sipendra Sinha, Amol Gaikwad, Deepak Kumar et al.

Hiding secret data in binary images is more difficult than in other formats, since binary images use only one bit per pixel to indicate black or white. This study proposes a new method for data hiding in binary images that uses an optimized bit position to embed each secret bit. The method operates on blocks into which the image is subdivided. The parity of a given block determines whether a pixel must be changed to embed a secret bit. By finding the best position to insert a secret bit in each block, the image quality of the resulting stego-image can be improved while maintaining low computational complexity. The experimental results show that the proposed method improves on a previous work.
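The block-parity idea can be sketched in a few lines. Each block carries one secret bit, encoded as the parity of its pixel sum, and at most one pixel per block is flipped. The sketch below always flips the first pixel of a block; the paper's contribution is choosing a better position, which is not reproduced here. The function names, the flat pixel list, and the block size are illustrative assumptions.

```python
def embed_bit(block, bit):
    """Return a copy of `block` (list of 0/1 pixels) whose parity equals `bit`."""
    out = list(block)
    if sum(out) % 2 != bit:
        # Placeholder position choice; the paper optimizes which pixel to flip.
        out[0] ^= 1
    return out

def embed(pixels, bits, block_size=4):
    """Embed one secret bit per consecutive block of `block_size` pixels."""
    stego = list(pixels)
    for i, bit in enumerate(bits):
        start = i * block_size
        stego[start:start + block_size] = embed_bit(stego[start:start + block_size], bit)
    return stego

def extract(pixels, n_bits, block_size=4):
    """Recover the secret bits as the parity of each block."""
    return [sum(pixels[i * block_size:(i + 1) * block_size]) % 2 for i in range(n_bits)]

cover = [0, 1, 1, 0,  1, 1, 1, 0,  0, 0, 0, 0]
secret = [1, 0, 1]
stego = embed(cover, secret)
recovered = extract(stego, len(secret))
changed = sum(a != b for a, b in zip(cover, stego))
```

Extraction needs only the block size and the number of embedded bits, and the distortion is bounded by one flipped pixel per embedded bit.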