Bhaskar Krishnamachari

LG
Semantic Scholar Profile
h-index16
60papers
1,032citations
Novelty39%
AI Score53

60 Papers

LGJul 28, 2023
Holistic Survey of Privacy and Fairness in Machine Learning

Sina Shaham, Arash Hajisafi, Minh K Quan et al. · amazon-science

Privacy and fairness are two crucial pillars of responsible Artificial Intelligence (AI) and trustworthy Machine Learning (ML). Each objective has been independently studied in the literature with the aim of reducing utility loss in achieving them. Despite the significant interest attracted from both academia and industry, there remains an immediate demand for more in-depth research to unravel how these two objectives can be simultaneously integrated into ML models. As opposed to well-accepted trade-offs, i.e., privacy-utility and fairness-utility, the interrelation between privacy and fairness is not well-understood. While some works suggest a trade-off between the two objective functions, there are others that demonstrate the alignment of these functions in certain scenarios. To fill this research gap, we provide a thorough review of privacy and fairness in ML, including supervised, unsupervised, semi-supervised, and reinforcement learning. After examining and consolidating the literature on both objectives, we present a holistic survey on the impact of privacy on fairness, the impact of fairness on privacy, existing architectures, their interaction in application domains, and algorithms that aim to achieve both objectives while minimizing the utility sacrificed. Finally, we identify research challenges in achieving privacy and fairness concurrently in ML, particularly focusing on large language models.

OCApr 3, 2011
LIFO-Backpressure Achieves Near Optimal Utility-Delay Tradeoff

Longbo Huang, Scott Moeller, Michael J. Neely et al.

There has been considerable recent work developing a new stochastic network utility maximization framework using Backpressure algorithms, also known as MaxWeight. A key open problem has been the development of utility-optimal algorithms that are also delay efficient. In this paper, we show that the Backpressure algorithm, when combined with the LIFO queueing discipline (called LIFO-Backpressure), is able to achieve a utility that is within $O(1/V)$ of the optimal value, while maintaining an average delay of $O([\log(V)]^2)$ for all but a tiny fraction of the network traffic. This result holds for general stochastic network optimization problems and general Markovian dynamics. Remarkably, the performance of LIFO-Backpressure can be achieved by simply changing the queueing discipline; it requires no other modifications of the original Backpressure algorithm. We validate the results through empirical measurements from a sensor network testbed, which show good match between theory and practice.

NIAug 19, 2011
Backpressure with Adaptive Redundancy (BWAR)

Majed Alresaini, Maheswaran Sathiamoorthy, Bhaskar Krishnamachari et al.

Backpressure scheduling and routing, in which packets are preferentially transmitted over links with high queue differentials, offers the promise of throughput-optimal operation for a wide range of communication networks. However, when the traffic load is low, due to the corresponding low queue occupancy, backpressure scheduling/routing experiences long delays. This is particularly of concern in intermittent encounter-based mobile networks which are already delay-limited due to the sparse and highly dynamic network connectivity. While state of the art mechanisms for such networks have proposed the use of redundant transmissions to improve delay, they do not work well when the traffic load is high. We propose in this paper a novel hybrid approach that we refer to as backpressure with adaptive redundancy (BWAR), which provides the best of both worlds. This approach is highly robust and distributed and does not require any prior knowledge of network load conditions. We evaluate BWAR through both mathematical analysis and simulations based on cell-partitioned model. We prove theoretically that BWAR does not perform worse than traditional backpressure in terms of the maximum throughput, while yielding a better delay bound. The simulations confirm that BWAR outperforms traditional backpressure at low load, while outperforming a state of the art encounter-routing scheme (Spray and Wait) at high load.

OCMar 20, 2011
On the Combinatorial Multi-Armed Bandit Problem with Markovian Rewards

Yi Gai, Bhaskar Krishnamachari, Mingyan Liu

We consider a combinatorial generalization of the classical multi-armed bandit problem that is defined as follows. There is a given bipartite graph of $M$ users and $N \geq M$ resources. For each user-resource pair $(i,j)$, there is an associated state that evolves as an aperiodic irreducible finite-state Markov chain with unknown parameters, with transitions occurring each time the particular user $i$ is allocated resource $j$. The user $i$ receives a reward that depends on the corresponding state each time it is allocated the resource $j$. The system objective is to learn the best matching of users to resources so that the long-term sum of the rewards received by all users is maximized. This corresponds to minimizing regret, defined here as the gap between the expected total reward that can be obtained by the best-possible static matching and the expected total reward that can be achieved by a given algorithm. We present a polynomial-storage and polynomial-complexity-per-step matching-learning algorithm for this problem. We show that this algorithm can achieve a regret that is uniformly arbitrarily close to logarithmic in time and polynomial in the number of users and resources. This formulation is broadly applicable to scheduling and switching problems in networks and significantly extends prior results in the area.

DCMay 1
ncsim: A Lightweight Simulator for Networked Edge Computing with Wireless Interference Modeling

Bhaskar Krishnamachari, Maya Gutierrez, Jared Coleman

Evaluating DAG task schedulers for wireless edge computing requires jointly modeling compute placement and wireless interference, yet existing tools treat them in isolation. This gap leads to rank inversions: the scheduler that appears optimal under an interference-free model can be the worst choice under realistic wireless conditions. We present ncsim, a lightweight discrete-event simulator that bridges this gap by combining DAG workflow scheduling with physically-grounded IEEE 802.11 CSMA/CA interference modeling in a single Python package. A 108-run factorial experiment reveals rank inversions in 27.8% of scenarios, with the interference-free-optimal scheduler producing up to 2.7x worse makespan than a simple round-robin baseline; scaling to a 100-node random geometric graph raises the inversion rate to 50%. These rank inversions show that interference-free evaluation can select the wrong algorithm entirely, justifying the design and use of ncsim.

LGSep 9, 2011
Online Learning Algorithms for Stochastic Water-Filling

Yi Gai, Bhaskar Krishnamachari

Water-filling is the term for the classic solution to the problem of allocating constrained power to a set of parallel channels to maximize the total data-rate. It is used widely in practice, for example, for power allocation to sub-carriers in multi-user OFDM systems such as WiMax. The classic water-filling algorithm is deterministic and requires perfect knowledge of the channel gain to noise ratios. In this paper we consider how to do power allocation over stochastically time-varying (i.i.d.) channels with unknown gain to noise ratio distributions. We adopt an online learning framework based on stochastic multi-armed bandits. We consider two variations of the problem, one in which the goal is to find a power allocation to maximize $\sum\limits_i \mathbb{E}[\log(1 + SNR_i)]$, and another in which the goal is to find a power allocation to maximize $\sum\limits_i \log(1 + \mathbb{E}[SNR_i])$. For the first problem, we propose a \emph{cognitive water-filling} algorithm that we call CWF1. We show that CWF1 obtains a regret (defined as the cumulative gap over time between the sum-rate obtained by a distribution-aware genie and this policy) that grows polynomially in the number of channels and logarithmically in time, implying that it asymptotically achieves the optimal time-averaged rate that can be obtained when the gain distributions are known. For the second problem, we present an algorithm called CWF2, which is, to our knowledge, the first algorithm in the literature on stochastic multi-armed bandits to exploit non-linear dependencies between the arms. We prove that the number of times CWF2 picks the incorrect power allocation is bounded by a function that is polynomial in the number of channels and logarithmic in time, implying that its frequency of incorrect allocation tends to zero.

LGSep 7, 2011
Efficient Online Learning for Opportunistic Spectrum Access

Wenhan Dai, Yi Gai, Bhaskar Krishnamachari

The problem of opportunistic spectrum access in cognitive radio networks has been recently formulated as a non-Bayesian restless multi-armed bandit problem. In this problem, there are N arms (corresponding to channels) and one player (corresponding to a secondary user). The state of each arm evolves as a finite-state Markov chain with unknown parameters. At each time slot, the player can select K < N arms to play and receives state-dependent rewards (corresponding to the throughput obtained given the activity of primary users). The objective is to maximize the expected total rewards (i.e., total throughput) obtained over multiple plays. The performance of an algorithm for such a multi-armed bandit problem is measured in terms of regret, defined as the difference in expected reward compared to a model-aware genie who always plays the best K arms. In this paper, we propose a new continuous exploration and exploitation (CEE) algorithm for this problem. When no information is available about the dynamics of the arms, CEE is the first algorithm to guarantee near-logarithmic regret uniformly over time. When some bounds corresponding to the stationary state distributions and the state-dependent rewards are known, we show that CEE can be easily modified to achieve logarithmic regret over time. In contrast, prior algorithms require additional information concerning bounds on the second eigenvalues of the transition matrices in order to guarantee logarithmic regret. Finally, we show through numerical simulations that CEE is more efficient than prior algorithms.

LGNov 28, 2022
QLAMMP: A Q-Learning Agent for Optimizing Fees on Automated Market Making Protocols

Dev Churiwala, Bhaskar Krishnamachari

Automated Market Makers (AMMs) have cemented themselves as an integral part of the decentralized finance (DeFi) space. AMMs are a type of exchange that allows users to trade assets without the need for a centralized exchange. They form the foundation for numerous decentralized exchanges (DEXs), which help facilitate the quick and efficient exchange of on-chain tokens. All present-day popular DEXs are static protocols, with fixed parameters controlling the fee and the curvature - they suffer from invariance and cannot adapt to quickly changing market conditions. This characteristic may cause traders to stay away during high slippage conditions brought about by intractable market movements. We propose an RL framework to optimize the fees collected on an AMM protocol. In particular, we develop a Q-Learning Agent for Market Making Protocols (QLAMMP) that learns the optimal fee rates and leverage coefficients for a given AMM protocol and maximizes the expected fee collected under a range of different market conditions. We show that QLAMMP is consistently able to outperform its static counterparts under all the simulated test conditions.

LGSep 21, 2023
Trip Planning for Autonomous Vehicles with Wireless Data Transfer Needs Using Reinforcement Learning

Yousef AlSaqabi, Bhaskar Krishnamachari

With recent advancements in the field of communications and the Internet of Things, vehicles are becoming more aware of their environment and are evolving towards full autonomy. Vehicular communication opens up the possibility for vehicle-to-infrastructure interaction, where vehicles could share information with components such as cameras, traffic lights, and signage that support a countrys road system. As a result, vehicles are becoming more than just a means of transportation; they are collecting, processing, and transmitting massive amounts of data used to make driving safer and more convenient. With 5G cellular networks and beyond, there is going to be more data bandwidth available on our roads, but it may be heterogeneous because of limitations like line of sight, infrastructure, and heterogeneous traffic on the road. This paper addresses the problem of route planning for autonomous vehicles in urban areas accounting for both driving time and data transfer needs. We propose a novel reinforcement learning solution that prioritizes high bandwidth roads to meet a vehicles data transfer requirement, while also minimizing driving time. We compare this approach to traffic-unaware and bandwidth-unaware baselines to show how much better it performs under heterogeneous traffic. This solution could be used as a starting point to understand what good policies look like, which could potentially yield faster, more efficient heuristics in the future.

AISep 2, 2022
Learning Practical Communication Strategies in Cooperative Multi-Agent Reinforcement Learning

Diyi Hu, Chi Zhang, Viktor Prasanna et al.

In Multi-Agent Reinforcement Learning, communication is critical to encourage cooperation among agents. Communication in realistic wireless networks can be highly unreliable due to network conditions varying with agents' mobility, and stochasticity in the transmission process. We propose a framework to learn practical communication strategies by addressing three fundamental questions: (1) When: Agents learn the timing of communication based on not only message importance but also wireless channel conditions. (2) What: Agents augment message contents with wireless network measurements to better select the game and communication actions. (3) How: Agents use a novel neural message encoder to preserve all information from received messages, regardless of the number and order of messages. Simulating standard benchmarks under realistic wireless network settings, we show significant improvements in game performance, convergence speed and communication efficiency compared with state-of-the-art.

SENov 13, 2023
Testing learning-enabled cyber-physical systems with Large-Language Models: A Formal Approach

Xi Zheng, Aloysius K. Mok, Ruzica Piskac et al.

The integration of machine learning (ML) into cyber-physical systems (CPS) offers significant benefits, including enhanced efficiency, predictive capabilities, real-time responsiveness, and the enabling of autonomous operations. This convergence has accelerated the development and deployment of a range of real-world applications, such as autonomous vehicles, delivery drones, service robots, and telemedicine procedures. However, the software development life cycle (SDLC) for AI-infused CPS diverges significantly from traditional approaches, featuring data and learning as two critical components. Existing verification and validation techniques are often inadequate for these new paradigms. In this study, we pinpoint the main challenges in ensuring formal safety for learningenabled CPS.We begin by examining testing as the most pragmatic method for verification and validation, summarizing the current state-of-the-art methodologies. Recognizing the limitations in current testing approaches to provide formal safety guarantees, we propose a roadmap to transition from foundational probabilistic testing to a more rigorous approach capable of delivering formal assurance.

OCSep 7, 2011
The Non-Bayesian Restless Multi-Armed Bandit: A Case of Near-Logarithmic Strict Regret

Wenhan Dai, Yi Gai, Bhaskar Krishnamachari et al.

In the classic Bayesian restless multi-armed bandit (RMAB) problem, there are $N$ arms, with rewards on all arms evolving at each time as Markov chains with known parameters. A player seeks to activate $K \geq 1$ arms at each time in order to maximize the expected total reward obtained over multiple plays. RMAB is a challenging problem that is known to be PSPACE-hard in general. We consider in this work the even harder non-Bayesian RMAB, in which the parameters of the Markov chain are assumed to be unknown \emph{a priori}. We develop an original approach to this problem that is applicable when the corresponding Bayesian problem has the structure that, depending on the known parameter values, the optimal solution is one of a prescribed finite set of policies. In such settings, we propose to learn the optimal policy for the non-Bayesian RMAB by employing a suitable meta-policy which treats each policy from this finite set as an arm in a different non-Bayesian multi-armed bandit problem for which a single-arm selection policy is optimal. We demonstrate this approach by developing a novel sensing policy for opportunistic spectrum access over unknown dynamic channels. We prove that our policy achieves near-logarithmic regret (the difference in expected reward compared to a model-aware genie), which leads to the same average reward that can be achieved by the optimal policy under a known model. This is the first such result in the literature for a non-Bayesian RMAB. For our proof, we also develop a novel generalization of the Chernoff-Hoeffding bound.

CRAug 24, 2024
Differentially Private Publication of Electricity Time Series Data in Smart Grids

Sina Shaham, Gabriel Ghinita, Bhaskar Krishnamachari et al.

Smart grids are a valuable data source to study consumer behavior and guide energy policy decisions. In particular, time-series of power consumption over geographical areas are essential in deciding the optimal placement of expensive resources (e.g., transformers, storage elements) and their activation schedules. However, publication of such data raises significant privacy issues, as it may reveal sensitive details about personal habits and lifestyles. Differential privacy (DP) is well-suited for sanitization of individual data, but current DP techniques for time series lead to significant loss in utility, due to the existence of temporal correlation between data readings. We introduce {\em STPT (Spatio-Temporal Private Timeseries)}, a novel method for DP-compliant publication of electricity consumption data that analyzes spatio-temporal attributes and captures both micro and macro patterns by leveraging RNNs. Additionally, it employs a partitioning method for releasing electricity consumption time series based on identified patterns. We demonstrate through extensive experiments, on both real-world and synthetic datasets, that STPT significantly outperforms existing benchmarks, providing a well-balanced trade-off between data utility and user privacy.

LGMar 25, 2022
Multi-modal Misinformation Detection: Approaches, Challenges and Opportunities

Sara Abdali, Sina shaham, Bhaskar Krishnamachari

As social media platforms are evolving from text-based forums into multi-modal environments, the nature of misinformation in social media is also transforming accordingly. Taking advantage of the fact that visual modalities such as images and videos are more favorable and attractive to the users and textual contents are sometimes skimmed carelessly, misinformation spreaders have recently targeted contextual connections between the modalities e.g., text and image. Hence many researchers have developed automatic techniques for detecting possible cross-modal discordance in web-based content. We analyze, categorize and identify existing approaches in addition to challenges and shortcomings they face in order to unearth new research opportunities in the field of multi-modal misinformation detection.

TRMay 7
Who Restores the Peg? A Mean-Field Game Approach to Model Stablecoin Market Dynamics

Hardhik Mohanty, Bhaskar Krishnamachari

USDC and USDT are the dominant stablecoins pegged to \$1 with a total market capitalization of over \$300B and rising. Stablecoins make dollar value globally accessible with secure transfer and settlement. Yet in practice, these stablecoins experience periods of stress and de-pegging from their \$1 target, posing significant systemic risks. The behavior of market participants during these stress events and the collective actions that either restore or break the peg are not well understood. This paper addresses the question: who restores the peg?. We develop a dynamic, agent-based mean-field game framework for fiat-collateralized stablecoins, in which a large population of arbitrageurs and retail traders strategically interact across primary and secondary markets during a de-peg episode. The key advantage of this equilibrium formulation is that it endogenously maps market frictions into a market-clearing price path and implied net order flows, allowing us to attribute peg-reverting pressure by channel and to stress-test when a given infrastructure becomes insufficient for recovery. Using three historical de-peg events, we show that the calibrated equilibrium reproduces observed recovery half-lives and yields an order flow decomposition in which system-wide stress is predominantly stabilized by primary-market arbitrage. Finally, a quantitative sensitivity analysis identifies a non-linear breakdown threshold, beyond which a de-peg becomes markedly slower to reverse.

CLDec 1, 2022
a survey on GPT-3

Mingyu Zong, Bhaskar Krishnamachari

This paper provides an introductory survey to GPT-3. We cover some of the historical development behind this technology, some of the key features of GPT-3, and discuss the machine learning model and the datasets used. We survey both academic and commercial efforts applying GPT-3 in diverse domains such as developing conversational AI chatbots, software development, creative work, domain knowledge, and business productivity. We discuss some of the challenges that GPT-3 faces such as the problems of training complexity, bias, and hallucination/incorrect answers. We also discuss the future research opportunities in this area.

CRApr 10
A Survey on the Applications of Zero-Knowledge Proofs

Ryan Lavin, Xuekai Liu, Hardhik Mohanty et al.

Zero-knowledge proofs (ZKPs) enable computational integrity and privacy by allowing one party to prove the truth of a statement without revealing underlying data. Compared with alternatives such as homomorphic encryption and secure multiparty computation, ZKPs offer distinct advantages in universality and minimal trust assumptions, with applications spanning blockchain systems and confidential verification of computational tasks. This survey provides a technical overview of ZKPs with a focus on an increasingly relevant subset called zkSNARKs. Unlike prior surveys emphasizing algorithmic and theoretical aspects, we take a broader view of practical deployments and recent use cases across multiple domains including blockchain privacy, scaling, storage, and interoperability, as well as non-blockchain applications such as voting, authentication, timelocks, and machine learning. To support consistent comparison, we provide (i) a taxonomy of application areas, (ii) evaluation criteria including proof size, prover and verifier time, memory, and setup assumptions, and (iii) comparative tables summarizing key tradeoffs and representative systems. The survey also covers supporting infrastructure, including zero-knowledge virtual machines, domain-specific languages, libraries, and frameworks. While emphasizing zkSNARKs for their prevalence in deployed systems, we compare them with zkSTARKs and Bulletproofs to clarify transparency and performance tradeoffs. We conclude with future research and application directions.

NIOct 25, 2024Code
Artificial Intelligence of Things: A Survey

Shakhrul Iman Siam, Hyunho Ahn, Li Liu et al.

The integration of the Internet of Things (IoT) and modern Artificial Intelligence (AI) has given rise to a new paradigm known as the Artificial Intelligence of Things (AIoT). In this survey, we provide a systematic and comprehensive review of AIoT research. We examine AIoT literature related to sensing, computing, and networking & communication, which form the three key components of AIoT. In addition to advancements in these areas, we review domain-specific AIoT systems that are designed for various important application domains. We have also created an accompanying GitHub repository, where we compile the papers included in this survey: https://github.com/AIoT-MLSys-Lab/AIoT-Survey. This repository will be actively maintained and updated with new research as it becomes available. As both IoT and AI become increasingly critical to our society, we believe AIoT is emerging as an essential research field at the intersection of IoT and modern AI. We hope this survey will serve as a valuable resource for those engaged in AIoT research and act as a catalyst for future explorations to bridge gaps and drive advancements in this exciting field.

LGApr 9
Wireless Communication Enhanced Value Decomposition for Multi-Agent Reinforcement Learning

Diyi Hu, Bhaskar Krishnamachari

Cooperation in multi-agent reinforcement learning (MARL) benefits from inter-agent communication, yet most approaches assume idealized channels and existing value decomposition methods ignore who successfully shared information with whom. We propose CLOVER, a cooperative MARL framework whose centralized value mixer is conditioned on the communication graph realized under a realistic wireless channel. This graph introduces a relational inductive bias into value decomposition, constraining how individual utilities are mixed based on the realized communication structure. The mixer is a GNN with node-specific weights generated by a Permutation-Equivariant Hypernetwork: multi-hop propagation along communication edges reshapes credit assignment so that different topologies induce different mixing. We prove this mixer is permutation invariant, monotonic (preserving the IGM condition), and strictly more expressive than QMIX-style mixers. To handle realistic channels, we formulate an augmented MDP isolating stochastic channel effects from the agent computation graph, and employ a stochastic receptive field encoder for variable-size message sets, enabling end-to-end differentiable training. On Predator-Prey and Lumberjacks benchmarks under p-CSMA wireless channels, CLOVER consistently improves convergence speed and final performance over VDN, QMIX, TarMAC+VDN, and TarMAC+QMIX. Behavioral analysis confirms agents learn adaptive signaling and listening strategies, and ablations isolate the communication-graph inductive bias as the key source of improvement.

LGFeb 3
SATORIS-N: Spectral Analysis based Traffic Observation Recovery via Informed Subspaces and Nuclear-norm minimization

Sampad Mohanty, Bhaskar Krishnamachari

Traffic-density matrices from different days exhibit both low rank and stable correlations in their singular-vector subspaces. Leveraging this, we introduce SATORIS-N, a framework for imputing partially observed traffic-density by informed subspace priors from neighboring days. Our contribution is a subspace-aware semidefinite programming (SDP)} formulation of nuclear norm that explicitly informs the reconstruction with prior singular-subspace information. This convex formulation jointly enforces low rank and subspace alignment, providing a single global optimum and substantially improving accuracy under medium and high occlusion. We also study a lightweight implicit subspace-alignment} strategy in which matrices from consecutive days are concatenated to encourage alignment of spatial or temporal singular directions. Although this heuristic offers modest gains when missing rates are low, the explicit SDP approach is markedly more robust when large fractions of entries are missing. Across two real-world datasets (Beijing and Shanghai), SATORIS-N consistently outperforms standard matrix-completion methods such as SoftImpute, IterativeSVD, statistical, and even deep learning baselines at high occlusion levels. The framework generalizes to other spatiotemporal settings in which singular subspaces evolve slowly over time. In the context of intelligent vehicles and vehicle-to-everything (V2X) systems, accurate traffic-density reconstruction enables critical applications including cooperative perception, predictive routing, and vehicle-to-infrastructure (V2I) communication optimization. When infrastructure sensors or vehicle-reported observations are incomplete - due to communication dropouts, sensor occlusions, or sparse connected vehicle penetration-reliable imputation becomes essential for safe and efficient autonomous navigation.

PLFeb 12
Compiler-Guided Inference-Time Adaptation: Improving GPT-5 Programming Performance in Idris

Minda Li, Bhaskar Krishnamachari

GPT-5, a state of the art large language model from OpenAI, demonstrates strong performance in widely used programming languages such as Python, C++, and Java; however, its ability to operate in low resource or less commonly used languages remains underexplored. This work investigates whether GPT-5 can effectively acquire proficiency in an unfamiliar functional programming language, Idris, through iterative, feedback driven prompting. We first establish a baseline showing that with zero shot prompting the model solves only 22 out of 56 Idris exercises using the platform Exercism, substantially underperforming relative to higher resource languages (45 out of 50 in Python and 35 out of 47 in Erlang). We then evaluate several refinement strategies, including iterative prompting based on platform feedback, augmenting prompts with documentation and error classification guides, and iterative prompting using local compilation errors and failed test cases. Among these approaches, incorporating local compilation errors yields the most substantial improvements. Using this structured, error guided refinement loop, GPT-5 performance increased to an impressive 54 solved problems out of 56. These results suggest that while large language models may initially struggle in low resource settings, structured compiler level feedback can play a critical role in unlocking their capabilities.

LGJan 18, 2022Code
DEFER: Distributed Edge Inference for Deep Neural Networks

Arjun Parthasarathy, Bhaskar Krishnamachari

Modern machine learning tools such as deep neural networks (DNNs) are playing a revolutionary role in many fields such as natural language processing, computer vision, and the internet of things. Once they are trained, deep learning models can be deployed on edge computers to perform classification and prediction on real-time data for these applications. Particularly for large models, the limited computational and memory resources on a single edge device can become the throughput bottleneck for an inference pipeline. To increase throughput and decrease per-device compute load, we present DEFER (Distributed Edge inFERence), a framework for distributed edge inference, which partitions deep neural networks into layers that can be spread across multiple compute nodes. The architecture consists of a single "dispatcher" node to distribute DNN partitions and inference data to respective compute nodes. The compute nodes are connected in a series pattern where each node's computed result is relayed to the subsequent node. The result is then returned to the Dispatcher. We quantify the throughput, energy consumption, network payload, and overhead for our framework under realistic network conditions using the CORE network emulator. We find that for the ResNet50 model, the inference throughput of DEFER with 8 compute nodes is 53% higher and per node energy consumption is 63% lower than single device inference. We further reduce network communication demands and energy consumption using the ZFP serialization and LZ4 compression algorithms. We have implemented DEFER in Python using the TensorFlow and Keras ML libraries, and have released DEFER as an open-source framework to benefit the research community.

DCMay 9
TS-Verkle: A TypeScript Native Verkle Library With On-chain Verifier

Zhikai Li, Xuekai Liu, Boyuan Xu et al.

Blockchain systems face significant scalability challenges due to growing data volumes and increasing transaction demands, necessitating more efficient data structures and verification mechanisms. Verkle trees, a novel data structure combining the efficiency of Merkle trees with the compactness of vector commitments, have gained attention for their potential to optimize blockchain storage and improve scalability. However, their practical implementation, especially at the smart contract level, has remained unexplored. To address these challenges, we present TS-verkle, the first known TypeScript-native implementation of Verkle trees designed for web3 backend compatibility, coupled with a corresponding on-chain verifier written in Solidity. Our work bridges this gap by providing a concrete implementation of Verkle trees and demonstrating their feasibility for on-chain verification. While previous literature suggests Verkle trees should outperform Merkle trees due to their succinct proof size, our empirical evaluation reveals that basic implementations of Verkle trees actually incur higher costs than Merkle trees without advanced optimization techniques. This finding represents a crucial insight for blockchain developers and researchers considering Verkle tree adoption. The paper discusses implementation strategies and performance characteristics while exploring implications for scaling and data availability in decentralized blockchain systems.

MEMay 1
How to Do Statistical Evaluations in ECE/CS Papers: A Practical Playbook for Defensible Results

Bhaskar Krishnamachari

Strong experimental papers in electrical and computer engineering and computer science (ECE/CS), especially in systems, networking, and applied machine learning, rest on more than a single impressive number. They rest on a chain of design, measurement, analysis, and validation choices that, taken together, make a result believable. This tutorial is a compact, example-driven guide to that chain for beginning researchers. We organize it as an evaluation workflow: claim, hypothesis, unit of analysis, baseline, regime sweep, uncertainty estimate, validation check, and reporting. Within that workflow we cover the classical statistical foundations (descriptive statistics, the central limit theorem, normal- and $t$-based confidence intervals, Student's $t$-test, ANOVA, chi-squared and Pearson correlation, linear regression) alongside the modern, distribution-free techniques (the bootstrap, Wilcoxon and Mann--Whitney tests, Cliff's delta) that are usually preferred for ECE/CS data. We also discuss factorial design, randomization and blocking, multiple-comparison correction, latency-specific pitfalls, simulation verification and validation, equivalence-style claims, and reproducibility. A running example, a comparison of two job-scheduling algorithms on simulated workloads with truncated heavy-tailed job sizes, threads through the tutorial, with Python snippets the reader can paste and adapt. The paper closes with a pre-submission checklist; companion student-facing material (project-type translation tables, an evaluation-plan worksheet, exercises, and a worked ``bad evaluation autopsy'') is collected in a separate workbook released alongside this paper.

DCJan 3, 2024
The Internet of Things in the Era of Generative AI: Vision and Challenges

Xin Wang, Zhongwei Wan, Arvin Hekmati et al.

Advancements in Generative AI hold immense promise to push Internet of Things (IoT) to the next level. In this article, we share our vision on IoT in the era of Generative AI. We discuss some of the most important applications of Generative AI in IoT-related domains. We also identify some of the most critical challenges and discuss current gaps as well as promising opportunities on enabling Generative AI for IoT. We hope this article can inspire new research on IoT in the era of Generative AI.

CLMay 14, 2024
LLM-Assisted Rule Based Machine Translation for Low/No-Resource Languages

Jared Coleman, Bhaskar Krishnamachari, Khalil Iskarous et al.

We propose a new paradigm for machine translation that is particularly useful for no-resource languages (those without any publicly available bilingual or monolingual corpora): LLM-RBMT (LLM-Assisted Rule Based Machine Translation). Using the LLM-RBMT paradigm, we design the first language education/revitalization-oriented machine translator for Owens Valley Paiute (OVP), a critically endangered Indigenous American language for which there is virtually no publicly available data. We present a detailed evaluation of the translator's components: a rule-based sentence builder, an OVP to English translator, and an English to OVP translator. We also discuss the potential of the paradigm, its limitations, and the many avenues for future research that it opens up.

AIOct 25, 2024
Integrating Large Language Models with Internet of Things Applications

Mingyu Zong, Arvin Hekmati, Michael Guastalla et al.

This paper identifies and analyzes applications in which Large Language Models (LLMs) can make Internet of Things (IoT) networks more intelligent and responsive through three case studies from critical topics: DDoS attack detection, macroprogramming over IoT systems, and sensor data processing. Our results reveal that the GPT model under few-shot learning achieves 87.6% detection accuracy, whereas the fine-tuned GPT increases the value to 94.9%. Given a macroprogramming framework, the GPT model is capable of writing scripts using high-level functions from the framework to handle possible incidents. Moreover, the GPT model shows efficacy in processing a vast amount of sensor data by offering fast and high-quality responses, which comprise expected results and summarized insights. Overall, the model demonstrates its potential to power a natural language interface. We hope that researchers will find these case studies inspiring to develop further.

ITApr 3
From Gaussian Fading to Gilbert-Elliott: Bridging Physical and Link-Layer Channel Models in Closed Form

Bhaskar Krishnamachari, Victor Gutierrez

Dynamic fading channels are modeled at two fundamentally different levels of abstraction. At the physical layer, the standard representation is a correlated Gaussian process, such as the dB-domain signal power in log-normal shadow fading. At the link layer, the dominant abstraction is the Gilbert-Elliott (GE) two-state Markov chain, which compresses the channel into a binary ``decodable or not'' sequence with temporal memory. Both models are ubiquitous, yet practitioners who need GE parameters from an underlying Gaussian fading model must typically simulate the mapping or invoke continuous-time level-crossing approximations that do not yield discrete-slot transition probabilities in closed form. This paper provides an exact, closed-form bridge. By thresholding the Gaussian process at discrete slot boundaries, we derive the GE transition probabilities via Owen's $T$-function for any threshold, reducing to an elementary arcsine identity when the threshold equals the mean. The formulas depend on the covariance kernel only through the one-step correlation coefficient $ρ= K(D)/K(0)$, making them applicable to any stationary Gaussian fading model. The bridge reveals how kernel smoothness governs the resulting link-layer dynamics: the GE persistence time grows linearly in the correlation length $T_c$ for a smooth (squared-exponential) kernel but only as $\sqrt{T_c}$ for a rough (exponential/Ornstein--Uhlenbeck) kernel. We further quantify when the first-order GE chain is a faithful approximation of the full binary process and when it is not, reconciling two diagnostics, the one-step Markov gap and the run-length total-variation distance, that can trend in opposite directions. Monte Carlo simulations validate all theoretical predictions.

SENov 12, 2024
Evaluating ChatGPT-3.5 Efficiency in Solving Coding Problems of Different Complexity Levels: An Empirical Analysis

Minda Li, Bhaskar Krishnamachari

ChatGPT and other large language models (LLMs) promise to revolutionize software development by automatically generating code from program specifications. We assess the performance of ChatGPT's GPT-3.5-turbo model on LeetCode, a popular platform with algorithmic coding challenges for technical interview practice, across three difficulty levels: easy, medium, and hard. We test three main hypotheses. First, ChatGPT solves fewer problems as difficulty rises (Hypothesis 1). Second, prompt engineering improves ChatGPT's performance, with greater gains on easier problems and diminishing returns on harder ones (Hypothesis 2). Third, ChatGPT performs better in popular languages like Python, Java, and C++ than in less common ones like Elixir, Erlang, and Racket (Hypothesis 3). To investigate these hypotheses, we conduct automated experiments using Python scripts to generate prompts that instruct ChatGPT to create Python solutions. These solutions are stored and manually submitted on LeetCode to check their correctness. For Hypothesis 1, results show the GPT-3.5-turbo model successfully solves 92% of easy, 79% of medium, and 51% of hard problems. For Hypothesis 2, prompt engineering yields improvements: 14-29% for Chain of Thought Prompting, 38-60% by providing failed test cases in a second feedback prompt, and 33-58% by switching to GPT-4. From a random subset of problems ChatGPT solved in Python, it also solved 78% in Java, 50% in C++, and none in Elixir, Erlang, or Racket. These findings generally validate all three hypotheses.

NIMay 16, 2024
Smart Routing with Precise Link Estimation: DSEE-Based Anypath Routing for Reliable Wireless Networking

Narjes Nourzad, Bhaskar Krishnamachari

In dynamic and resource-constrained environments, such as multi-hop wireless mesh networks, traditional routing protocols often falter by relying on predetermined paths that prove ineffective in unpredictable link conditions. Shortest Anypath routing offers a solution by adapting routing decisions based on real-time link conditions. However, the effectiveness of such routing is fundamentally dependent on the quality and reliability of the available links, and predicting these variables with certainty is challenging. This paper introduces a novel approach that leverages the Deterministic Sequencing of Exploration and Exploitation (DSEE), a multi-armed bandit algorithm, to address the need for accurate and real-time estimation of link delivery probabilities. This approach augments the reliability and resilience of the Shortest Anypath routing in the face of fluctuating link conditions. By coupling DSEE with Anypath routing, this algorithm continuously learns and ensures accurate delivery probability estimation and selects the most suitable way to efficiently route packets while maintaining a provable near-logarithmic regret bound. We also theoretically prove that our proposed scheme offers better regret scaling with respect to the network size than the previously proposed Thompson Sampling-based Opportunistic Routing (TSOR).

ITMar 8
On the Fluctuations of the Single-Letter $d$-Tilted Sum for Binary Markov Sources

Bhaskar Krishnamachari

The $d$-tilted information has been found to be a useful quantity in finite-blocklength rate-distortion theory for memoryless sources. We study the source-side $d$-tilted sum induced by the single-letter Blahut--Arimoto operating point for a stationary binary Markov source under Hamming distortion; this is a source-side quantity distinct from the $n$-letter operational $d$-tilted information. We show that the centered block sum $J_n(D) - nμ_D$ is exactly an affine image of the occupation count $N_n = \sum_{t=1}^n \mathbf{1}\{X_t = 1\}$ of the Markov chain. As consequences, all centered cumulants are independent of the distortion level~$D$, the finite-$n$ variance admits a closed form, and the exact finite-$n$ distribution and limiting cumulant generating function are given by a $2 \times 2$ transfer matrix.

LGOct 15, 2025
Learning Wireless Interference Patterns: Decoupled GNN for Throughput Prediction in Heterogeneous Multi-Hop p-CSMA Networks

Faezeh Dehghan Tarzjani, Bhaskar Krishnamachari

The p-persistent CSMA protocol is central to random-access MAC analysis, but predicting saturation throughput in heterogeneous multi-hop wireless networks remains a hard problem. Simplified models that assume a single, shared interference domain can underestimate throughput by 48-62% in sparse topologies. Exact Markov-chain analyses are accurate but scale exponentially in computation time, making them impractical for large networks. These computational barriers motivate structural machine learning approaches like GNNs for scalable throughput prediction in general network topologies. Yet off-the-shelf GNNs struggle here: a standard GCN yields 63.94% normalized mean absolute error (NMAE) on heterogeneous networks because symmetric normalization conflates a node's direct interference with higher-order, cascading effects that pertain to how interference propagates over the network graph. Building on these insights, we propose the Decoupled Graph Convolutional Network (D-GCN), a novel architecture that explicitly separates processing of a node's own transmission probability from neighbor interference effects. D-GCN replaces mean aggregation with learnable attention, yielding interpretable, per-neighbor contribution weights while capturing complex multihop interference patterns. D-GCN attains 3.3% NMAE, outperforms strong baselines, remains tractable even when exact analytical methods become computationally infeasible, and enables gradient-based network optimization that achieves within 1% of theoretical optima.

IROct 2, 2025
Cluster-based Adaptive Retrieval: Dynamic Context Selection for RAG Applications

Yifan Xu, Vipul Gupta, Rohit Aggarwal et al.

Retrieval-Augmented Generation (RAG) enhances large language models (LLMs) by pulling in external material, document, code, manuals, from vast and ever-growing corpora, to effectively answer user queries. The effectiveness of RAG depends significantly on aligning the number of retrieved documents with query characteristics: narrowly focused queries typically require fewer, highly relevant documents, whereas broader or ambiguous queries benefit from retrieving more extensive supporting information. However, the common static top-k retrieval approach fails to adapt to this variability, resulting in either insufficient context from too few documents or redundant information from too many. Motivated by these challenges, we introduce Cluster-based Adaptive Retrieval (CAR), an algorithm that dynamically determines the optimal number of documents by analyzing the clustering patterns of ordered query-document similarity distances. CAR detects the transition point within similarity distances, where tightly clustered, highly relevant documents shift toward less pertinent candidates, establishing an adaptive cut-off that scales with query complexity. On Coinbase's CDP corpus and the public MultiHop-RAG benchmark, CAR consistently picks the optimal retrieval depth and achieves the highest TES score, outperforming every fixed top-k baseline. In downstream RAG evaluations, CAR cuts LLM token usage by 60%, trims end-to-end latency by 22%, and reduces hallucinations by 10% while fully preserving answer relevance. Since integrating CAR into Coinbase's virtual assistant, we've seen user engagement jump by 200%.

HCJun 4, 2025
Crowd-SFT: Crowdsourcing for LLM Alignment

Alex Sotiropoulos, Sulyab Thottungal Valapu, Linus Lei et al.

Large Language Models (LLMs) increasingly rely on Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF) to align model responses with human preferences. While RLHF employs a reinforcement learning approach with a separate reward model, SFT uses human-curated datasets for supervised learning. Both approaches traditionally depend on small, vetted groups of annotators, making them costly, prone to bias, and limited in scalability. We propose an open, crowd-sourced fine-tuning framework that addresses these limitations by enabling broader feedback collection for SFT without extensive annotator training. Our framework promotes incentive fairness via a point-based reward system correlated with Shapley values and guides model convergence through iterative model updates. Our multi-model selection framework demonstrates up to a 55% reduction in target distance over single-model selection, enabling subsequent experiments that validate our point-based reward mechanism's close alignment with Shapley values (a well-established method for attributing individual contributions) thereby supporting fair and scalable participation.

AIApr 29, 2025
AffectEval: A Modular and Customizable Framework for Affective Computing

Emily Zhou, Khushboo Khatri, Yixue Zhao et al.

The field of affective computing focuses on recognizing, interpreting, and responding to human emotions, and has broad applications across education, child development, and human health and wellness. However, developing affective computing pipelines remains labor-intensive due to the lack of software frameworks that support multimodal, multi-domain emotion recognition applications. This often results in redundant effort when building pipelines for different applications. While recent frameworks attempt to address these challenges, they remain limited in reducing manual effort and ensuring cross-domain generalizability. We introduce AffectEval, a modular and customizable framework to facilitate the development of affective computing pipelines while reducing the manual effort and duplicate work involved in developing such pipelines. We validate AffectEval by replicating prior affective computing experiments, and we demonstrate that our framework reduces programming effort by up to 90%, as measured by the reduction in raw lines of code.

LGJun 10, 2024
GKAN: Graph Kolmogorov-Arnold Networks

Mehrdad Kiamari, Mohammad Kiamari, Bhaskar Krishnamachari

We introduce Graph Kolmogorov-Arnold Networks (GKAN), an innovative neural network architecture that extends the principles of the recently proposed Kolmogorov-Arnold Networks (KAN) to graph-structured data. By adopting the unique characteristics of KANs, notably the use of learnable univariate functions instead of fixed linear weights, we develop a powerful model for graph-based learning tasks. Unlike traditional Graph Convolutional Networks (GCNs) that rely on a fixed convolutional architecture, GKANs implement learnable spline-based functions between layers, transforming the way information is processed across the graph structure. We present two different ways to incorporate KAN layers into GKAN: architecture 1 -- where the learnable functions are applied to input features after aggregation and architecture 2 -- where the learnable functions are applied to input features before aggregation. We evaluate GKAN empirically using a semi-supervised graph learning task on a real-world dataset (Cora). We find that architecture generally performs better. We find that GKANs achieve higher accuracy in semi-supervised learning tasks on graphs compared to the traditional GCN model. For example, when considering 100 features, GCN provides an accuracy of 53.5 while a GKAN with a comparable number of parameters gives an accuracy of 61.76; with 200 features, GCN provides an accuracy of 61.24 while a GKAN with a comparable number of parameters gives an accuracy of 67.66. We also present results on the impact of various parameters such as the number of hidden nodes, grid-size, and the polynomial-degree of the spline on the performance of GKAN.

AIJan 26, 2024
CAREForMe: Contextual Multi-Armed Bandit Recommendation Framework for Mental Health

Sheng Yu, Narjes Nourzad, Randye J. Semple et al.

The COVID-19 pandemic has intensified the urgency for effective and accessible mental health interventions in people's daily lives. Mobile Health (mHealth) solutions, such as AI Chatbots and Mindfulness Apps, have gained traction as they expand beyond traditional clinical settings to support daily life. However, the effectiveness of current mHealth solutions is impeded by the lack of context-awareness, personalization, and modularity to foster their reusability. This paper introduces CAREForMe, a contextual multi-armed bandit (CMAB) recommendation framework for mental health. Designed with context-awareness, personalization, and modularity at its core, CAREForMe harnesses mobile sensing and integrates online learning algorithms with user clustering capability to deliver timely, personalized recommendations. With its modular design, CAREForMe serves as both a customizable recommendation framework to guide future research, and a collaborative platform to facilitate interdisciplinary contributions in mHealth research. We showcase CAREForMe's versatility through its implementation across various platforms (e.g., Discord, Telegram) and its customization to diverse recommendation features.

STJan 16, 2024
Forecasting Cryptocurrency Staking Rewards

Sauren Gupta, Apoorva Hathi Katharaki, Yifan Xu et al.

This research explores a relatively unexplored area of predicting cryptocurrency staking rewards, offering potential insights to researchers and investors. We investigate two predictive methodologies: a) a straightforward sliding-window average, and b) linear regression models predicated on historical data. The findings reveal that ETH staking rewards can be forecasted with an RMSE within 0.7% and 1.1% of the mean value for 1-day and 7-day look-aheads respectively, using a 7-day sliding-window average approach. Additionally, we discern diverse prediction accuracies across various cryptocurrencies, including SOL, XTZ, ATOM, and MATIC. Linear regression is identified as superior to the moving-window average for perdicting in the short term for XTZ and ATOM. The results underscore the generally stable and predictable nature of staking rewards for most assets, with MATIC presenting a noteworthy exception.

AIDec 23, 2023
Reinforcement Learning for Safe Occupancy Strategies in Educational Spaces during an Epidemic

Elizabeth Akinyi Ondula, Bhaskar Krishnamachari

Epidemic modeling, encompassing deterministic and stochastic approaches, is vital for understanding infectious diseases and informing public health strategies. This research adopts a prescriptive approach, focusing on reinforcement learning (RL) to develop strategies that balance minimizing infections with maximizing in-person interactions in educational settings. We introduce SafeCampus , a novel tool that simulates infection spread and facilitates the exploration of various RL algorithms in response to epidemic challenges. SafeCampus incorporates a custom RL environment, informed by stochastic epidemic models, to realistically represent university campus dynamics during epidemics. We evaluate Q-learning for a discretized state space which resulted in a policy matrix that not only guides occupancy decisions under varying epidemic conditions but also illustrates the inherent trade-off in epidemic management. This trade-off is characterized by the dilemma between stricter measures, which may effectively reduce infections but impose less educational benefit (more in-person interactions), and more lenient policies, which could lead to higher infection rates.

LGNov 15, 2021
Federated Learning for Internet of Things: Applications, Challenges, and Opportunities

Tuo Zhang, Lei Gao, Chaoyang He et al.

Billions of IoT devices will be deployed in the near future, taking advantage of faster Internet speed and the possibility of orders of magnitude more endpoints brought by 5G/6G. With the growth of IoT devices, vast quantities of data that may contain users' private information will be generated. The high communication and storage costs, mixed with privacy concerns, will increasingly challenge the traditional ecosystem of centralized over-the-cloud learning and processing for IoT platforms. Federated Learning (FL) has emerged as the most promising alternative approach to this problem. In FL, training data-driven machine learning models is an act of collaboration between multiple clients without requiring the data to be brought to a central point, hence alleviating communication and storage costs and providing a great degree of user-level privacy. However, there are still some challenges existing in the real FL system implementation on IoT networks. In this paper, we will discuss the opportunities and challenges of FL in IoT platforms, as well as how it can enable diverse IoT applications. In particular, we identify and discuss seven critical challenges of FL in IoT platforms and highlight some recent promising approaches towards addressing them.

DCOct 22, 2021
GCNScheduler: Scheduling Distributed Computing Applications using Graph Convolutional Networks

Mehrdad Kiamari, Bhaskar Krishnamachari

We consider the classical problem of scheduling task graphs corresponding to complex applications on distributed computing systems. A number of heuristics have been previously proposed to optimize task scheduling with respect to metrics such as makespan and throughput. However, they tend to be slow to run, particularly for larger problem instances, limiting their applicability in more dynamic systems. Motivated by the goal of solving these problems more rapidly, we propose, for the first time, a graph convolutional network-based scheduler (GCNScheduler). By carefully integrating an inter-task data dependency structure with network settings into an input graph and feeding it to an appropriate GCN, the GCNScheduler can efficiently schedule tasks of complex applications for a given objective. We evaluate our scheme with baselines through simulations. We show that not only can our scheme quickly and efficiently learn from existing scheduling schemes, but also it can easily be applied to large-scale settings where current scheduling schemes fail to handle. We show that it achieves better makespan than the classic HEFT algorithm, and almost the same throughput as throughput-oriented HEFT (TP-HEFT), while providing several orders of magnitude faster scheduling times in both cases. For example, for makespan minimization, GCNScheduler schedules 50-node task graphs in about 4 milliseconds while HEFT takes more than 1500 seconds; and for throughput maximization, GCNScheduler schedules 100-node task graphs in about 3.3 milliseconds, compared to about 6.9 seconds for TP-HEFT.

CROct 5, 2021
Dataset: Large-scale Urban IoT Activity Data for DDoS Attack Emulation

Arvin Hekmati, Eugenio Grippo, Bhaskar Krishnamachari

As IoT deployments grow in scale for applications such as smart cities, they face increasing cyber-security threats. In particular, as evidenced by the famous Mirai incident and other ongoing threats, large-scale IoT device networks are particularly susceptible to being hijacked and used as botnets to launch distributed denial of service (DDoS) attacks. Real large-scale datasets are needed to train and evaluate the use of machine learning algorithms such as deep neural networks to detect and defend against such DDoS attacks. We present a dataset from an urban IoT deployment of 4060 nodes describing their spatio-temporal activity under benign conditions. We also provide a synthetic DDoS attack generator that injects attack activity into the dataset based on tunable parameters such as number of nodes attacked and duration of attack. We discuss some of the features of the dataset. We also demonstrate the utility of the dataset as well as our synthetic DDoS attack generator by using them for the training and evaluation of a simple multi-label feed-forward neural network that aims to identify which nodes are under attack and when.

GNMar 1, 2021
Reducing the Volatility of Cryptocurrencies -- A Survey of Stablecoins

Ayten Kahya, Bhaskar Krishnamachari, Seokgu Yun

In the wake of financial crises, stablecoins are gaining adoption among digital currencies. We discuss how stablecoins help reduce the volatility of cryptocurrencies by surveying different types of stablecoins and their stability mechanisms. We classify different approaches to stablecoins in three main categories i) fiat or asset backed, ii) crypto-collateralized and iii) algorithmic stablecoins, giving examples of concrete projects in each class. We assess the relative tradeoffs between the different approaches. We also discuss challenges associated with the future of stablecoins and their adoption, their adoption and point out future research directions.

ROJul 8, 2020
TEAM: Trilateration for Exploration and Mapping with Robotic Networks

Lillian Clark, Charles Andre, Joseph Galante et al.

Motivated by lunar exploration, we consider deploying a network of mobile robots to explore an unknown environment while acting as a cooperative positioning system. Robots measure and communicate position-related data in order to perform localization in the absence of infrastructure-based solutions (e.g. stationary beacons or GPS). We present Trilateration for Exploration and Mapping (TEAM), a novel algorithm for low-complexity localization and mapping with robotic networks. TEAM is designed to leverage the capability of commercially-available ultra-wideband (UWB) radios on board the robots to provide range estimates with centimeter accuracy and perform anchorless localization in a shared, stationary frame. It is well-suited for feature-deprived environments, where feature-based localization approaches suffer. We provide experimental results in varied Gazebo simulation environments as well as on a testbed of Turtlebot3 Burgers with Pozyx UWB radios. We compare TEAM to the popular Rao-Blackwellized Particle Filter for Simultaneous Localization and Mapping (SLAM). We demonstrate that TEAM requires an order of magnitude less computational complexity and reduces the necessary sample rate of LiDAR measurements by an order of magnitude. These advantages do not require sacrificing performance, as TEAM reduces the maximum localization error by 50% and achieves up to a 28% increase in map accuracy in feature-deprived environments and comparable map accuracy in other settings.

CRApr 10, 2020
CONTAIN: Privacy-oriented Contact Tracing Protocols for Epidemics

Arvin Hekmati, Gowri Ramachandran, Bhaskar Krishnamachari

Pandemic and epidemic diseases such as CoVID-19, SARS-CoV2, and Ebola have spread to multiple countries and infected thousands of people. Such diseases spread mainly through person-to-person contacts. Health care authorities recommend contact tracing procedures to prevent the spread to a vast population. Although several mobile applications have been developed to trace contacts, they typically require collection of privacy-intrusive information such as GPS locations, and the logging of privacy-sensitive data on a third party server, or require additional infrastructure such as WiFi APs with known locations. In this paper, we introduce CONTAIN, a privacy-oriented mobile contact tracing application that does not rely on GPS or any other form of infrastructure-based location sensing, nor the continuous logging of any other personally identifiable information on a server. The goal of CONTAIN is to allow users to determine with complete privacy if they have been within a short distance, specifically, Bluetooth wireless range, of someone that is infected, and potentially also when. We identify and prove the privacy guarantees provided by our approach. Our simulation study utilizing an empirical trace dataset (Asturies) involving 100 mobile devices and around 60000 records shows that users can maximize their possibility of identifying if they were near an infected user by turning on the app during active times.

SEMay 25, 2019
A Reference Architecture for Blockchain-based Peer-to-Peer IoT Applications

Gowri Sankar Ramachandran, Bhaskar Krishnamachari

The advent of Blockchain and Distributed Ledger Technologies enable IoT and smart city application developers to conceive new types of applications and solutions for identity management, trust, and data monetization. However, architecting blockchain-based IoT applications remain challenging due to the heterogeneous nature of blockchain platforms and lack of guidelines on how to interface existing components in the IoT ecosystem with the emerging Blockchain technology. This article explains the characteristics of blockchain and IoT technologies and presents a general reference architecture that can be used to develop many blockchain-based peer-to-peer IoT applications.

GTNov 23, 2018
Enhancing Engagement in Token-Curated Registries via an Inflationary Mechanism

Yi Lucy Wang, Bhaskar Krishnamachari

Token Curated Registries (TCR) are decentralized recommendation systems that can be implemented using Blockchain smart contracts. They allow participants to vote for or against adding items to a list through a process that involves staking tokens intrinsic to the registry, with winners receiving the staked tokens for each vote. A TCR aims to provide incentives to create a well-curated list. In this work, we consider a challenge for these systems - incentivizing token-holders to actually engage and participate in the voting process. We propose a novel token-inflation mechanism for enhancing engagement, whereby only voting participants see their token supply increased by a pre-defined multiple after each round of voting. To evaluate this proposal, we propose a simple 4-class model of voters that captures all possible combinations of two key dimensions: whether they are engaged (likely to vote at all for a given item) or disengaged, and whether they are informed (likely to vote in a way that increases the quality of the list) or uninformed, and a simple metric to evaluate the quality of the list as a function of the vote outcomes. We conduct simulations using this model of voters and show that implementing token-inflation results in greater wealth accumulation for engaged voters. In particular, when the number of informed voters is sufficiently high, our simulations show that voters that are both informed and engaged see the greatest benefits from participating in the registry when our proposed token-inflation mechanism is employed. We further validate this finding using a simplified mathematical analysis.

CRJun 21, 2018
Solving the Buyer and Seller's Dilemma: A Dual-Deposit Escrow Smart Contract for Provably Cheat-Proof Delivery and Payment for a Digital Good without a Trusted Mediator

Aditya Asgaonkar, Bhaskar Krishnamachari

A fundamental problem for electronic commerce is the buying and selling of digital goods between individuals that may not know or trust each other. Traditionally, this problem has been addressed by the use of trusted third-parties such as credit-card companies, mediated escrows, legal adjudication, or reputation systems. Despite the rise of blockchain protocols as a way to send payments without trusted third parties, the important problem of exchanging a digital good for payment without trusted third parties has been paid much less attention. We refer to this problem as the Buyer and Seller's Dilemma and present for it a dual-deposit escrow trade protocol which uses double-sided payment deposits in conjunction with simple cryptographic primitives, and that can be implemented using a blockchain-based smart contract. We analyze our protocol as an extensive-form game and prove that the Sub-game Perfect Nash Equilibrium for this game is for both the buyer and seller to cooperate and behave honestly. We address this problem under the assumption that the digital good being traded is known and verifiable, with a fixed price known to both parties.

DCMay 8, 2018
Blockchain for the IoT: Opportunities and Challenges

Gowri Sankar Ramachandran, Bhaskar Krishnamachari

Blockchain technology has been transforming the financial industry and has created a new crypto-economy in the last decade. The foundational concepts such as decentralized trust and distributed ledger are promising for distributed, and large-scale Internet of Things (IoT) applications. However, the applications of Blockchain beyond cryptocurrencies in this domain are few and far between because of the lack of understanding and inherent architectural challenges. In this paper, we describe the opportunities for applications of blockchain for the IoT and examine the challenges involved in architecting Blockchain-based IoT applications.

ITMar 23, 2018
SENATE: A Permissionless Byzantine Consensus Protocol in Wireless Networks

Zhiyuan Jiang, Bhaskar Krishnamachari, Sheng Zhou et al.

The blockchain technology has achieved tremendous success in open (permissionless) decentralized consensus by employing proof-of-work (PoW) or its variants, whereby unauthorized nodes cannot gain disproportionate impact on consensus beyond their computational power. However, PoW-based systems incur a high delay and low throughput, making them ineffective in dealing with real-time applications. On the other hand, byzantine fault-tolerant (BFT) consensus algorithms with better delay and throughput performance have been employed in closed (permissioned) settings to avoid Sybil attacks. In this paper, we present Sybil-proof wirelEss Network coordinAte based byzanTine consEnsus (SENATE), which is based on the conventional BFT consensus framework yet works in open systems of wireless devices where faulty nodes may launch Sybil attacks. As in a Senate in the legislature where the quota of senators per state (district) is a constant irrespective with the population of the state, "senators" in SENATE are selected from participating distributed nodes based on their wireless network coordinates (WNC) with a fixed number of nodes per district in the WNC space. Elected senators then participate in the subsequent consensus reaching process and broadcast the result. Thereby, SENATE is proof against Sybil attacks since pseudonyms of a faulty node are likely to be adjacent in the WNC space and hence fail to be elected.