LG Dec 7, 2025
Know your Trajectory -- Trustworthy Reinforcement Learning deployment through Importance-Based Trajectory Analysis
Clifford F, Devika Jay, Abhishek Sarkar et al.
As Reinforcement Learning (RL) agents are increasingly deployed in real-world applications, ensuring their behavior is transparent and trustworthy is paramount. A key component of trust is explainability, yet much of the work in Explainable RL (XRL) focuses on local, single-step decisions. This paper addresses the critical need for explaining an agent's long-term behavior through trajectory-level analysis. We introduce a novel framework that ranks entire trajectories by defining and aggregating a new state-importance metric. This metric combines the classic Q-value difference with a "radical term" that captures the agent's affinity to reach its goal, providing a more nuanced measure of state criticality. We demonstrate that our method successfully identifies optimal trajectories from a heterogeneous collection of agent experiences. Furthermore, by generating counterfactual rollouts from critical states within these trajectories, we show that the agent's chosen path is robustly superior to alternatives, thereby providing a powerful "Why this, and not that?" explanation. Our experiments in standard OpenAI Gym environments validate that our proposed importance metric is more effective at identifying optimal behaviors compared to classic approaches, offering a significant step towards trustworthy autonomous systems.
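The ranking idea in this abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the "classic Q-value difference" is the gap between the best and worst action values in a state, it averages state importance over a trajectory (the abstract does not say how scores are aggregated), and it leaves the second term as a user-supplied, hypothetical `extra_term` callable because the abstract does not define it.

```python
from typing import Callable, Dict, Hashable, List, Optional, Sequence

# Tabular Q-function: state -> list of action values (an assumption for this sketch).
QTable = Dict[Hashable, Sequence[float]]


def q_value_gap(q_values: Sequence[float]) -> float:
    """Gap between the best and worst action value in one state."""
    return max(q_values) - min(q_values)


def state_importance(
    state: Hashable,
    q_table: QTable,
    extra_term: Optional[Callable[[Hashable], float]] = None,
) -> float:
    """Q-value gap plus an optional, caller-defined additive term."""
    importance = q_value_gap(q_table[state])
    if extra_term is not None:
        importance += extra_term(state)
    return importance


def rank_trajectories(
    trajectories: Sequence[Sequence[Hashable]],
    q_table: QTable,
    extra_term: Optional[Callable[[Hashable], float]] = None,
) -> List[int]:
    """Return trajectory indices sorted from most to least important.

    A trajectory's score is the mean importance of its states
    (mean aggregation is an assumption of this sketch).
    """

    def score(trajectory: Sequence[Hashable]) -> float:
        if not trajectory:
            return 0.0
        total = sum(state_importance(s, q_table, extra_term) for s in trajectory)
        return total / len(trajectory)

    scores = [score(t) for t in trajectories]
    return sorted(range(len(trajectories)), key=lambda i: scores[i], reverse=True)
```

Passing a different `extra_term` changes the ordering without touching the Q-value part, which makes it easy to compare the plain Q-value gap against an augmented metric on the same set of trajectories.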
LG Dec 21, 2022
LogAnMeta: Log Anomaly Detection Using Meta Learning
Abhishek Sarkar, Tanmay Sen, Srimanta Kundu et al.
Modern telecom systems are monitored with performance and system logs from multiple application layers and components. Detecting anomalous events from these logs is key to identifying security breaches, resource over-utilization, critical/fatal errors, etc. Current supervised log anomaly detection frameworks tend to perform poorly on new types or signatures of anomalies with few or unseen samples in the training data. In this work, we propose a meta-learning-based log anomaly detection framework (LogAnMeta) for detecting anomalies from sequences of log events with few samples. LogAnMeta trains a hybrid few-shot classifier in an episodic manner. The experimental results demonstrate the efficacy of our proposed method.
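Episodic training, as mentioned in this abstract, usually means repeatedly drawing small classification tasks, each with a support set and a query set. The sketch below shows one common way to sample such an episode (N-way, K-shot) from labeled log sequences. The function name, parameters, and data layout are assumptions for illustration only; the abstract does not describe LogAnMeta's actual sampling procedure or classifier.

```python
import random
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple

# One labeled example: (sequence of log event ids, class label).
Example = Tuple[Tuple[int, ...], Hashable]


def sample_episode(
    labeled: Sequence[Example],
    n_way: int,
    k_shot: int,
    n_query: int,
    rng: random.Random,
) -> Tuple[List[Example], List[Example]]:
    """Draw one N-way K-shot episode: a support set and a disjoint query set."""
    by_class: Dict[Hashable, List[Tuple[int, ...]]] = defaultdict(list)
    for sequence, label in labeled:
        by_class[label].append(sequence)

    # Only classes with enough examples for both sets can take part.
    eligible = [c for c, items in by_class.items() if len(items) >= k_shot + n_query]
    if len(eligible) < n_way:
        raise ValueError("not enough classes with k_shot + n_query examples")

    support: List[Example] = []
    query: List[Example] = []
    for label in rng.sample(eligible, n_way):
        # Sampling without replacement keeps support and query disjoint.
        picked = rng.sample(by_class[label], k_shot + n_query)
        support.extend((seq, label) for seq in picked[:k_shot])
        query.extend((seq, label) for seq in picked[k_shot:])
    return support, query
```

A training loop would call `sample_episode` once per step, fit or adapt the classifier on the support set, and compute the loss on the query set.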
SP Dec 30, 2023
Machine Learning (ML)-assisted Beam Management in millimeter (mm)Wave Distributed Multiple Input Multiple Output (D-MIMO) systems
Karthik R M, Dhiraj Nagaraja Hegde, Muris Sarajlic et al.
Beam management (BM) protocols are critical for establishing and maintaining connectivity between network radio nodes and User Equipments (UEs). In Distributed Multiple Input Multiple Output (D-MIMO) systems, a number of access points (APs), coordinated by a central processing unit (CPU), serve a number of UEs. At mmWave frequencies, the problem of finding the best AP and beam to serve the UEs is challenging due to the large number of beams that need to be sounded with Downlink (DL) reference signals. The objective of this paper is to investigate whether the best AP/beam can be reliably inferred by sounding only a small subset of beams and leveraging AI/ML to infer the best beam/AP. We use Random Forest (RF), MissForest (MF) and conditional Generative Adversarial Networks (c-GAN) to demonstrate the performance benefits of inference.
RO Jun 7, 2021
Terrain Adaptive Gait Transitioning for a Quadruped Robot using Model Predictive Control
Prathamesh Saraf, Abhishek Sarkar, Arshad Javed
Legged robots can traverse challenging terrain, use perception to plan their safe foothold positions, and navigate the environment. Such unique mobility capabilities make these platforms a perfect candidate for scenarios such as search and rescue, inspection, and exploration tasks. While traversing such terrain, the robot's instability is a significant concern. The robot often needs to switch gaits depending on its environment. Due to the complex dynamics of quadruped robots, classical PID control fails to provide high stability. Thus, there is a need for advanced control methods like Model Predictive Control (MPC), which uses the system model and the nature of the terrain to predict the stable body pose of the robot. The controller also corrects for any external disturbances that change the desired behavior of the robot. The MPC controller is designed in MATLAB for full-body torque control. The controller performance was verified on Boston Dynamics Spot in the Webots simulator. The robot is able to correct for external perturbations of up to 150 N and also resist falls of up to 80 cm.
ML May 27, 2021
Non-negative matrix factorization algorithms generally improve topic model fits
Peter Carbonetto, Abhishek Sarkar, Zihao Wang et al.
In an effort to develop topic modeling methods that can be quickly applied to large data sets, we revisit the problem of maximum-likelihood estimation in topic models. It is known, at least informally, that maximum-likelihood estimation in topic models is closely related to non-negative matrix factorization (NMF). Yet, to our knowledge, this relationship has not been exploited previously to fit topic models. We show that recent advances in NMF optimization methods can be leveraged to fit topic models very efficiently, often resulting in much better fits and in less time than existing algorithms for topic models. We also formally make the connection between the NMF optimization problem and maximum-likelihood estimation for the topic model, and using this result we show that the expectation maximization (EM) algorithm for the topic model is essentially the same as the classic multiplicative updates for NMF (the only difference being that the operations are performed in a different order). Our methods are implemented in the R package fastTopics.
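For readers unfamiliar with the multiplicative updates referred to in this abstract, here is a small pure-Python sketch. It assumes the updates meant are the Lee and Seung updates for the generalized Kullback-Leibler (Poisson) objective, which is the NMF objective usually associated with topic models. It is not the fastTopics API, and the helper names and the `eps` guard are choices made for this illustration.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of two nested-list matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def kl_divergence(v: Matrix, wh: Matrix, eps: float = 1e-12) -> float:
    """Generalized KL divergence D(V || WH), with 0 * log(0) taken as 0."""
    total = 0.0
    for v_row, p_row in zip(v, wh):
        for x, p in zip(v_row, p_row):
            total += p - x
            if x > 0:
                total += x * math.log(x / max(p, eps))
    return total


def ratio(v: Matrix, wh: Matrix, eps: float) -> Matrix:
    """Elementwise V / (WH), guarded against division by zero."""
    return [[x / max(p, eps) for x, p in zip(v_row, p_row)] for v_row, p_row in zip(v, wh)]


def multiplicative_step(v: Matrix, w: Matrix, h: Matrix, eps: float = 1e-12) -> Tuple[Matrix, Matrix]:
    """One round of multiplicative updates: first H, then W."""
    n, m, k = len(v), len(v[0]), len(h)

    # H[a][j] <- H[a][j] * sum_i W[i][a] * R[i][j] / sum_i W[i][a]
    r = ratio(v, matmul(w, h), eps)
    col_sums = [max(sum(w[i][a] for i in range(n)), eps) for a in range(k)]
    h = [[h[a][j] * sum(w[i][a] * r[i][j] for i in range(n)) / col_sums[a]
          for j in range(m)] for a in range(k)]

    # W[i][a] <- W[i][a] * sum_j H[a][j] * R[i][j] / sum_j H[a][j]
    r = ratio(v, matmul(w, h), eps)
    row_sums = [max(sum(h[a][j] for j in range(m)), eps) for a in range(k)]
    w = [[w[i][a] * sum(h[a][j] * r[i][j] for j in range(m)) / row_sums[a]
          for a in range(k)] for i in range(n)]
    return w, h
```

Starting from strictly positive `w` and `h`, repeated calls to `multiplicative_step` should not increase `kl_divergence(v, matmul(w, h))`, which is the property that makes these updates comparable to EM iterations.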
RO Feb 19, 2020
Omnidirectional Three Module Robot Design and Simulation
Kartik Suryavanshi, Rama Vadapalli, Praharsha Budharaja et al.
This paper introduces the Omnidirectional Tractable Three Module Robot for traversing inside complex pipe networks. The robot consists of three omnidirectional modules fixed 120° apart circumferentially which can rotate about their axis allowing holonomic motion of the robot. Holonomic motion enables the robot to overcome motion singularity when negotiating T-junctions and further allows the robot to arrive in a preferred orientation while taking turns inside a pipe. The singularity region while negotiating T-junctions is analyzed to formulate the geometry of the region. The design and motion capabilities are validated by conducting simulations in MSC ADAMS on a simplified lumped-model of the robot.
RO Sep 23, 2019
Omnidirectional Tractable Three Module Robot
Kartik Suryavanshi, Rama Vadapalli, Ruchitha Vucha et al.
This paper introduces the Omnidirectional Tractable Three Module Robot for traversing inside complex pipe networks. The robot consists of three omnidirectional modules fixed 120° apart circumferentially which can rotate about their own axis, allowing holonomic motion of the robot. The holonomic motion enables the robot to overcome motion singularity when negotiating T-junctions and further allows the robot to arrive in a preferred orientation while taking turns inside a pipe. We develop a closed-form kinematic model for the robot and propose the Motion Singularity Region that the robot needs to avoid while negotiating a T-junction. The design and motion capabilities of the robot are demonstrated both by conducting simulations in MSC ADAMS on a simplified lumped-model of the robot and with experiments on its physical embodiment.
RO Sep 23, 2019
Modular Pipe Climber
Rama Vadapalli, Kartik Suryavanshi, Ruchita Vucha et al.
This paper discusses the design and implementation of the Modular Pipe Climber inside ASTM D1785 - 15e1 standard pipes [1]. The robot has three tracks which operate independently and are mounted on three modules which are oriented at 120° to each other. The tracks provide greater surface traction than wheels [2]. The tracks are pushed onto the inner wall of the pipe by passive springs, which help maintain contact with the pipe during vertical climbs and while turning in bends. The modules can compress asymmetrically, which helps the robot take turns in bends in all directions. The motor torque required by the robot and the desired spring stiffness are calculated at quasistatic and static equilibriums when the pipe climber is in a vertical climb. The springs were further simulated and analyzed in ADAMS MSC. The prototype built from these values was tested in complex pipe networks. Differential speed is employed when turning in bends to improve efficiency and reduce the stresses experienced by the robot.
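The differential-speed idea can be illustrated with simple bend geometry. The sketch below is an assumption-laden illustration rather than the paper's controller: it treats the bend as a circular arc of centerline radius `bend_radius`, places each track contact at distance `track_offset` from the pipe centerline, and measures each track's angle around the cross-section from the direction pointing toward the bend center. Under those assumptions a track follows an arc of radius `bend_radius - track_offset * cos(angle)`, so its speed scales by that radius relative to the centerline radius. All names are hypothetical.

```python
import math
from typing import List, Sequence


def track_speeds(
    center_speed: float,
    bend_radius: float,
    track_offset: float,
    track_angles_deg: Sequence[float],
) -> List[float]:
    """Speed for each track so all tracks sweep the bend at the same angular rate.

    track_angles_deg are measured in the pipe cross-section, with 0 degrees
    pointing toward the center of the bend (inner side of the turn).
    """
    if bend_radius <= track_offset:
        raise ValueError("bend radius must exceed the track offset")
    speeds = []
    for angle in track_angles_deg:
        # Radius of the arc followed by this track's contact line.
        path_radius = bend_radius - track_offset * math.cos(math.radians(angle))
        speeds.append(center_speed * path_radius / bend_radius)
    return speeds
```

For three tracks spaced 120° apart, the track nearest the inside of the bend runs slower and the other two run faster, while the mean of the three speeds stays equal to `center_speed`.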
ML Jan 24, 2019
Causal Mediation Analysis Leveraging Multiple Types of Summary Statistics Data
Yongjin Park, Abhishek Sarkar, Khoi Nguyen et al.
Summary statistics of genome-wide association studies (GWAS) teach causal relationships between millions of genetic markers and tens of thousands of phenotypes. However, the underlying biological mechanisms are yet to be elucidated. We can achieve the necessary interpretation of GWAS in a causal mediation framework, looking to establish a sparse set of mediators between genetic and downstream variables, but there are several challenges. Unlike existing methods, which rely on strong and unrealistic assumptions, we tackle practical challenges within a principled summary-based causal inference framework. We analyzed the proposed methods in extensive simulations generated from real-world genetic data. We demonstrated that only our approach can accurately recover causal genes, even without access to individual-level data, despite the presence of competing non-causal trails.
RO May 9, 2018
Learning Coordinated Tasks using Reinforcement Learning in Humanoids
S Phaniteja, Parijat Dewangan, Pooja Guhan et al.
With the advent of artificial intelligence and machine learning, humanoid robots are made to learn a variety of skills which humans possess. One of the fundamental skills which humans use in day-to-day activities is performing tasks with coordination between both hands. In the case of humanoids, learning such skills requires optimal motion planning, which includes avoiding collisions with the surroundings. In this paper, we propose a framework to learn coordinated tasks in cluttered environments based on DiGrad, a multi-task reinforcement learning algorithm for continuous action spaces. Further, we propose an algorithm to smooth the joint-space trajectories obtained by the proposed framework in order to reduce the noise introduced during training. The proposed framework was tested on a 27 degrees of freedom (DoF) humanoid with an articulated torso performing a coordinated object-reaching task with both hands in four different environments with varying levels of difficulty. It is observed that the humanoid is able to plan a collision-free trajectory in real time. Simulation results also reveal the usefulness of the articulated torso for performing tasks which require coordination between both arms.
LG Feb 27, 2018
DiGrad: Multi-Task Reinforcement Learning with Shared Actions
Parijat Dewangan, S Phaniteja, K Madhava Krishna et al.
Most reinforcement learning algorithms are inefficient for learning multiple tasks in complex robotic systems, where different tasks share a set of actions. In such environments a compound policy may be learnt with shared neural network parameters, which performs multiple tasks concurrently. However, such a compound policy may get biased towards a task, or the gradients from different tasks may negate each other, making the learning unstable and sometimes less data efficient. In this paper, we propose a new approach for simultaneous training of multiple tasks sharing a set of common actions in continuous action spaces, which we call DiGrad (Differential Policy Gradient). The proposed framework is based on differential policy gradients and can accommodate multi-task learning in a single actor-critic network. We also propose a simple heuristic in the differential policy gradient update to further improve the learning. The proposed architecture was tested on an 8-link planar manipulator and a 27 degrees of freedom (DoF) humanoid for learning multi-goal reachability tasks for 3 and 2 end effectors respectively. We show that our approach supports efficient multi-task learning in complex robotic systems, outperforming related methods in continuous action spaces.
RO Jan 31, 2018
A Deep Reinforcement Learning Approach for Dynamically Stable Inverse Kinematics of Humanoid Robots
S Phaniteja, Parijat Dewangan, Pooja Guhan et al.
Real-time calculation of inverse kinematics (IK) with dynamically stable configurations is essential in humanoid robots, as they are highly susceptible to losing balance. This paper proposes a methodology to generate joint-space trajectories of stable configurations for solving inverse kinematics using Deep Reinforcement Learning (RL). Our approach is based on the idea of exploring the entire configuration space of the robot and learning the best possible solutions using Deep Deterministic Policy Gradient (DDPG). The proposed strategy was evaluated on the highly articulated upper body of a humanoid model with 27 degrees of freedom (DoF). The trained model was able to solve inverse kinematics for the end effectors with 90% accuracy while maintaining balance in the double support phase.
RO Sep 29, 2017
CObRaSO: Compliant Omni-Direction Bendable Hybrid Rigid and Soft OmniCrawler Module
Enna Sachdeva, Akash Singh, Vinay Rodrigues et al.
This paper presents a novel design of an Omnidirectional bendable OmniCrawler module, CObRaSO. Along with the longitudinal crawling and sideways rolling motion, the performance of the OmniCrawler is further enhanced by the introduction of Omnidirectional bending within the module, which is the key contribution of this paper. The Omnidirectional bending is achieved by an arrangement of two independent 1-DOF joints aligned at 90° with respect to each other. The unique characteristic of this module is its ability to crawl in an Omnidirectionally bent configuration, which is achieved by a novel design of a 2-DOF roller chain and a backbone with a hybrid structure of soft and rigid materials. This hybrid structure provides compliant pathways for the lug-chain assembly to passively conform with the orientation of the module and crawl in an Omnidirectionally bent configuration, which makes this module one of its kind. Furthermore, we show that the unique modular design of CObRaSO unveils its versatility by achieving active compliance on an uneven surface, demonstrating its applications in different robotic platforms (an in-pipeline robot, a quadruped and a snake robot) and exhibiting hybrid locomotion modes in various configurations of the robots. The mechanism and mobility characteristics of the proposed module have been verified with the aid of simulations and experiments on a real robot prototype.
RO Jun 19, 2017
Design and optimal springs stiffness estimation of a Modular OmniCrawler in-pipe climbing Robot
Akash Singh, Enna Sachdeva, Abhishek Sarkar et al.
This paper discusses the design of a novel compliant in-pipe climbing modular robot for small diameter pipes. The robot consists of a kinematic chain of 3 OmniCrawler modules with a link connected between each pair of adjacent modules via compliant joints. While the tank-like crawler mechanism provides good traction on low friction surfaces, its circular cross-section makes it holonomic. The holonomic motion helps it re-align to avoid obstacles during motion and overcome turns with a minimal energy posture. Additionally, the modularity enables it to negotiate T-junctions without motion singularity. The compliance is realized using 4 torsion springs incorporated in the joints joining the 3 modules with 2 links. For a desirable pipe diameter (Ø 75 mm), the springs' stiffness values are obtained by formulating a constrained optimization problem which has been simulated in ADAMS MSC and further validated on a real robot prototype. In order to negotiate smooth vertical bends and friction coefficient variations in pipes, the design was later modified by replacing springs with series elastic actuators (SEA) at 2 of the 4 joints.
RO Apr 22, 2017
COCrIP: Compliant OmniCrawler In-pipeline Robot
Akash Singh, Enna Sachdeva, Abhishek Sarkar et al.
This paper presents a modular in-pipeline climbing robot with a novel compliant foldable OmniCrawler mechanism. The circular cross-section of the OmniCrawler module enables a holonomic motion to facilitate the alignment of the robot in the direction of bends. Additionally, the crawler mechanism provides a fair amount of traction, even on slippery surfaces. These advantages of crawler modules have been further supplemented by incorporating active compliance in the module itself, which helps to negotiate sharp bends in small diameter pipes. The robot has a series of 3 such compliant foldable modules interconnected by links via passive joints. For the desired pipe diameter and curvature of the bends, the spring stiffness value for each passive joint is determined by formulating a constrained optimization problem using the quasi-static model of the robot. Moreover, the minimum friction coefficient between the module and the pipe surface at which the robot can climb vertically without slipping is estimated. The numerical simulation results have further been validated by experiments on a real robot prototype.