Sriram Subramanian

LG
6 papers
66 citations
Novelty 36%
AI Score 41

6 Papers

RO May 28
Decentralized LLM-Driven Coordination of Acoustic Robots for Contactless Object Manipulation

Yingying Wang, Narsimlu Kemsaram, Sriram Subramanian

Natural language interfaces can simplify interaction with multi-robot systems, especially when non-expert users need to issue high-level commands. Acoustic manipulation using ultrasonic phased arrays also enables contactless object handling for applications such as healthcare, laboratory automation, and precision transport. However, combining large language models (LLMs) with distributed acoustic mobile robots remains underexplored. This paper presents a decentralized framework for natural language-driven coordination of acoustic robots for contactless object manipulation. The system converts spoken instructions into executable multi-robot task plans using Whisper-based speech recognition, LLM-based semantic parsing, structured JSON task representation, and distributed scheduling. The JSON schema encodes robot assignments, temporal dependencies, spatial constraints, and synchronization requirements for sequential, parallel, and synchronized execution. The system is implemented on two TurtleBot3-based acoustic robots, each equipped with an ultrasonic phased array for contactless object transport. Experiments were conducted in three scenarios: sequential execution, parallel multi-robot transport, and synchronized cooperative manipulation. The system achieved task success rates of 96 percent for sequential tasks, 86 percent for parallel execution, and 70 percent for synchronized collaborative transport. These results show that natural language commands can be transformed into distributed robot actions for contactless manipulation, highlighting the potential of LLM-driven automation for human-robot interaction in distributed robotic systems.
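To illustrate how a structured JSON task plan with temporal dependencies might be turned into an execution order, here is a minimal sketch. The field names (`id`, `robot`, `action`, `depends_on`, `sync_with`) and the `schedule_waves` helper are hypothetical and are not taken from the paper's schema; the grouping into dependency "waves" is one simple scheduling strategy, not necessarily the one the authors use.

```python
import json

# Hypothetical task plan; field names are illustrative, not the paper's schema.
PLAN_JSON = """
{
  "tasks": [
    {"id": "t1", "robot": "robot_a", "action": "pick", "depends_on": []},
    {"id": "t2", "robot": "robot_b", "action": "pick", "depends_on": []},
    {"id": "t3", "robot": "robot_a", "action": "transport", "depends_on": ["t1"]},
    {"id": "t4", "robot": "robot_b", "action": "handover",
     "depends_on": ["t2", "t3"], "sync_with": "robot_a"}
  ]
}
"""

def schedule_waves(plan):
    """Group task ids into waves; every task in a wave has all dependencies done."""
    pending = {t["id"]: set(t["depends_on"]) for t in plan["tasks"]}
    done = set()
    waves = []
    while pending:
        # A task is ready when its dependency set is a subset of completed tasks.
        ready = sorted(tid for tid, deps in pending.items() if deps <= done)
        if not ready:
            raise ValueError("cyclic or unsatisfiable dependencies")
        waves.append(ready)
        done.update(ready)
        for tid in ready:
            del pending[tid]
    return waves

plan = json.loads(PLAN_JSON)
waves = schedule_waves(plan)
print(waves)
```

Tasks in the same wave could run in parallel on different robots, while later waves wait for earlier ones; a synchronized step such as the hypothetical `t4` would additionally need a runtime handshake between robots, which this sketch does not model.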

RO Apr 21
A Gesture-Based Visual Learning Model for Acoustophoretic Interactions using a Swarm of AcoustoBots

Alex Lin, Lei Gao, Narsimlu Kemsaram et al.

AcoustoBots are mobile acoustophoretic robots capable of delivering mid-air haptics, directional audio, and acoustic levitation, but existing implementations rely on scripted commands and lack an intuitive interface for real-time human control. This work presents a gesture-based visual learning framework for contactless human-swarm interaction with a multimodal AcoustoBot platform. The system combines ESP32-CAM gesture capture, PhaseSpace motion tracking, centralized processing, and an OpenCLIP-based visual learning model (VLM) with linear probing to classify three hand gestures and map them to haptics, audio, and levitation modalities. Validation accuracy improved from about 67% with a small dataset to nearly 98% with the largest dataset. In integrated experiments with two AcoustoBots, the system achieved an overall gesture-to-modality switching accuracy of 87.8% across 90 trials, with an average end-to-end latency of 3.95 seconds. These results demonstrate the feasibility of using a vision-language-model-based gesture interface for multimodal human-swarm interaction. While the current system is limited by centralized processing, a static gesture set, and controlled-environment evaluation, it establishes a foundation for more expressive, scalable, and accessible swarm robotic interfaces.

GR Sep 29, 2021
SliceHub: Augmenting Shared 3D Model Repositories with Slicing Results for 3D Printing

Faraz Faruqi, Kenneth Friedman, Leon Cheng et al.

In this paper, we explore how to augment shared 3D model repositories, such as Thingiverse, with slicing results that are readily available to all users. By having print time and material consumption for different print resolution profiles and model scales available in real-time, users are able to explore different slicing configurations efficiently to find the one that best fits their time and material constraints. To prototype this idea, we build a system called SliceHub, which consists of three components: (1) a repository with an evolving database of 3D models, for which we store the print time and material consumption for various print resolution profiles and model scales, (2) a user interface integrated into an existing slicer that allows users to explore the slicing information from the 3D models, and (3) a computational infrastructure to quickly generate new slicing results, either through parallel slicing of multiple print resolution profiles and model scales or through interpolation. We motivate our work with a formative study of the challenges faced by users of existing slicers and provide a technical evaluation of the SliceHub system.
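The interpolation idea in component (3) can be sketched as follows: given print times already sliced at a few model scales, estimate the time at an unsliced scale. This is a minimal sketch that assumes plain linear interpolation between the two nearest sliced scales; the abstract does not say which interpolation scheme SliceHub uses, and the `interpolate` function and the sample numbers are hypothetical.

```python
from bisect import bisect_left

def interpolate(known, scale):
    """Linearly interpolate a value at `scale` from sorted (scale, value) pairs."""
    scales = [s for s, _ in known]
    if not scales[0] <= scale <= scales[-1]:
        raise ValueError("scale outside the sliced range")
    i = bisect_left(scales, scale)
    if scales[i] == scale:
        # Exact match: this scale was already sliced.
        return known[i][1]
    (s0, v0), (s1, v1) = known[i - 1], known[i]
    return v0 + (v1 - v0) * (scale - s0) / (s1 - s0)

# Hypothetical print times in minutes at three model scales.
print_time_min = [(0.5, 20.0), (1.0, 80.0), (2.0, 400.0)]
estimate = interpolate(print_time_min, 1.5)
print(estimate)
```

Because print time grows nonlinearly with scale, a linear estimate between widely spaced samples can be noticeably off; denser samples or a curve fit would likely reduce that error.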

LG Jun 11, 2020
Calendar Graph Neural Networks for Modeling Time Structures in Spatiotemporal User Behaviors

Daheng Wang, Meng Jiang, Munira Syed et al.

User behavior modeling is important for industrial applications such as demographic attribute prediction, content recommendation, and targeted advertising. Existing methods represent a behavior log as a sequence of adopted items and find sequential patterns; however, concrete location and time information in the behavior log, reflecting dynamic and periodic patterns jointly with the spatial dimension, can be useful for modeling users and predicting their characteristics. In this work, we propose a novel model based on graph neural networks for learning user representations from spatiotemporal behavior data. A behavior log comprises a sequence of sessions, and a session has a location, start time, end time, and a sequence of adopted items. Our model's architecture incorporates two networked structures. One is a tripartite network of items, sessions, and locations. The other is a hierarchical calendar network of hour, week, and weekday nodes. The model first aggregates embeddings of location and items into session embeddings via the tripartite network, and then generates user embeddings from the session embeddings via the calendar structure. The user embeddings preserve spatial patterns and temporal patterns of various periodicities (e.g., hourly, weekly, and weekday patterns). The model adopts the attention mechanism to capture complex interactions among the multiple patterns in user behaviors. Experiments on real datasets (i.e., clicks on news articles in a mobile app) show our approach outperforms strong baselines for predicting missing demographic attributes.

LG Feb 22, 2019
MPP: Model Performance Predictor

Sindhu Ghanta, Sriram Subramanian, Lior Khermosh et al.

Operations is a key challenge in the domain of machine learning pipeline deployments involving monitoring and management of real-time prediction quality. Typically, metrics such as accuracy and RMSE are used to track the performance of models in deployment. However, these metrics cannot be calculated in production due to the absence of labels. We propose using an ML algorithm, Model Performance Predictor (MPP), to track the performance of the models in deployment. We argue that an ensemble of such metrics can be used to create a score representing the prediction quality in production. This in turn facilitates formulation and customization of ML alerts that can be escalated by an operations team to the data science team. Such a score automates monitoring and enables ML deployments at scale.
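One way to read the idea of predicting performance without production labels is a secondary model, trained on labeled held-out data, that predicts whether the primary model's output is correct; averaging its predictions over unlabeled production inputs then gives an accuracy estimate. The sketch below is a toy illustration under that assumption. The threshold classifier, the nearest-neighbour error predictor, and all names are hypothetical and are not the authors' implementation.

```python
def primary_model(x):
    """Deployed classifier (toy stand-in): predicts 1 when x > 0.5."""
    return 1 if x > 0.5 else 0

def true_label(x):
    """Ground truth, available only for the labeled held-out set."""
    return 1 if x > 0.6 else 0

# Labeled held-out data: record whether the primary model was right on each point.
held_out = [i / 100 for i in range(100)]
correctness = [(x, int(primary_model(x) == true_label(x))) for x in held_out]

def predict_correct(x, k=3):
    """Toy error predictor: majority vote over the k nearest held-out points."""
    nearest = sorted(correctness, key=lambda pair: abs(pair[0] - x))[:k]
    return int(sum(c for _, c in nearest) * 2 > k)

# Unlabeled production inputs: estimate accuracy without any labels.
production = [0.10, 0.55, 0.90, 0.58]
estimated_accuracy = sum(predict_correct(x) for x in production) / len(production)
print(estimated_accuracy)
```

In this toy setup the primary model is wrong only for inputs between 0.5 and 0.6, so the estimate drops when production inputs concentrate in that region, which is the kind of signal an alerting rule could threshold on.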

LG Feb 7, 2019
ML Health: Fitness Tracking for Production Models

Sindhu Ghanta, Sriram Subramanian, Lior Khermosh et al.

Deployment of machine learning (ML) algorithms in production for extended periods of time has uncovered new challenges such as monitoring and management of real-time prediction quality of a model in the absence of labels. However, such tracking is imperative to prevent catastrophic business outcomes resulting from incorrect predictions. The scale of these deployments makes manual monitoring prohibitive, so automated techniques to track performance and raise alerts are essential. We present a framework, ML Health, for tracking potential drops in the predictive performance of ML models in the absence of labels. The framework employs diagnostic methods to generate alerts for further investigation. We develop one such method to monitor potential problems when production data patterns do not match training data distributions. We demonstrate that our method performs better than standard "distance metrics", such as RMSE, KL-Divergence, and Wasserstein, at detecting issues with mismatched data sets. Finally, we present a working system that incorporates the ML Health approach to monitor and manage ML deployments within a realistic full production ML lifecycle.
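For context, the KL-Divergence baseline named above can be computed between histograms of a training feature and the same feature in production. The sketch below shows that baseline only, not the ML Health method, which the abstract does not describe in detail; the bin edges, add-one smoothing, and sample values are assumptions chosen for illustration.

```python
import math
from collections import Counter

def histogram(values, edges):
    """Return add-one-smoothed bin probabilities for `values` over `edges`."""
    counts = Counter()
    for v in values:
        # Bin index = number of interior edges at or below v (out-of-range values clamp).
        idx = sum(1 for e in edges[1:-1] if v >= e)
        counts[idx] += 1
    n_bins = len(edges) - 1
    total = len(values) + n_bins  # smoothing keeps every bin probability nonzero
    return [(counts[i] + 1) / total for i in range(n_bins)]

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions with nonzero entries."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

edges = [0.0, 0.25, 0.5, 0.75, 1.0]
train = histogram([0.1, 0.2, 0.3, 0.4], edges)       # hypothetical training feature
production = histogram([0.6, 0.7, 0.8, 0.9], edges)  # hypothetical shifted feature

drift = kl_divergence(train, production)
print(drift)
```

A score of zero means the two histograms match; larger values indicate a bigger mismatch, and a monitoring rule would compare the score against a threshold chosen from historical data.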