Leonardo Militano

RO
h-index: 29
4 papers
34 citations
Novelty: 23%
AI Score: 33

4 Papers

RO Nov 5, 2025 Code
ROSBag MCP Server: Analyzing Robot Data with LLMs for Agentic Embodied AI Applications

Lei Fu, Sahar Salimpour, Leonardo Militano et al.

Agentic AI systems and Physical or Embodied AI systems have been two key research verticals at the forefront of Artificial Intelligence and Robotics, with Model Context Protocol (MCP) increasingly becoming a key component and enabler of agentic applications. However, the literature at the intersection of these verticals, i.e., Agentic Embodied AI, remains scarce. This paper introduces an MCP server for ROS and ROS 2 bags that allows analyzing, visualizing and processing robot data with natural language through LLMs and VLMs. We describe specific tooling built with robotics domain knowledge, with our initial release focused on mobile robotics and natively supporting the analysis of trajectories, laser scan data, transforms, or time series data. This is in addition to providing an interface to standard ROS 2 CLI tools ("ros2 bag list" or "ros2 bag info"), as well as the ability to filter bags to a subset of topics or trim them in time. Coupled with the MCP server, we provide a lightweight UI that allows the benchmarking of the tooling with different LLMs, both proprietary (Anthropic, OpenAI) and open-source (through Groq). Our experimental results include the analysis of tool calling capabilities of eight different state-of-the-art LLMs/VLMs, both proprietary and open-source, large and small. Our experiments indicate that there is a large divide in tool calling capabilities, with Kimi K2 and Claude Sonnet 4 demonstrating clearly superior performance. We also conclude that there are multiple factors affecting the success rates, from the tool description schema to the number of arguments, as well as the number of tools available to the models. The code is available with a permissive license at https://github.com/binabik-ai/mcp-rosbags.

RO Oct 8, 2022
Cloud Native Robotic Applications with GPU Sharing on Kubernetes

Giovanni Toffetti, Leonardo Militano, Seán Murphy et al.

In this paper we discuss our experience in teaching the Robotic Applications Programming course at ZHAW combining the use of a Kubernetes (k8s) cluster and real, heterogeneous, robotic hardware. We discuss the main advantages of our solutions in terms of seamless simulation-to-real experience for students and the main shortcomings we encountered with networking and sharing GPUs to support deep learning workloads. We describe the current and foreseen alternatives to avoid these drawbacks in future course editions and propose a more cloud-native approach to deploying multiple robotics applications on a k8s cluster.

RO Aug 7, 2025
Towards Embodied Agentic AI: Review and Classification of LLM- and VLM-Driven Robot Autonomy and Interaction

Sahar Salimpour, Lei Fu, Kajetan Rachwał et al.

Foundation models, including large language models (LLMs) and vision-language models (VLMs), have recently enabled novel approaches to robot autonomy and human-robot interfaces. In parallel, vision-language-action models (VLAs) or large behavior models (LBMs) are increasing the dexterity and capabilities of robotic systems. This survey paper reviews works that advance agentic applications and architectures, including initial efforts with GPT-style interfaces and more complex systems where AI agents function as coordinators, planners, perception actors, or generalist interfaces. Such agentic architectures allow robots to reason over natural language instructions, invoke APIs, plan task sequences, or assist in operations and diagnostics. In addition to peer-reviewed research, due to the fast-evolving nature of the field, we highlight and include community-driven projects, ROS packages, and industrial frameworks that show emerging trends. We propose a taxonomy for classifying model integration approaches and present a comparative analysis of the role that agents play in different solutions in today's literature.

NI Apr 26, 2015
Efficient Spectrum Management Exploiting D2D Communication in 5G Systems

Leonardo Militano, Antonino Orsino, Giuseppe Araniti et al.

In the future standardization of 5G networks, in Long Term Evolution (LTE) Release 13 and beyond, Device-to-Device (D2D) communication is recognized as one of the key technologies that will support the 5G architecture. In fact, D2D can be exploited for different proximity-based services (ProSe) where users discover their neighbors and benefit from services such as social applications, advertisement, public safety, and warning messages. In such a scenario, the aim is to properly manage the radio spectrum and the energy consumption to provide high Quality of Experience (QoE) and better Quality of Service (QoS). To reach this goal, in this paper we propose a novel D2D-based uploading scheme that decreases the amount of radio resources needed to upload a given multimedia content to the eNodeB. As a further improvement, the proposed scheme improves the energy consumption of the users in the network without affecting the content uploading time. The obtained results show that our scheme achieves a gain of about 35% in terms of mean radio resources used with respect to the standard LTE cellular approach. In addition, it is also 40 times more efficient in terms of the energy consumption needed to upload the multimedia content.