RO, CV, LG, IV | Dec 9, 2025

Zero-Splat TeleAssist: A Zero-Shot Pose Estimation Framework for Semantic Teleoperation

arXiv:2512.08271v1
Originality: Synthesis-oriented
AI Analysis

This work addresses the need for interaction-centric teleoperation in robotics, though it appears incremental because it combines existing components such as vision-language segmentation and 3D Gaussian Splatting.

The authors tackle multilateral teleoperation by transforming CCTV streams into a shared 6-DoF world model, achieving real-time global pose estimation for multiple robots without fiducials or depth sensors.

We introduce Zero-Splat TeleAssist, a zero-shot sensor-fusion pipeline that transforms commodity CCTV streams into a shared, 6-DoF world model for multilateral teleoperation. By integrating vision-language segmentation, monocular depth, weighted-PCA pose extraction, and 3D Gaussian Splatting (3DGS), TeleAssist provides every operator with real-time global positions and orientations of multiple robots without fiducials or depth sensors in an interaction-centric teleoperation setup.
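The abstract names weighted-PCA pose extraction but does not spell it out. As a rough illustration only, the sketch below assumes that a robot's segmented pixels have already been back-projected to 3D points using monocular depth, and that each point carries a confidence weight. The function name `weighted_pca_pose`, its inputs, and the choice of weights are hypothetical and are not taken from the paper.

```python
import numpy as np


def weighted_pca_pose(points, weights):
    """Estimate a position and an orientation frame from weighted 3D points.

    points:  (N, 3) array of 3D points (assumed: back-projected mask pixels).
    weights: (N,) non-negative per-point confidences (assumed input).
    Returns (centroid, R). The columns of R are the principal axes, ordered
    from largest to smallest variance, arranged as a right-handed frame.
    """
    points = np.asarray(points, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalise so the weights sum to 1

    # Weighted mean gives the position estimate.
    centroid = w @ points
    centered = points - centroid

    # Weighted 3x3 covariance matrix.
    cov = (centered * w[:, None]).T @ centered

    # eigh returns eigenvalues in ascending order; reorder to descending.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    R = eigvecs[:, order]

    # Flip the last axis if needed so that det(R) = +1 (a proper rotation).
    if np.linalg.det(R) < 0:
        R[:, 2] *= -1.0
    return centroid, R


# Toy point set elongated along x and shifted to (1, 2, 3).
pts = np.array([
    [-2.0, 0.0, 0.0], [2.0, 0.0, 0.0],
    [0.0, -1.0, 0.0], [0.0, 1.0, 0.0],
    [0.0, 0.0, -0.5], [0.0, 0.0, 0.5],
]) + np.array([1.0, 2.0, 3.0])
centroid, R = weighted_pca_pose(pts, np.ones(len(pts)))
```

Principal axes are only defined up to sign, so a sketch like this yields an orientation with a heading ambiguity. A practical system would need an extra cue, such as motion direction or a known asymmetric feature, to resolve it.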

