IRIS: Learning-Driven Task-Specific Cinema Robot Arm for Visuomotor Motion Control
IRIS addresses the limited adoption of robotic camera systems in cinema, which stems from the cost and operational complexity of industrial-grade platforms, by offering a low-cost, learning-based alternative.
The authors develop IRIS, a 6-DOF manipulator that learns cinematic camera motions from human demonstrations through a goal-conditioned visuomotor imitation learning framework. The resulting platform costs under $1,000 USD, supports a 1.5 kg payload, and achieves approximately 1 mm repeatability.
Robotic camera systems enable dynamic, repeatable motion beyond human capabilities, yet their adoption remains limited by the high cost and operational complexity of industrial-grade platforms. We present the Intelligent Robotic Imaging System (IRIS), a task-specific 6-DOF manipulator designed for autonomous, learning-driven cinematic motion control. IRIS integrates a lightweight, fully 3D-printed hardware design with a goal-conditioned visuomotor imitation learning framework based on Action Chunking with Transformers (ACT). The system learns object-aware and perceptually smooth camera trajectories directly from human demonstrations, eliminating the need for explicit geometric programming. The complete platform costs under $1,000 USD, supports a 1.5 kg payload, and achieves approximately 1 mm repeatability. Real-world experiments demonstrate accurate trajectory tracking, reliable autonomous execution, and generalization across diverse cinematic motions.
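To make the action-chunking idea concrete, the following is a minimal sketch of how a chunked policy can be executed with temporal ensembling, the smoothing scheme described in the original ACT work. Whether IRIS uses temporal ensembling, and with what chunk size or decay, is not stated here, so those choices are assumptions. The names `temporal_ensemble`, `run_chunked_policy`, and `toy_policy` are hypothetical, and the toy policy stands in for a trained transformer that would map camera images and a goal to a chunk of joint commands.

```python
import math
from typing import Callable, Dict, List, Sequence

Action = List[float]  # one joint-space command, e.g. six joint targets


def temporal_ensemble(predictions: Sequence[Action], decay: float = 0.01) -> Action:
    """Blend several predictions for the same timestep, ordered oldest first.

    Weights follow w_i = exp(-decay * i), where i = 0 is the oldest prediction.
    The decay value is an illustrative assumption, not a value from the paper.
    """
    weights = [math.exp(-decay * i) for i in range(len(predictions))]
    total = sum(weights)
    dims = len(predictions[0])
    return [
        sum(w * p[d] for w, p in zip(weights, predictions)) / total
        for d in range(dims)
    ]


def run_chunked_policy(
    policy: Callable[[int], List[Action]],
    num_steps: int,
    chunk_size: int,
    decay: float = 0.01,
) -> List[Action]:
    """Query the policy every step and execute the ensembled action."""
    pending: Dict[int, List[Action]] = {}  # timestep -> predictions, oldest first
    executed: List[Action] = []
    for t in range(num_steps):
        chunk = policy(t)  # chunk[k] is the predicted action for step t + k
        for k, action in enumerate(chunk[:chunk_size]):
            pending.setdefault(t + k, []).append(action)
        executed.append(temporal_ensemble(pending.pop(t), decay))
    return executed


def toy_policy(t: int) -> List[Action]:
    """Hypothetical stand-in for a trained model: predicts a linear joint ramp."""
    return [[float(t + k)] * 6 for k in range(4)]


if __name__ == "__main__":
    trajectory = run_chunked_policy(toy_policy, num_steps=5, chunk_size=4)
    for step, action in enumerate(trajectory):
        print(step, action)
```

Because each executed command averages overlapping chunks predicted at different times, disagreements between consecutive predictions are blended rather than applied abruptly, which is one plausible route to the perceptually smooth trajectories the abstract describes.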