Vision-based Control of a Quadrotor in User Proximity: Mediated vs End-to-End Learning Approaches
This work addresses a domain-specific robotics control problem, but its findings are incremental: they show no performance difference between two established learning paradigms.
The study tackled the problem of controlling a quadrotor to hover in front of a moving user using an onboard camera, comparing mediated and end-to-end learning approaches, and found that both methods yielded equivalent performance on this specific task.
We consider the task of controlling a quadrotor to hover in front of a freely moving user, using input data from an onboard camera. On this specific task we compare two widespread learning paradigms: a mediated approach, which learns a high-level state from the input and then uses it to derive control signals; and an end-to-end approach, which skips high-level state estimation altogether. We show that, despite their fundamental difference, both approaches yield equivalent performance on this task. Finally, we qualitatively analyze the behavior of a quadrotor implementing such approaches.
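The structural difference between the two paradigms can be illustrated with a minimal Python sketch. It is not the models used in the study: the linear maps stand in for learned networks, and the function names, the 4-dimensional state and control vectors, the proportional controller, and its gain are all assumptions made for illustration.

```python
import numpy as np

STATE_DIM = 4    # assumed high-level state, e.g. user position and heading
CONTROL_DIM = 4  # assumed control signal dimension


def estimate_state(image, perception_weights):
    """Hypothetical learned perception model: image -> high-level state."""
    return perception_weights @ image.ravel()


def controller(state, target, gain=1.0):
    """Hypothetical hand-designed proportional controller: state error -> control."""
    return gain * (state - target)


def mediated_policy(image, perception_weights, target, gain=1.0):
    """Mediated approach: estimate a high-level state, then derive control from it."""
    state = estimate_state(image, perception_weights)
    return controller(state, target, gain)


def end_to_end_policy(image, policy_weights):
    """End-to-end approach: map the image directly to a control signal."""
    return policy_weights @ image.ravel()
```

In this sketch the mediated policy exposes an interpretable intermediate quantity (the estimated state), whereas the end-to-end policy does not; both return a control vector of the same shape for the same camera frame.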