CV · Dec 3, 2019

Integrating Motion into Vision Models for Better Visual Prediction

arXiv:1912.01661v1
Originality: Incremental advance
AI Analysis

This work addresses a specific problem in robotics and computer vision: visual prediction in systems that use pan-tilt cameras. It appears incremental because it builds on prior Predictive Vision Model work.

The paper tackles impaired visual prediction and undesired feedback effects in camera movement by integrating camera motion into a self-supervised predictive learning model, which improves both visual prediction and saccadic behavior.

We demonstrate an improved vision system that learns a model of its environment using a self-supervised, predictive learning method. The system includes a pan-tilt camera, a foveated visual input, a saccading reflex to servo the foveated region to areas of high prediction error, input frame transformation synced to the camera motion, and a recursive, hierarchical machine learning technique based on the Predictive Vision Model. In earlier work, which did not integrate camera motion into the vision model, prediction was impaired and camera movement suffered from undesired feedback effects. Here we detail the integration of camera motion into the predictive learning system and show improved visual prediction and saccadic behavior. From these experiences, we speculate on the integration of additional sensory and motor systems into self-supervised, predictive learning models.
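
To make two of the ideas in the abstract concrete, here is a minimal sketch: picking a saccade target from a prediction-error map, and shifting a frame to compensate for a known camera move. It is not the paper's implementation. The function names, the fixed-size fovea window, and the whole-pixel translation are all assumptions made for illustration. A real pan-tilt rig would likely need a projective warp, not a simple shift.

```python
# Illustrative sketch only: names and the pixel-shift model are assumptions,
# not taken from the paper.

def prediction_error(predicted, actual):
    """Per-pixel absolute error between two equal-sized 2D grayscale frames."""
    return [[abs(p - a) for p, a in zip(prow, arow)]
            for prow, arow in zip(predicted, actual)]


def saccade_target(error_map, fovea=2):
    """Top-left (row, col) of the fovea-sized window with the largest summed error."""
    rows, cols = len(error_map), len(error_map[0])
    best_total, best_pos = -1.0, (0, 0)
    for r in range(rows - fovea + 1):
        for c in range(cols - fovea + 1):
            total = sum(error_map[r + i][c + j]
                        for i in range(fovea) for j in range(fovea))
            if total > best_total:
                best_total, best_pos = total, (r, c)
    return best_pos


def shift_frame(frame, d_row, d_col, fill=0.0):
    """Translate a frame by whole pixels (assumed camera-motion model).

    Pixels with no source in the original frame are set to `fill`.
    """
    rows, cols = len(frame), len(frame[0])
    out = [[fill] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            src_r, src_c = r - d_row, c - d_col
            if 0 <= src_r < rows and 0 <= src_c < cols:
                out[r][c] = frame[src_r][src_c]
    return out
```

In this toy setup, the previous frame would be passed through `shift_frame` with the commanded camera displacement before it is compared with the new frame. That way, motion caused by the camera itself does not show up as prediction error and trigger further saccades. This is one plausible reading of the feedback effect the abstract describes.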

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.