Analyzing Visual Representations in Embodied Navigation Tasks
This work addresses representation specialization in embodied AI and offers useful insights for researchers, though the contribution is incremental because it builds on existing analysis methods.
The paper studies why deep reinforcement learning representations become overly specialized to specific tasks. It finds that slight task differences have no measurable effect on visual representations in embodied navigation, and it demonstrates effective transfer between tasks.
Recent advances in deep reinforcement learning require a large amount of training data and generally result in representations that are overspecialized to the target task. In this work, we present a methodology to study the potential underlying causes of this specialization. We use the recently proposed projection-weighted Canonical Correlation Analysis (PWCCA) to measure the similarity of visual representations learned in the same environment while performing different tasks. We then leverage our proposed methodology to examine the task dependence of visual representations learned on related but distinct embodied navigation tasks. Surprisingly, we find that slight differences in task have no measurable effect on the visual representation for both SqueezeNet and ResNet architectures. We then empirically demonstrate that visual representations learned on one task can be effectively transferred to a different task.
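To make the similarity measure concrete, the following is a minimal NumPy sketch of PWCCA between two activation matrices. It is an illustration under stated assumptions, not the authors' implementation: the function name `pwcca_similarity` is hypothetical, both inputs are assumed to hold activations for the same observations (one row per observation, one column per neuron), and each matrix is assumed to have more rows than columns and full column rank.

```python
import numpy as np


def pwcca_similarity(X, Y):
    """Projection-weighted CCA similarity between two representations.

    X: (n_samples, d1) activations of the first network (assumed layout).
    Y: (n_samples, d2) activations of the second network on the same inputs.
    Assumes n_samples > max(d1, d2) and full column rank for both matrices.
    """
    # Center each neuron's activations across samples.
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)

    # Orthonormal bases for the column spaces of X and Y.
    Qx, _ = np.linalg.qr(X)
    Qy, _ = np.linalg.qr(Y)

    # Singular values of Qx^T Qy are the canonical correlations.
    U, rho, _ = np.linalg.svd(Qx.T @ Qy, full_matrices=False)
    rho = np.clip(rho, 0.0, 1.0)  # guard against round-off above 1

    # Canonical variables of X, one per column, shape (n_samples, k).
    H = Qx @ U

    # Weight each canonical direction by how much of X it accounts for:
    # the summed absolute projection of every neuron onto that direction.
    alpha = np.abs(H.T @ X).sum(axis=1)
    alpha = alpha / alpha.sum()

    # Weighted mean of the canonical correlations.
    return float(np.sum(alpha * rho))
```

Because the weights are computed from `X` only, this sketch is not symmetric in its arguments. As a sanity check, comparing a representation with an invertible linear transform of itself should give a value close to 1 up to numerical error, while two independent random representations with many more samples than neurons should score much lower.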