Deep Episodic Memory: Encoding, Recalling, and Predicting Episodic Experiences for Robot Action Execution
This paper addresses the problem of enabling robots to learn from past experiences for action execution, and represents an incremental advance in robotics and AI.
The paper tackles the problem of representing robot experiences in episodic memory. It proposes a deep neural network architecture that encodes, recalls, and predicts actions, shows that conceptually similar actions map to the same region of the latent space, and benchmarks action matching and retrieval on two large-scale datasets, 20BN-something-something and ActivityNet.
We present a novel deep neural network architecture for representing robot experiences in an episodic-like memory which facilitates encoding, recalling, and predicting action experiences. Our proposed unsupervised deep episodic memory model 1) encodes observed actions in a latent vector space and, based on this latent encoding, 2) infers the most similar episodes previously experienced, 3) reconstructs the original episodes, and 4) predicts future frames in an end-to-end fashion. Results show that conceptually similar actions are mapped into the same region of the latent vector space. Based on these results, we introduce an action matching and retrieval mechanism, benchmark its performance on two large-scale action datasets, 20BN-something-something and ActivityNet, and evaluate its generalization capability in a real-world scenario on a humanoid robot.
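
The matching and retrieval step described in the abstract can be illustrated with a minimal sketch. It assumes that each stored episode has already been encoded into a fixed-length latent vector and that closeness is measured with cosine similarity. The abstract names neither the similarity measure nor any implementation detail, so the metric, the function names `cosine_similarity` and `retrieve_similar`, and the toy vectors below are all hypothetical and serve only to show the idea of nearest-neighbor recall in a latent space.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve_similar(query, memory, k=1):
    """Return (index, similarity) pairs for the k stored latent vectors
    most similar to the query, best match first."""
    scored = [(cosine_similarity(query, z), i) for i, z in enumerate(memory)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(i, score) for score, i in scored[:k]]


# Toy episodic memory: hypothetical latent codes of previously seen episodes.
memory = [
    [1.0, 0.0, 0.0],  # episode 0
    [0.7, 0.7, 0.0],  # episode 1
    [0.0, 1.0, 0.0],  # episode 2
    [0.0, 0.0, 1.0],  # episode 3
]

# Hypothetical latent code of a newly observed action.
query = [1.0, 0.05, 0.0]

matches = retrieve_similar(query, memory, k=2)
for index, score in matches:
    print(f"episode {index}: similarity {score:.3f}")
```

In this toy setup the query lies closest to episode 0, followed by episode 1. In the model described above, the latent vectors would instead come from the learned encoder, and a retrieved episode could then be reconstructed or used for future-frame prediction.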