HoME: a Household Multimodal Environment
HoME provides a scalable, extensible platform for researchers in reinforcement learning, language grounding, and robotics to train agents in interactive, multimodal settings. The contribution is somewhat incremental, since it builds on existing datasets such as SUNCG.
The authors address the problem of building a realistic multimodal environment for artificial agents. They introduce HoME, which integrates over 45,000 diverse 3D house layouts to support learning, generalization, and transfer across vision, audio, semantics, physics, and interaction.
We introduce HoME: a Household Multimodal Environment for artificial agents to learn from vision, audio, semantics, physics, and interaction with objects and other agents, all within a realistic context. HoME integrates over 45,000 diverse 3D house layouts based on the SUNCG dataset, a scale which may facilitate learning, generalization, and transfer. HoME is an open-source, OpenAI Gym-compatible platform extensible to tasks in reinforcement learning, language grounding, sound-based navigation, robotics, multi-agent learning, and more. We hope HoME better enables artificial agents to learn as humans do: in an interactive, multimodal, and richly contextualized setting.
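To make "OpenAI Gym-compatible" concrete, the sketch below shows the classic Gym interaction pattern, in which `reset()` returns an observation and `step(action)` returns an `(observation, reward, done, info)` tuple. It is an illustration only and does not use the HoME API: `ToyHouseholdEnv`, its action names, its observation dictionary, and `run_episode` are all hypothetical stand-ins. Newer Gym and Gymnasium releases return five values from `step`, so the exact tuple shape depends on the version in use.

```python
class ToyHouseholdEnv:
    """Hypothetical stand-in environment, NOT part of HoME.

    It only mimics the classic Gym reset/step convention so the
    agent-environment loop can be shown without external dependencies.
    """

    ACTIONS = ("forward", "turn_left", "turn_right")

    def __init__(self, goal_distance=3, max_steps=20):
        self.goal_distance = goal_distance
        self.max_steps = max_steps
        self.position = 0
        self.steps = 0

    def reset(self):
        # Start a new episode and return the first observation.
        self.position = 0
        self.steps = 0
        return self._observe()

    def step(self, action):
        if action not in self.ACTIONS:
            raise ValueError(f"unknown action: {action!r}")
        self.steps += 1
        if action == "forward":
            self.position += 1
        reached = self.position >= self.goal_distance
        # The episode ends on success or when the step budget runs out.
        done = reached or self.steps >= self.max_steps
        reward = 1.0 if reached else 0.0
        return self._observe(), reward, done, {"steps": self.steps}

    def _observe(self):
        # A real multimodal environment would return images, audio, and so on;
        # this toy returns a single scalar feature instead.
        return {"distance_to_goal": self.goal_distance - self.position}


def run_episode(env, policy):
    """Run one episode with the given policy; return (total reward, steps taken)."""
    obs = env.reset()
    total_reward = 0.0
    done = False
    while not done:
        obs, reward, done, info = env.step(policy(obs))
        total_reward += reward
    return total_reward, info["steps"]


if __name__ == "__main__":
    always_forward = lambda obs: "forward"
    print(run_episode(ToyHouseholdEnv(), always_forward))
```

Because the loop in `run_episode` depends only on `reset` and `step`, the same agent code could in principle be pointed at any environment that follows this convention, which is the practical benefit of Gym compatibility.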