By scaling parallel environments with GPU simulation, MS-HAB achieves 4000 samples per second on a benchmark involving representative interaction with dynamic objects, which is 3× the throughput of Habitat 2.0's implementation at similar GPU memory usage.
MS-HAB environments support realistic low-level control for successful grasping, manipulation, and interaction, whereas the Habitat 2.0 environments do not support this kind of low-level control.
This means MS-HAB is fast enough to support online training and efficient, extensive evaluation without sacrificing physical realism.
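
To make the throughput figure concrete, the following is a minimal sketch of how samples per second is commonly computed for a batch of parallel environments: the number of environments times the number of steps, divided by wall-clock time. The function names `samples_per_second` and `benchmark`, and the `step_fn` callback, are hypothetical illustrations and are not part of the MS-HAB or Habitat 2.0 APIs; the exact benchmarking protocol may differ.

```python
import time
from typing import Callable


def samples_per_second(num_envs: int, steps: int, elapsed_s: float) -> float:
    """Throughput of a batched simulator: each step yields one sample per environment."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return num_envs * steps / elapsed_s


def benchmark(step_fn: Callable[[], None], num_envs: int, steps: int) -> float:
    """Time `steps` calls of a (hypothetical) batched step function and report throughput."""
    start = time.perf_counter()
    for _ in range(steps):
        step_fn()  # assumed to advance all `num_envs` environments by one step
    elapsed = time.perf_counter() - start
    return samples_per_second(num_envs, steps, elapsed)


# Illustrative numbers only: 1000 environments stepped 40 times in 10 s.
print(samples_per_second(num_envs=1000, steps=40, elapsed_s=10.0))
```

Under this definition, a 3× speedup at 4000 samples per second implies a baseline of roughly 4000 / 3 ≈ 1333 samples per second on the same benchmark.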