
3.4.2 Habitat


Habitat was designed and built to provide maximum customizability in terms of the datasets that can be used and how the agents and the environment are configured. Accordingly, Habitat works with all the major 3D environment datasets without a problem. Moreover, it is extremely fast compared with other simulators: AI2‐THOR and CHALET reach roughly 10 fps, MINOS and Gibson reach around one hundred, and House3D yields 300 fps in the best case, whereas Habitat can reach up to 10 000 fps. It also provides a more realistic collision model in which, if a collision happens, the agent is moved only partially, or not at all, in the intended direction.
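As an illustration of this configurability, the sketch below sets up a single agent with RGB and depth sensors in a scene using the open‐source habitat‐sim Python API. It is a minimal sketch only: the scene path is a placeholder, and exact class and attribute names may differ slightly between habitat‐sim versions.

```python
import habitat_sim

# Backend configuration: which 3D scene to load (path is a placeholder/assumption)
sim_cfg = habitat_sim.SimulatorConfiguration()
sim_cfg.scene_id = "data/scene_datasets/gibson/ExampleScene.glb"

# RGB and depth sensors attached to the agent
rgb_spec = habitat_sim.CameraSensorSpec()
rgb_spec.uuid = "rgb"
rgb_spec.sensor_type = habitat_sim.SensorType.COLOR
rgb_spec.resolution = [256, 256]

depth_spec = habitat_sim.CameraSensorSpec()
depth_spec.uuid = "depth"
depth_spec.sensor_type = habitat_sim.SensorType.DEPTH
depth_spec.resolution = [256, 256]

agent_cfg = habitat_sim.agent.AgentConfiguration()
agent_cfg.sensor_specifications = [rgb_spec, depth_spec]

# Create the simulator and step the agent; on collision the backend moves the
# agent only as far as the obstacle allows rather than letting it pass through
sim = habitat_sim.Simulator(habitat_sim.Configuration(sim_cfg, [agent_cfg]))
observations = sim.step("move_forward")
rgb_image = observations["rgb"]
depth_map = observations["depth"]
sim.close()
```

Because the same agent and sensor configuration can be pointed at Gibson, Matterport3D, or other supported scene datasets simply by changing the scene path, the simulator itself stays dataset‐agnostic.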

To benchmark Habitat, the authors employed a few naive algorithmic baselines, proximal policy optimization (PPO) [81] as the representative learning algorithm, and ORB‐SLAM2 [82, 83] as the chosen candidate for non‐learning agents, and tested them on the PointGoal navigation task on Gibson and Matterport3D. They used Success weighted by Path Length (SPL) [84] as the performance metric. The PPO agent was tested with different sensor configurations (e.g. no visual sensor, depth only, RGB only, and RGB‐D) as an ablation study to measure how much each sensor contributes to performance. The SLAM agent was given an RGB‐D sensor in all episodes.
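For reference, SPL averages a per‐episode success indicator weighted by the ratio of the shortest‐path length to the length of the path the agent actually took. The minimal sketch below follows the definition in [84]; the variable names are illustrative.

```python
def spl(successes, shortest_path_lengths, agent_path_lengths):
    """Success weighted by Path Length (SPL) over a set of episodes.

    successes             -- 1 if the agent stopped at the goal, else 0
    shortest_path_lengths -- geodesic distance from start to goal per episode
    agent_path_lengths    -- length of the path the agent actually traversed
    """
    total = 0.0
    for s, l, p in zip(successes, shortest_path_lengths, agent_path_lengths):
        # Each episode contributes at most 1, and only if it succeeded;
        # taking a longer-than-optimal path is penalized by l / max(p, l).
        total += s * (l / max(p, l))
    return total / len(successes)
```

An SPL of 1 therefore means the agent reached the goal along the shortest possible path in every episode, while failed episodes contribute nothing regardless of how far the agent traveled.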

The authors found that, first, PPO agents with only RGB perform as poorly as agents with no visual sensors. Second, all agents perform better and generalize more on Gibson than on Matterport3D, since the environments in the latter are larger. Third, agents with only depth sensors generalize best across datasets and achieve the highest SPL. Most importantly, they observed that, contrary to what had been reported in previous work, if the PPO agent is trained long enough, it eventually outperforms the traditional SLAM pipeline. This finding was possible only because the Habitat simulator was fast enough to train PPO agents for 75 million time steps, as opposed to only 5 million time steps in earlier investigations.

