Visual Navigation with Multiple Goals based on Deep Reinforcement Learning

Zhenhuan Rao1, Yuechen Wu1, Zifei Yang1, Wei Zhang1*, Shijian Lu2, Weizhi Lu1, and Zhengjun Zha3
1School of Control Science and Engineering, Shandong University  
2School of Computer Science and Engineering, Nanyang Technological University  
3School of Information Science and Technology, University of Science and Technology of China  

Abstract

Learning to adapt to a series of different goals in visual navigation is challenging. In this work, we present a model-embedded actor-critic architecture for the multi-goal visual navigation task. To enhance cooperation among tasks in multi-goal learning, we introduce two new designs into the reinforcement learning scheme: an inverse dynamics model (InvDM) and multi-goal co-learning (MgCl). Specifically, InvDM captures the navigation-relevant association between state and goal, and provides additional training signals to alleviate the sparse-reward issue. MgCl improves sample efficiency by enabling the agent to learn from unintentional positive experiences. In addition, to further improve the scene generalization capability of the agent, we present an enhanced navigation model with two self-supervised auxiliary task modules. The first, path closed-loop detection, helps the agent recognize whether a state has been visited before. The second, the state-target matching module, learns to identify the difference between the current state and the goal. Extensive experiments on the interactive platform AI2-THOR demonstrate that an agent trained with the proposed method converges faster than state-of-the-art methods while exhibiting good generalization capability.
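
To make the InvDM idea concrete, below is a minimal, hypothetical PyTorch sketch of an inverse dynamics auxiliary head: given the feature embeddings of two consecutive observations, it predicts the action taken between them, providing a supervised signal that does not depend on the sparse navigation reward. All module and variable names here are illustrative assumptions, not the paper's exact implementation.

```python
# Hypothetical sketch of an inverse dynamics auxiliary head (InvDM-style).
# Shapes, names, and the hidden size are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class InverseDynamicsHead(nn.Module):
    def __init__(self, feat_dim: int, num_actions: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * feat_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, num_actions),
        )

    def forward(self, phi_t: torch.Tensor, phi_t1: torch.Tensor) -> torch.Tensor:
        # Concatenate features of s_t and s_{t+1}; predict logits over actions.
        return self.net(torch.cat([phi_t, phi_t1], dim=-1))

def invdm_loss(head: InverseDynamicsHead,
               phi_t: torch.Tensor,
               phi_t1: torch.Tensor,
               actions: torch.Tensor) -> torch.Tensor:
    # Cross-entropy between predicted and executed actions; in an
    # actor-critic setup this would be added as an auxiliary loss term.
    logits = head(phi_t, phi_t1)
    return F.cross_entropy(logits, actions)
```

Under these assumptions, the auxiliary loss would simply be weighted and summed with the actor-critic objective during training, so the dense action-prediction signal shapes the shared state representation even when the navigation reward is rarely observed.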


Figure: Overview of our deep RL framework for visual navigation.

● Data & Code: Coming soon!

● Video: