ZSON: Zero-Shot Object-Goal Navigation using Multimodal Goal Embeddings

This is a PyTorch implementation of the NeurIPS-22 paper: https://arxiv.org/abs/2206.12403

Arjun Majumdar*, Gunjan Aggarwal*, Bhavika Devnani, Judy Hoffman and Dhruv Batra

Georgia Institute of Technology, Meta AI

Details

We present a scalable approach for learning open-world object-goal navigation (ObjectNav) – the task of asking a virtual robot (agent) to find any instance of an object in an unexplored environment (e.g., “find a sink”). Our approach is entirely zero-shot – i.e., it does not require ObjectNav rewards or demonstrations of any kind.

Model Architecture for ZSON.

Installation

All the required data can be downloaded from here.

  1. Create a conda environment:

    conda create -n zson python=3.7 cmake=3.14.0
    
    conda activate zson
    
  2. Install PyTorch 1.10.2:

    conda install pytorch==1.10.2 torchvision==0.11.3 cudatoolkit=11.3 -c pytorch -c conda-forge
    
  3. Install habitat-sim:

    conda install habitat-sim-challenge-2022 headless -c conda-forge -c aihabitat
    
  4. Install habitat-lab:

    git clone --branch challenge-2022 https://github.com/facebookresearch/habitat-lab.git habitat-lab-challenge-2022
    
    cd habitat-lab-challenge-2022
    
    pip install -r requirements.txt
    
    python setup.py develop --all # install habitat and habitat_baselines
    
    cd ..
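
Once the four steps above have finished, a quick import check can confirm that the environment resolved as intended. The sketch below is not part of the ZSON repository: the helper names `installed_version` and `report_environment` are illustrative, and `habitat_sim`, `habitat`, and `habitat_baselines` are assumed to be the import names of the packages installed above. The pinned versions are the ones from step 2.

```python
import importlib

# Versions pinned in the installation steps; None means "any version is fine".
# The habitat module names are assumptions about the packages' import names.
EXPECTED = {
    "torch": "1.10.2",
    "torchvision": "0.11.3",
    "habitat_sim": None,
    "habitat": None,
    "habitat_baselines": None,
}


def installed_version(module_name):
    """Return the module's __version__, "unknown" if it has none,
    or None if the module cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", "unknown")


def report_environment(expected=EXPECTED):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    for name, wanted in expected.items():
        found = installed_version(name)
        if found is None:
            problems.append(f"{name}: not importable")
        elif wanted is not None and not found.startswith(wanted):
            problems.append(f"{name}: found {found}, expected {wanted}")
    return problems
```

Calling `report_environment()` inside the activated `zson` environment and printing the result should show an empty list when every package is importable and the pinned versions match.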
    

Download and Install zson:

  1. Clone the repository and install it:

    git clone git@github.com:gunagg/zson.git
    
    cd zson
    
    pip install -r requirements.txt
    
    python setup.py develop
    
  2. Follow the instructions here to set up the data/scene_datasets/ directory. Gibson scenes can be found here.

  3. Download the HM3D ImageNav training dataset:

    wget https://huggingface.co/gunjan050/ZSON/resolve/main/imagenav_hm3d.zip
    
    unzip imagenav_hm3d.zip
    
    rm imagenav_hm3d.zip  # clean-up
    
  4. Download the MP3D ObjectNav dataset:

    wget https://dl.fbaipublicfiles.com/habitat/data/datasets/objectnav/m3d/v1/objectnav_mp3d_v1.zip
    
    mkdir -p data/datasets/objectnav/mp3d/v1
    
    unzip objectnav_mp3d_v1.zip -d data/datasets/objectnav/mp3d/v1
    
    rm objectnav_mp3d_v1.zip  # clean-up
    
  5. Download the HM3D ObjectNav dataset:

    wget https://dl.fbaipublicfiles.com/habitat/data/datasets/objectnav/hm3d/v1/objectnav_hm3d_v1.zip
    
    unzip objectnav_hm3d_v1.zip -d data/datasets/objectnav/
    
    rm objectnav_hm3d_v1.zip  # clean-up
    
  6. Download the trained checkpoints zson_conf_A.pth and zson_conf_B.pth, and move them to data/checkpoints/.

  7. To train policies using the OVRL pretrained RGB encoder, download the model weights from here and move them to data/models/. More details on the encoder can be found here.

  8. Set up data/goal_datasets using the script tools/extract-goal-features.py. This caches CLIP goal embeddings for faster training.

    Your directory structure should now look like this:

    .
    +-- habitat-lab-challenge-2022/
    |   ...
    +-- zson/
    |   +-- data/
    |   |   +-- datasets/
    |   |   |   +-- objectnav/
    |   |   |   +-- imagenav/
    |   |   +-- scene_datasets/
    |   |   |   +-- hm3d/
    |   |   |   +-- mp3d/
    |   |   +-- goal_datasets/
    |   |   |   +-- imagenav/
    |   |   |   |   +-- hm3d/
    |   |   +-- models/
    |   |   +-- checkpoints/
    |   +-- zson/
    |   ...
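
The layout above can be checked programmatically before launching a training job. This is a minimal sketch, assuming it is run from the zson repository root; `EXPECTED_DIRS` and `missing_dirs` are illustrative names that are not part of the repository, and the list contains only the directories shown in the tree.

```python
from pathlib import Path

# Directories shown in the tree above, relative to the zson/ repository root.
EXPECTED_DIRS = [
    "data/datasets/objectnav",
    "data/datasets/imagenav",
    "data/scene_datasets/hm3d",
    "data/scene_datasets/mp3d",
    "data/goal_datasets/imagenav/hm3d",
    "data/models",
    "data/checkpoints",
]


def missing_dirs(repo_root="."):
    """Return the expected directories that do not exist under repo_root."""
    root = Path(repo_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
```

An empty list from `missing_dirs()` means every directory in the tree is present. It does not verify that the datasets or checkpoints inside them are complete.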
    

Usage

ZSON configuration A ImageNav Training

sbatch scripts/imagenav-v1-hm3d-ovrl-rn50.sh

ZSON configuration B ImageNav Training

sbatch scripts/imagenav-v2-hm3d-ovrl-rn50.sh

ObjectNav Evaluation

To evaluate a checkpoint trained with ZSON, use the following command, replacing the placeholders with the desired configuration and dataset:

sbatch scripts/objnav-eval-$DESIRED-CONFIGURATION$-$DATASET$.sh
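
The placeholders `$DESIRED-CONFIGURATION$` and `$DATASET$` stand for values encoded in the script filenames. One way to see which combinations are available is to list the matching scripts rather than guess their names. A minimal sketch, assuming it is run from the zson repository root and that the evaluation scripts live in `scripts/` as the command above suggests; `list_eval_scripts` is an illustrative helper, not part of the repository.

```python
from pathlib import Path


def list_eval_scripts(scripts_dir="scripts"):
    """Return the sorted names of objnav-eval-*.sh scripts found in scripts_dir."""
    return sorted(p.name for p in Path(scripts_dir).glob("objnav-eval-*.sh"))
```

Each name it returns can be passed to `sbatch` with the `scripts/` prefix.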

Citation

If you use this code in your research, please consider citing:

@inproceedings{majumdar2022zson,
  title={ZSON: Zero-Shot Object-Goal Navigation using Multimodal Goal Embeddings},
  author={Majumdar, Arjun and Aggarwal, Gunjan and Devnani, Bhavika and Hoffman, Judy and Batra, Dhruv},
  booktitle={Neural Information Processing Systems (NeurIPS)},
  year={2022}
}
