
habitat-challenge's Introduction


Habitat Navigation Challenge 2023

This repository contains the starter code for the 2023 Habitat [1] challenge, details of the tasks, and training and evaluation setups. For an overview of habitat-challenge, visit aihabitat.org/challenge.

If you are looking for our 2022/2021/2020/2019 starter code, it’s available in the challenge-YEAR branch.

This year, we are hosting challenges on two embodied navigation tasks: ObjectNav and ImageNav.

Task #1: ObjectNav focuses on egocentric object/scene recognition and a commonsense understanding of object semantics (where is a bed typically located in a house?).

Task #2: ImageNav focuses on visual reasoning and embodied instance disambiguation (is the particular chair I observe the same one depicted by the goal image?).

New in 2023

  • We are instantiating ObjectNav on a new version of the HM3D-Semantics dataset called HM3D-Semantics v0.2.
  • We are announcing the ImageNav track, also on the HM3D-Semantics v0.2 scene dataset.
  • We are introducing several changes to the agent config for easier sim-to-real transfer. We are using the Hello Robot Stretch robot configuration with support for a continuous action space, and updating the dataset so that all episodes can be navigated without traversing between floors.

Task: ObjectNav

In ObjectNav, an agent is initialized at a random starting position and orientation in an unseen environment and asked to find an instance of an object category (‘find a chair’) by navigating to it. A map of the environment is not provided and the agent must only use its sensory input to navigate.

The agent is modeled after the Hello Stretch robot and equipped with an RGB-D camera and a (noiseless) GPS+Compass sensor. The GPS+Compass sensor provides the agent’s current location and orientation relative to the start of the episode.
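
As a rough sketch of what the agent receives at each step, the observation dictionary typically carries entries like the ones below. The key names used here (rgb, depth, gps, compass, objectgoal) are the conventional Habitat-Lab sensor UUIDs and are assumptions on our part; verify them against the challenge task config.

    def summarize_observations(observations):
        # Assumed Habitat-Lab sensor keys; check the actual task config.
        rgb = observations["rgb"]          # (H, W, 3) uint8 egocentric color image
        depth = observations["depth"]      # (H, W, 1) float egocentric depth image
        gps = observations["gps"]          # position relative to the episode start
        compass = observations["compass"]  # heading relative to the episode start (radians)
        goal = observations["objectgoal"]  # id of the target object category
        return rgb.shape, depth.shape, gps, compass, goal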

Dataset

The 2023 ObjectNav challenge uses 216 scenes from the HM3D-Semantics v0.2 [2] dataset with train/val/test splits of 145/36/35 scenes. Following Chaplot et al. [3], we use 6 object goal categories: chair, couch, potted plant, bed, toilet, and tv. All episodes can be navigated without traversing between floors.

Task: ImageNav

In ImageNav, an agent is initialized at a random start pose in an unseen environment and given an RGB goal image. We adopt the Instance ImageNav [4] task definition where the goal image depicts a particular object instance and the agent is asked to navigate to that object.

The goal camera is disentangled from the agent's camera; sampled parameters such as height, look-at-angle, and field-of-view reflect the realistic use case of a user-supplied goal image.

Similar to ObjectNav, the agent is modeled after the Hello Stretch robot and equipped with an RGB-D camera and a (noiseless) GPS+Compass sensor.

Dataset

The 2023 ImageNav challenge uses 216 scenes from the HM3D-Semantics v0.2 [2] dataset with train/val/test splits of 145/36/35 scenes. Following Krantz et al. [4], we sample goal images depicting object instances belonging to the same 6 goal categories used in the ObjectNav challenge: chair, couch, potted plant, bed, toilet, and tv. All episodes can be navigated without traversing between floors.

Action Space

To allow easier sim-to-real transfer of policies from simulation to the Stretch robot, we are changing the agent's action space from discrete to continuous. The agent now accepts the following actions (a sketch of how the normalized values map to physical ranges follows the list):

  1. linear_velocity: Moves the agent forward or backward. Accepts values between [-1,1], scaled according to lin_vel_range defined in the VelocityControlActionConfig.
  2. angular_velocity: Moves the agent left or right. Accepts values between [-1,1], scaled according to ang_vel_range defined in the VelocityControlActionConfig.
  3. camera_pitch_velocity: Tilts the camera up or down. Accepts values between [-1,1], scaled according to ang_vel_range_camera_pitch defined in the VelocityControlActionConfig.
  4. velocity_stop: Action used for ending the episode. Accepts values between [-1,1]. Value greater than 0 ends the episode.
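
The sketch below illustrates how a normalized action in [-1,1] might be mapped onto a physical range. The ranges used here are placeholders, not the actual values in VelocityControlActionConfig; consult the challenge config for the real lin_vel_range and ang_vel_range.

    import numpy as np

    def scale_action(value, value_range):
        """Map a normalized action in [-1, 1] onto the physical range [low, high]."""
        low, high = value_range
        value = float(np.clip(value, -1.0, 1.0))
        return low + (value + 1.0) * (high - low) / 2.0

    # Illustrative ranges only; the real values come from VelocityControlActionConfig.
    lin_vel = scale_action(0.5, (-0.5, 0.5))     # e.g. forward/backward speed in m/s
    ang_vel = scale_action(-1.0, (-30.0, 30.0))  # e.g. turn rate in deg/s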

While the agent accepts actions only in the continuous space, we also provide the following controllers, which let a policy predict actions in a more abstract action space (an illustrative action dictionary follows the list):

  1. Waypoint Controller: The waypoint controller takes in the following inputs and calculates the velocity commands that are passed to the simulator:
    1. xyt_waypoint: Moves the agent to a waypoint (x, y) and turns the agent by t radians. Accepts values between [-1,1], scaled according to waypoint_lin_range and waypoint_ang_range defined in the WaypointControlActionConfig.
    2. max_duration: The number of seconds the waypoint controller should step the simulator before asking the policy for the next waypoint. Accepts values between [0,1], scaled according to wait_duration_range defined in the WaypointControlActionConfig.
    3. delta_camera_pitch_angle: Adjusts the camera pitch by a relative angle. Accepts values between [-1,1], scaled according to ang_vel_range from the WaypointControlActionConfig.
    4. velocity_stop: Action used for ending the episode. Accepts values between [-1,1]. Value greater than 0 ends the episode.
  2. Discrete Waypoint Controller: This controller allows you to try out the policies trained with the discrete action space that was used in the older versions of the navigation tasks in Habitat. The controller accepts one of the following actions:
    1. move_forward_waypoint: Moves the agent forward by 25 centimeters.
    2. turn_left_waypoint: Turns the agent towards the left by 30 degrees.
    3. turn_right_waypoint: Turns the agent towards the right by 30 degrees.
    4. look_up_discrete_to_velocity: Tilts the camera upwards by 30 degrees, while respecting the maximum tilt angle defined by ang_range_camera_pitch in the VelocityControlActionConfig.
    5. look_down_discrete_to_velocity: Tilts the camera downwards by 30 degrees, while respecting the minimum tilt angle defined by ang_range_camera_pitch in the VelocityControlActionConfig.
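
For illustration, an agent using the waypoint controller might return an action dictionary shaped like the one below. This mirrors the velocity_control example in agents/agent.py; the action name "waypoint_control" and the exact argument keys are assumptions here, so verify them against the controller config before relying on them.

    import numpy as np

    def random_waypoint_action():
        # Hypothetical action dictionary for the waypoint controller; the action
        # name and argument keys below are assumptions, not the confirmed API.
        return {
            "action": ("waypoint_control", "velocity_stop"),
            "action_args": {
                "xyt_waypoint": np.random.uniform(-1.0, 1.0, size=3),
                "max_duration": np.random.rand(1),
                "delta_camera_pitch_angle": np.random.uniform(-1.0, 1.0, size=1),
                "velocity_stop": np.array([-1.0]),  # a value > 0 would end the episode
            },
        }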

Evaluation

Similar to the 2022 Habitat Challenge, we measure performance along the same two axes, as specified by Anderson et al. [5]:

  • Success: Did the agent navigate to an instance of the goal object? (Notice: any instance, regardless of distance from starting location.)

    Concretely, an episode is deemed successful if on calling the STOP action, the agent is within 1.0m Euclidean distance from any instance of the target object category AND the object can be viewed by an oracle from that stopping position by turning the agent or looking up/down. Notice: we do NOT require the agent to be actually viewing the object at the stopping location, simply that such oracle-visibility is possible without moving. Why? Because we want participants to focus on navigation, not object framing. In Embodied AI’s larger goal, the agent is navigating to an object instance to interact with it (say point at or manipulate an object). Oracle-visibility is our proxy for ‘the agent is close enough to interact with the object’.

  • SPL: How efficient was the agent’s path compared to an optimal path? (Notice: for ObjectNav, optimal path = shortest path from the agent’s starting position to the closest instance of the target object category.)

After calling the STOP action, the agent is evaluated using the ‘Success weighted by Path Length’ (SPL) metric [5].

ObjectNav-SPL is defined analogous to PointNav-SPL. The only key difference is that the shortest path is computed to the object instance closest to the agent start location. Thus, if an agent spawns very close to ‘chair1’ but stops at a distant ‘chair2’, it will achieve 100% success (because it found a ‘chair’) but a fairly low SPL (because the agent path is much longer compared to the oracle path). ImageNav-SPL is similar to ObjectNav-SPL except that there is exactly one correct object instance (shown in the goal image).
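
For reference, a minimal sketch of the SPL computation from Anderson et al. [5] follows, where l is the shortest-path (geodesic) length used for an episode and p is the length of the path the agent actually took:

    def spl(successes, shortest_path_lengths, agent_path_lengths):
        """Success weighted by Path Length, as defined by Anderson et al. [5]."""
        total = 0.0
        for s, l, p in zip(successes, shortest_path_lengths, agent_path_lengths):
            total += s * (l / max(p, l))  # s is 1 on success, 0 otherwise
        return total / len(successes)

    # Two episodes: a success with a path twice the oracle length, and a failure.
    print(spl([1, 0], [5.0, 7.0], [10.0, 7.0]))  # prints 0.25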

We reserve the right to use additional metrics to choose winners in case of statistically insignificant SPL differences.

Participation Guidelines

Participate in the contest by registering on the EvalAI challenge page and creating a team. Participants will upload docker containers with their agents that are evaluated on an AWS GPU-enabled instance. Before pushing the submissions for remote evaluation, participants should test the submission docker locally to ensure it is working. Instructions for training, local evaluation, and online submission are provided below.

For your convenience, please check our Habitat Challenge video tutorial and the Colab step-by-step tutorial from a previous year.

Local Evaluation

  1. Clone the challenge repository:

    git clone https://github.com/facebookresearch/habitat-challenge.git
    cd habitat-challenge
  2. Implement your own agent or try one of ours. We provide an agent in agents/agent.py that takes random actions:

    import habitat
    import numpy as np
    from omegaconf import DictConfig
    
    class RandomAgent(habitat.Agent):
        def __init__(self, task_config: DictConfig):
            self._task_config = task_config
    
        def reset(self):
            pass
    
        def act(self, observations):
            return {
                'action': ("velocity_control", "velocity_stop"),
                'action_args': {
                    "angular_velocity": np.random.rand(1),
                    "linear_velocity": np.random.rand(1),
                    "camera_pitch_velocity": np.random.rand(1),
                    "velocity_stop": np.random.rand(1),
                }
            }
    
    
    def main():
        # `config` is the challenge task config; the full agents/agent.py script
        # loads it before constructing the agent.
        agent = RandomAgent(task_config=config)
        challenge = habitat.Challenge()
        challenge.submit(agent)
  3. Install nvidia-docker v2 following instructions here: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#docker. Note: only supports Linux; no Windows or MacOS.

  4. Modify the provided Dockerfile (docker/{ObjectNav, ImageNav}_random_baseline.Dockerfile) if you need custom modifications. For example, if your code needs PyTorch, install it with pip inside the conda environment called habitat that ships with our habitat-challenge docker image, as shown below:

    FROM fairembodied/habitat-challenge:habitat_navigation_2023_base_docker
    
    # install dependencies in the habitat conda environment
    RUN /bin/bash -c ". activate habitat; pip install torch"
    
    ADD agents/agent.py /agent.py
    ADD submission.sh /submission.sh

    Build your docker container using: docker build . --file docker/{ObjectNav, ImageNav}_random_baseline.Dockerfile -t {objectnav, imagenav}_submission.

    Note #1: you may need sudo privileges to run this command.

    Note #2: Please make sure that you keep your local version of fairembodied/habitat-challenge:habitat_navigation_2023_base_docker image up to date with the image we have hosted on dockerhub. This can be done by pruning all cached images, using:

    docker system prune -a
    

    [Optional] Modify the submission.sh file if your agent needs any custom modifications (e.g., command-line arguments). Otherwise, there is nothing to do; the default submission.sh simply calls the RandomAgent in agent.py.

  5. Scene Dataset: Download Habitat-Matterport3D Dataset scenes used for Habitat Challenge here. Place this data in: habitat-challenge/habitat-challenge-data/data/scene_datasets/hm3d_v0.2

    Using Symlinks: If you used symlinks (i.e. ln -s) to link to an existing download of HM3D, there is an additional step. First, make sure there is only one level of symlink (instead of a symlink to a symlink to a ... symlink) with

    ln -f -s $(realpath habitat-challenge-data/data/scene_datasets/hm3d_v0.2) \
        habitat-challenge-data/data/scene_datasets/hm3d_v0.2

    Then modify the docker command in the scripts/test_local_{objectnav, imagenav}.sh file to mount the linked-to location by adding -v $(realpath habitat-challenge-data/data/scene_datasets/hm3d_v0.2):/habitat-challenge-data/data/scene_datasets/hm3d_v0.2. The modified docker command would be

    # ObjectNav
    docker run \
         -v $(pwd)/habitat-challenge-data:/habitat-challenge-data \
         -v $(realpath habitat-challenge-data/data/scene_datasets/hm3d_v0.2):/habitat-challenge-data/data/scene_datasets/hm3d_v0.2 \
         --runtime=nvidia \
         -e "AGENT_EVALUATION_TYPE=local" \
         -e "TRACK_CONFIG_FILE=/configs/benchmark/nav/objectnav/objectnav_v2_hm3d_stretch_challenge.yaml" \
         ${DOCKER_NAME}
    
    # ImageNav
    docker run \
         -v $(pwd)/habitat-challenge-data:/habitat-challenge-data \
         -v $(realpath habitat-challenge-data/data/scene_datasets/hm3d_v0.2):/habitat-challenge-data/data/scene_datasets/hm3d_v0.2 \
         --runtime=nvidia \
         -e "AGENT_EVALUATION_TYPE=local" \
         -e "TRACK_CONFIG_FILE=/configs/benchmark/nav/imagenav/imagenav_hm3d_v3_challenge.yaml" \
         ${DOCKER_NAME}
  6. Evaluate your docker container locally:

    # Testing ObjectNav
    ./scripts/test_local_objectnav.sh --docker-name objectnav_submission
    
    # Testing ImageNav
    ./scripts/test_local_imagenav.sh --docker-name imagenav_submission

    If the above command runs successfully you will get an output similar to:

    2023-03-01 16:35:02,244 distance_to_goal: 6.446822468439738
    2023-03-01 16:35:02,244 success: 0.0
    2023-03-01 16:35:02,244 spl: 0.0
    2023-03-01 16:35:02,244 soft_spl: 0.0014486297806195665
    2023-03-01 16:35:02,244 num_steps: 1.0
    2023-03-01 16:35:02,244 collisions/count: 0.0
    2023-03-01 16:35:02,244 collisions/is_collision: 0.0
    2023-03-01 16:35:02,244 distance_to_goal_reward: 0.0009365876515706381
    

    Note: this same command will be run to evaluate your agent for the leaderboard. Please submit your docker for remote evaluation (below) only if it runs successfully on your local setup.

  7. If you want to try out one of the controllers we provide, change the "--action_space" argument in the Dockerfile (docker/{ObjectNav, ImageNav}_random_baseline.Dockerfile) to use either waypoint_controller or discrete_waypoint_controller.

Online submission

Follow the instructions in the submit tab of the EvalAI challenge page to submit your docker image. Note that you will need a version of EvalAI >= 1.3.5. Pasting those instructions here for convenience:

# Installing EvalAI Command Line Interface
pip install "evalai>=1.3.5"

# Set EvalAI account token
evalai set_token <your EvalAI participant token>

# Push docker image to EvalAI docker registry
# ObjectNav
evalai push objectnav_submission:latest --phase <phase-name>

# ImageNav
evalai push imagenav_submission:latest --phase <phase-name>

The challenge consists of the following phases:

  1. Minival phase: This split is the same as the one used in ./scripts/test_local_{objectnav, imagenav}.sh. The purpose of this phase/split is sanity checking -- to confirm that our remote evaluation reports the same result as the one you’re seeing locally. Each team is allowed a maximum of 100 submissions per day for this phase, but please use them judiciously. We will block and disqualify teams that spam our servers.
  2. Test Standard phase: The purpose of this phase/split is to serve as the public leaderboard establishing the state of the art; this is what should be used to report results in papers. Each team is allowed a maximum of 10 submissions per day for this phase, but again, please use them judiciously. Don’t overfit to the test set.
  3. Test Challenge phase: This phase/split will be used to decide challenge winners. Each team is allowed a total of 5 submissions until the end of challenge submission phase. The highest performing of these 5 will be automatically chosen. Results on this split will not be made public until the announcement of final results at the Embodied AI workshop at CVPR.

Note: Your agent will be evaluated on 1000 episodes and will have a total available time of 48 hours to finish. Your submissions will be evaluated on an AWS EC2 p2.xlarge instance, which has a Tesla K80 GPU (12 GB memory), 4 CPU cores, and 61 GB RAM. If you need more time/resources for the evaluation of your submission, please get in touch. If you face any issues or have questions, you can ask them by opening an issue on this repository.

ObjectNav/ImageNav Baselines and DD-PPO Training Starter Code

We have added configs (configs/ddppo_objectnav_v2_hm3d_stretch.yaml and configs/ddppo_imagenav_v3_hm3d_stretch.yaml) that include a baseline using DD-PPO from Habitat-Lab.

  1. Install the Habitat-Sim and Habitat-Lab packages. You can install Habitat-Sim using our custom Conda package for habitat challenge 2023 with: conda install -c aihabitat habitat-sim-challenge-2023. For Habitat-Lab, we have created the habitat-challenge-2023 tag in our Github repo, which can be cloned using: git clone --branch challenge-2023 https://github.com/facebookresearch/habitat-lab.git. Please ensure that both habitat-lab and habitat-baselines packages are installed using pip install -e habitat-lab and pip install -e habitat-baselines. You will find further information for installation in the Github repositories.
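
    For convenience, the installation commands above can be collected into one shell session. This sketch assumes the habitat-lab and habitat-baselines sub-packages live inside the cloned habitat-lab repository:

    # Habitat-Sim: custom conda package for the 2023 challenge
    conda install -c aihabitat habitat-sim-challenge-2023

    # Habitat-Lab and Habitat-Baselines at the challenge-2023 tag
    git clone --branch challenge-2023 https://github.com/facebookresearch/habitat-lab.git
    cd habitat-lab
    pip install -e habitat-lab
    pip install -e habitat-baselines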

  2. Download the HM3D scene dataset following the instructions here. After downloading, extract the dataset to the habitat-lab/data/scene_datasets/hm3d_v0.2/ folder (this folder should contain the .glb files from HM3D). Note that the habitat-lab folder is the habitat-lab repository folder. You could also just symlink to the path of the HM3D scenes downloaded in the Scene Dataset step of Local Evaluation under the habitat-challenge/habitat-challenge-data/data/scene_datasets folder. This can be done using ln -s /path/to/habitat-challenge-data/data/scene_datasets /path/to/habitat-lab/data/scene_datasets/ (on macOS or Linux).

  3. ObjectNav: Download the episodes dataset for HM3D ObjectNav from link and place it in the folder habitat-challenge/habitat-challenge-data/data/datasets/objectnav/hm3d. If placed correctly, you should have the train and val splits at habitat-challenge/habitat-challenge-data/data/datasets/objectnav/hm3d/v2/train/ and habitat-challenge/habitat-challenge-data/data/datasets/objectnav/hm3d/v2/val/ respectively.

    ImageNav: Download the episodes dataset for HM3D InstanceImageNav from link and place it in the folder habitat-challenge/habitat-challenge-data/data/datasets/instance_imagenav/hm3d. If placed correctly, you should have the train and val splits at habitat-challenge/habitat-challenge-data/data/datasets/instance_imagenav/hm3d/v3/train/ and habitat-challenge/habitat-challenge-data/data/datasets/instance_imagenav/hm3d/v3/val/ respectively.

  4. An example of how to train a DD-PPO model can be found in habitat-lab/habitat-baselines/habitat_baselines/rl/ddppo. See the corresponding README in habitat-lab for how to adjust the various hyperparameters, save locations, visual encoders, and other features.

    1. To run on a single machine use the script single_node.sh from the habitat-lab directory, where $task={objectnav_v2, imagenav_v3}:
      #!/bin/bash
      
      export GLOG_minloglevel=2
      export MAGNUM_LOG=quiet
      
      set -x
      
      python -u -m torch.distributed.launch \
          --use_env \
          --nproc_per_node 1 \
          habitat_baselines/run.py \
          --config-name=configs/ddppo_${task}_hm3d_stretch.yaml
    2. There is also an example script named multi_node_slurm.sh for running the code in distributed mode on a cluster with SLURM. While this is not necessary, if you have access to a cluster, it can significantly speed up training. To run on multiple machines in a SLURM cluster, run the following script after changing #SBATCH --nodes $NUM_OF_MACHINES to the number of machines, and #SBATCH --ntasks-per-node $NUM_OF_GPUS and #SBATCH --gpus $NUM_OF_GPUS to the number of GPUs to use per requested machine.
      #!/bin/bash
      #SBATCH --job-name=ddppo
      #SBATCH --output=logs.ddppo.out
      #SBATCH --error=logs.ddppo.err
      #SBATCH --gpus 1
      #SBATCH --nodes 1
      #SBATCH --cpus-per-task 10
      #SBATCH --ntasks-per-node 1
      #SBATCH --mem=60GB
      #SBATCH --time=72:00:00
      #SBATCH --signal=USR1@90
      #SBATCH --requeue
      #SBATCH --partition=dev
      
      export GLOG_minloglevel=2
      export MAGNUM_LOG=quiet
      
      MAIN_ADDR=$(scontrol show hostnames "${SLURM_JOB_NODELIST}" | head -n 1)
      export MAIN_ADDR
      
      set -x
      srun python -u -m habitat_baselines.run \
          --config-name=configs/ddppo_${task}_hm3d_stretch.yaml
  5. The checkpoint specified by $PATH_TO_CHECKPOINT can be evaluated on SPL and the other metrics by running the following command:

    python -u -m habitat_baselines.run \
        --config-name=configs/ddppo_${task}_hm3d_stretch.yaml \
        habitat_baselines.evaluate=True \
        habitat_baselines.eval_ckpt_path_dir=$PATH_TO_CHECKPOINT \
        habitat.dataset.data_path.split=val

    The weights used for our DD-PPO ObjectNav and ImageNav baselines for the Habitat 2023 challenge can be downloaded with the following command:

    wget https://dl.fbaipublicfiles.com/habitat/data/baselines/v1/{task}_baseline_habitat_navigation_challenge_2023.pth

    where $task={objectnav, imagenav}.

  6. To submit your entry via EvalAI, you will need to build a docker file. We provide Dockerfiles ready to use with the DD-PPO baselines in docker/{ObjectNav, ImageNav}_ddppo_baseline.Dockerfile. For the sake of completeness, we describe how you can make your own Dockerfile below. If you just want to test the baseline code, feel free to skip this bullet because ObjectNav_ddppo_baseline.Dockerfile is ready to use.

    1. You may want to modify the {ObjectNav, ImageNav}_ddppo_baseline.Dockerfile to include PyTorch or other libraries. To install pytorch, ifcfg and tensorboard, add the following command to the Docker file:

      RUN /bin/bash -c ". activate habitat; pip install ifcfg torch tensorboard"
    2. To change which agent.py and which submission.sh script are used in the Docker image, modify the following lines, replacing the first agent.py or submission.sh with your new files:

      ADD agents/agent.py agent.py
      ADD submission.sh submission.sh
    3. Do not forget to add any other files you may need in the Docker image; for example, we add the demo.ckpt.pth file, which contains the saved weights from the DD-PPO example code.

    4. Finally, modify the submission.sh script to run the appropriate command to test your agents. The scaffold for this code can be found in agent.py and the DD-PPO specific agent can be found in habitat_baselines_agents.py. In this example, we only modify the final command of the ObjectNav/ImageNav docker by adding the following args to submission.sh: --model-path demo.ckpt.pth --input-type rgbd. The default submission.sh script will pass these args to the python script. You may also replace the submission.sh entirely; a rough sketch follows.
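
      As a rough, non-authoritative sketch, a modified submission.sh could look like the following. It assumes the agent entry point accepts an --evaluation flag and that AGENT_EVALUATION_TYPE is set by the evaluation harness; the script actually shipped with the starter code may differ.

      #!/usr/bin/env bash
      # Hypothetical submission.sh; adapt to the entry point your Dockerfile ADDs.
      python agent.py \
          --evaluation $AGENT_EVALUATION_TYPE \
          --model-path demo.ckpt.pth \
          --input-type rgbd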

  7. Once your Dockerfile and other code are modified to your satisfaction, build the image with the following command.

    docker build . --file docker/{ObjectNav, ImageNav}_ddppo_baseline.Dockerfile -t {objectnav, imagenav}_submission
  8. To test locally, simply run the scripts/test_local_{objectnav, imagenav}.sh script. If the docker image runs your code without errors, it should work on EvalAI. The instructions for submitting the Docker to EvalAI are listed above.

  9. Happy hacking!

Citing Habitat Challenge 2023

Please cite the following bibtex when referring to the 2023 Navigation challenge:

@misc{habitatchallenge2023,
  title         =     {Habitat Challenge 2023},
  author        =     {Karmesh Yadav and Jacob Krantz and Ram Ramrakhya and Santhosh Kumar Ramakrishnan and Jimmy Yang and Austin Wang and John Turner and Aaron Gokaslan and Vincent-Pierre Berges and Roozbeh Mottaghi and Oleksandr Maksymets and Angel X Chang and Manolis Savva and Alexander Clegg and Devendra Singh Chaplot and Dhruv Batra},
  howpublished  =     {\url{https://aihabitat.org/challenge/2023/}},
  year          =     {2023}
}

Acknowledgments

The Habitat challenge would not have been possible without the infrastructure and support of the EvalAI team. We also thank the teams behind the Habitat-Matterport3D and HM3D-Semantics datasets.

References

[1] Habitat: A Platform for Embodied AI Research. Manolis Savva*, Abhishek Kadian*, Oleksandr Maksymets*, Yili Zhao, Erik Wijmans, Bhavana Jain, Julian Straub, Jia Liu, Vladlen Koltun, Jitendra Malik, Devi Parikh, Dhruv Batra. IEEE/CVF International Conference on Computer Vision (ICCV), 2019.

[2] Habitat-Matterport 3D Semantics Dataset (HM3DSem). Karmesh Yadav*, Ram Ramrakhya*, Santhosh Kumar Ramakrishnan*, Theo Gervet, John Turner, Aaron Gokaslan, Noah Maestre, Angel Xuan Chang, Dhruv Batra, Manolis Savva, Alexander William Clegg^, Devendra Singh Chaplot^. arXiv:2210.05633, 2022.

[3] Object Goal Navigation using Goal-Oriented Semantic Exploration. Devendra Singh Chaplot, Dhiraj Gandhi, Abhinav Gupta, Ruslan Salakhutdinov. NeurIPS, 2020.

[4] Instance-Specific Image Goal Navigation: Training Embodied Agents to Find Object Instances. Jacob Krantz, Stefan Lee, Jitendra Malik, Dhruv Batra, Devendra Singh Chaplot. arXiv:2211.15876, 2022.

[5] On evaluation of embodied navigation agents. Peter Anderson, Angel Chang, Devendra Singh Chaplot, Alexey Dosovitskiy, Saurabh Gupta, Vladlen Koltun, Jana Kosecka, Jitendra Malik, Roozbeh Mottaghi, Manolis Savva, Amir R. Zamir. arXiv:1807.06757, 2018.


habitat-challenge's Issues

Evalai submissions are not scheduled

EvalAI submissions don't seem to be scheduled anymore. I have submissions to pointnav test-std and minival that have remained in "submitted" status for over a week now.

Missing cuda in docker for submission

My code requires this library for running: https://github.com/rusty1s/pytorch_scatter . When I try installing it, it throws out the following error:

  No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'
  running bdist_wheel
  running build
  running build_py
  creating build
  creating build/lib.linux-x86_64-3.6
  creating build/lib.linux-x86_64-3.6/torch_scatter
  copying torch_scatter/sub.py -> build/lib.linux-x86_64-3.6/torch_scatter
  copying torch_scatter/add.py -> build/lib.linux-x86_64-3.6/torch_scatter
  copying torch_scatter/div.py -> build/lib.linux-x86_64-3.6/torch_scatter
  copying torch_scatter/mul.py -> build/lib.linux-x86_64-3.6/torch_scatter
  copying torch_scatter/std.py -> build/lib.linux-x86_64-3.6/torch_scatter
  copying torch_scatter/__init__.py -> build/lib.linux-x86_64-3.6/torch_scatter
  copying torch_scatter/mean.py -> build/lib.linux-x86_64-3.6/torch_scatter
  copying torch_scatter/logsumexp.py -> build/lib.linux-x86_64-3.6/torch_scatter
  copying torch_scatter/min.py -> build/lib.linux-x86_64-3.6/torch_scatter
  copying torch_scatter/max.py -> build/lib.linux-x86_64-3.6/torch_scatter
  creating build/lib.linux-x86_64-3.6/test
  copying test/test_backward.py -> build/lib.linux-x86_64-3.6/test
  copying test/__init__.py -> build/lib.linux-x86_64-3.6/test
  copying test/test_logsumexp.py -> build/lib.linux-x86_64-3.6/test
  copying test/test_max_min.py -> build/lib.linux-x86_64-3.6/test
  copying test/utils.py -> build/lib.linux-x86_64-3.6/test
  copying test/test_std.py -> build/lib.linux-x86_64-3.6/test
  copying test/test_multi_gpu.py -> build/lib.linux-x86_64-3.6/test
  copying test/test_forward.py -> build/lib.linux-x86_64-3.6/test
  copying test/test_broadcasting.py -> build/lib.linux-x86_64-3.6/test
  creating build/lib.linux-x86_64-3.6/torch_scatter/composite
  copying torch_scatter/composite/__init__.py -> build/lib.linux-x86_64-3.6/torch_scatter/composite
  copying torch_scatter/composite/softmax.py -> build/lib.linux-x86_64-3.6/torch_scatter/composite
  creating build/lib.linux-x86_64-3.6/torch_scatter/utils
  copying torch_scatter/utils/ext.py -> build/lib.linux-x86_64-3.6/torch_scatter/utils
  copying torch_scatter/utils/__init__.py -> build/lib.linux-x86_64-3.6/torch_scatter/utils
  copying torch_scatter/utils/gen.py -> build/lib.linux-x86_64-3.6/torch_scatter/utils
  running build_ext
  building 'torch_scatter.scatter_cpu' extension
  creating build/temp.linux-x86_64-3.6
  creating build/temp.linux-x86_64-3.6/cpu
  gcc -pthread -B /opt/conda/envs/habitat/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/opt/conda/envs/habitat/lib/python3.6/site-packages/torch/include -I/opt/conda/envs/habitat/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/envs/habitat/lib/python3.6/site-packages/torch/include/TH -I/opt/conda/envs/habitat/lib/python3.6/site-packages/torch/include/THC -I/opt/conda/envs/habitat/include/python3.6m -c cpu/scatter.cpp -o build/temp.linux-x86_64-3.6/cpu/scatter.o -Wno-unused-variable -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=scatter_cpu -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11
  cc1plus: warning: command line option '-Wstrict-prototypes' is valid for C/ObjC but not for C++
  g++ -pthread -shared -B /opt/conda/envs/habitat/compiler_compat -L/opt/conda/envs/habitat/lib -Wl,-rpath=/opt/conda/envs/habitat/lib -Wl,--no-as-needed -Wl,--sysroot=/ build/temp.linux-x86_64-3.6/cpu/scatter.o -o build/lib.linux-x86_64-3.6/torch_scatter/scatter_cpu.cpython-36m-x86_64-linux-gnu.so
  building 'torch_scatter.scatter_cuda' extension
  creating build/temp.linux-x86_64-3.6/cuda
  gcc -pthread -B /opt/conda/envs/habitat/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/opt/conda/envs/habitat/lib/python3.6/site-packages/torch/include -I/opt/conda/envs/habitat/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/envs/habitat/lib/python3.6/site-packages/torch/include/TH -I/opt/conda/envs/habitat/lib/python3.6/site-packages/torch/include/THC -I/usr/local/cuda/include -I/opt/conda/envs/habitat/include/python3.6m -c cuda/scatter.cpp -o build/temp.linux-x86_64-3.6/cuda/scatter.o -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=scatter_cuda -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11
  cc1plus: warning: command line option '-Wstrict-prototypes' is valid for C/ObjC but not for C++
  /usr/local/cuda/bin/nvcc -I/opt/conda/envs/habitat/lib/python3.6/site-packages/torch/include -I/opt/conda/envs/habitat/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -I/opt/conda/envs/habitat/lib/python3.6/site-packages/torch/include/TH -I/opt/conda/envs/habitat/lib/python3.6/site-packages/torch/include/THC -I/usr/local/cuda/include -I/opt/conda/envs/habitat/include/python3.6m -c cuda/scatter_kernel.cu -o build/temp.linux-x86_64-3.6/cuda/scatter_kernel.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options '-fPIC' -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=scatter_cuda -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11
  unable to execute '/usr/local/cuda/bin/nvcc': No such file or directory
  error: command '/usr/local/cuda/bin/nvcc' failed with exit status 1
  ----------------------------------------
  ERROR: Failed building wheel for torch-scatter

Looks like nvcc is missing. I tried manually looking at the base docker. I couldn't find nvcc. Should I manually install it in that case?

Category list of Task 2

Hi,

As the description said, there is a subset of the original MP3D category list that only contains 21 items. I was wondering if it is publicly accessible? If it is, where can we find it?

Thanks!

Option to terminate running submission

Hi,

Is there an option to terminate a submission currently being executed?
If not, could you manually terminate all my 3 submissions in the test-std track under user name "karkus" or increase the max number of parallel submissions?

I believe the scripts are hanging because of an extra process I launch, and won't exit unless the container is terminated.

thanks!

Submission failed without any trace

I have just tried to push my submission; however, soon after (411.10 sec) I got "Submission Status: failed". It seems that the log files (stderr and stdout, as well as results) are empty. Is there anything I can do to get more information regarding this failure to debug my submission?

And yes, I tested my container with ./test_locally_{}.sh
To: habitat19-rgbd-minival

(duplicate from eval forum)

Failed to build DD-PPO docker

Just wanted to make initial submission with DD-PPO file without own training.

After running command:
docker build . --file Pointnav_DDPPO_Baseline.Dockerfile -t pointnav_submission
get an error:
ADD failed: stat /var/lib/docker/tmp/docker-builder885297584/demo.ckpt.pth: no such file or directory

Where can I find the demo.ckpt.pth file for an initial submission?

Inconsistent Scores Between Local and Remote minival for ObjectNav

I followed the submission criteria and made a docker submission for minival. Locally, it consistently reaches a given distance to goal, but remotely, it receives a different, higher distance to goal.
[screenshots: local vs. remote minival results]
(where the 6.63 is distance_to_goal)
Is this noise? I'm surprised, as I received the same results on different machines, and also across different runs of the same container (as shown in the first screenshot).

Testing 2020 starter code

Several problems.

Install nvidia-docker. Note: only supports Linux; no Windows or MacOS.

Do we need nvidia-docker v1 or v2?

./test_locally_objectnav_rgbd.sh --docker-name my_submission

Both test_locally_*.sh files were not executable (needed chmod a+x).

  1. Docker commands failed -- I suspect because of incorrect naming in submissions.sh

(base) dbatra@ubuntu-bionic-1:~/habitat-challenge$ ./test_locally_objectnav_rgbd.sh --docker-name my_submission
docker: Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Post http://%2Fvar%2Frun%2Fdocker.sock/v1.40/containers/create: dial unix /var/run/docker.sock: connect: permission denied.
See 'docker run --help'.

(base) dbatra@ubuntu-bionic-1:~/habitat-challenge$ sudo ./test_locally_objectnav_rgbd.sh --docker-name my_submission
docker: Error response from daemon: Unknown runtime specified nvidia.
See 'docker run --help'.

No such file or directory: 'habitat-challenge-data/objectgoal_mp3d/val_mini/val_mini.json.gz'

Hi. I am trying to run the test_locally_objectnav_rgbd.sh script with the provided RandomAgent. I am following the Local Evaluation instruction:

  1. Cloned the master branch
  2. Built the objectnav_submission image using
docker build . --file Objectnav.Dockerfile -t objectnav_submission
  3. Nvidia-docker installed and working
  4. No modifications to the Dockerfile. Using all defaults.
  5. This is my directory structure:
tree habitat-challenge-data
habitat-challenge-data
├── data
│   ├── datasets
│   │   └── objectnav
│   │       └── mp3d
│   │           └── v1
│   │               ├── objectnav_mp3d_v1.zip
│   │               ├── train
│   │               │   ├── content
│   │               │   │   ├── 17DRP5sb8fy.json.gz
│   │               │   │   ├── 1LXtFkjw3qL.json.gz
│   │               │   │   ├── 1pXnuDYAj8r.json.gz
│   │               │   │   ├── 29hnd4uzFmX.json.gz
│   │               │   │   ├── 5LpN3gDmAk7.json.gz
│   │               │   │   ├── 5q7pvUzZiYa.json.gz
│   │               │   │   ├── 759xd9YjKW5.json.gz
│   │               │   │   ├── 7y3sRwLe3Va.json.gz
│   │               │   │   ├── 82sE5b5pLXE.json.gz
│   │               │   │   ├── 8WUmhLawc2A.json.gz
│   │               │   │   ├── aayBHfsNo7d.json.gz
│   │               │   │   ├── ac26ZMwG7aT.json.gz
│   │               │   │   ├── B6ByNegPMKs.json.gz
│   │               │   │   ├── b8cTxDM8gDG.json.gz
│   │               │   │   ├── cV4RVeZvu5T.json.gz
│   │               │   │   ├── D7G3Y4RVNrH.json.gz
│   │               │   │   ├── D7N2EKCX4Sj.json.gz
│   │               │   │   ├── dhjEzFoUFzH.json.gz
│   │               │   │   ├── E9uDoFAP3SH.json.gz
│   │               │   │   ├── e9zR4mvMWw7.json.gz
│   │               │   │   ├── EDJbREhghzL.json.gz
│   │               │   │   ├── GdvgFV5R1Z5.json.gz
│   │               │   │   ├── gZ6f7yhEvPG.json.gz
│   │               │   │   ├── HxpKQynjfin.json.gz
│   │               │   │   ├── i5noydFURQK.json.gz
│   │               │   │   ├── JeFG25nYj2p.json.gz
│   │               │   │   ├── JF19kD82Mey.json.gz
│   │               │   │   ├── jh4fc5c5qoQ.json.gz
│   │               │   │   ├── kEZ7cmS4wCh.json.gz
│   │               │   │   ├── mJXqzFtmKg4.json.gz
│   │               │   │   ├── p5wJjkQkbXX.json.gz
│   │               │   │   ├── Pm6F8kyY3z2.json.gz
│   │               │   │   ├── pRbA3pwrgk9.json.gz
│   │               │   │   ├── PuKPg4mmafe.json.gz
│   │               │   │   ├── PX4nDJXEHrG.json.gz
│   │               │   │   ├── qoiz87JEwZ2.json.gz
│   │               │   │   ├── r1Q1Z4BcV1o.json.gz
│   │               │   │   ├── r47D5H71a5s.json.gz
│   │               │   │   ├── rPc6DW4iMge.json.gz
│   │               │   │   ├── s8pcmisQ38h.json.gz
│   │               │   │   ├── S9hNv5qa7GM.json.gz
│   │               │   │   ├── sKLMLpTHeUy.json.gz
│   │               │   │   ├── sT4fr6TAbpF.json.gz
│   │               │   │   ├── ULsKaCPVFJR.json.gz
│   │               │   │   ├── uNb9QFRL6hY.json.gz
│   │               │   │   ├── ur6pFq6Qu1A.json.gz
│   │               │   │   ├── Uxmj2M2itWa.json.gz
│   │               │   │   ├── V2XKFyX4ASd.json.gz
│   │               │   │   ├── VFuaQ6m2Qom.json.gz
│   │               │   │   ├── VLzqgDo317F.json.gz
│   │               │   │   ├── VVfe2KiqLaN.json.gz
│   │               │   │   ├── Vvot9Ly1tCj.json.gz
│   │               │   │   ├── vyrNrziPKCB.json.gz
│   │               │   │   ├── XcA2TqTSSAj.json.gz
│   │               │   │   ├── YmJkqBEsHnH.json.gz
│   │               │   │   └── ZMojNkEp431.json.gz
│   │               │   └── train.json.gz
│   │               ├── val
│   │               │   ├── content
│   │               │   │   ├── 2azQ1b91cZZ.json.gz
│   │               │   │   ├── 8194nk5LbLH.json.gz
│   │               │   │   ├── EU6Fwq7SyZv.json.gz
│   │               │   │   ├── oLBMNvg9in8.json.gz
│   │               │   │   ├── pLe4wQe7qrG.json.gz
│   │               │   │   ├── QUCTc6BB5sX.json.gz
│   │               │   │   ├── TbHJrupSAjP.json.gz
│   │               │   │   ├── X7HyMhZNoso.json.gz
│   │               │   │   ├── x8F5xyUWy9e.json.gz
│   │               │   │   ├── Z6MFQCViBuw.json.gz
│   │               │   │   └── zsNo4HB9uLZ.json.gz
│   │               │   └── val.json.gz
│   │               └── val_mini
│   │                   └── val_mini.json.gz
│   └── scene_datasets
│       └── mp3d
│           ├── download_mp.py
│           └── v1
│               └── scans
│                   └── 17DRP5sb8fy
│                       ├── cameras.zip
│                       ├── matterport_camera_intrinsics.zip
│                       ├── matterport_camera_poses.zip
│                       ├── matterport_color_images.zip
│                       └── tmp0PJfol
├── objectgoal_mp3d
│   └── val_mini
│       └── val_mini.json.gz
└── pointgoal_gibson_v2
    └── val_mini
        ├── content
        │   └── Pablo.json.gz
        └── val_mini.json.gz

  6. Running the evaluation script:
❯ ./test_locally_objectnav_rgbd.sh --docker-name objectnav_submission

2021-03-19 13:32:25,265 Initializing dataset ObjectNav-v1
Traceback (most recent call last):
  File "agent.py", line 41, in <module>
    main()
  File "agent.py", line 33, in main
    challenge = habitat.Challenge(eval_remote=False)
  File "/habitat-lab/habitat/core/challenge.py", line 16, in __init__
    super().__init__(config_paths, eval_remote=eval_remote)
  File "/habitat-lab/habitat/core/benchmark.py", line 38, in __init__
    self._env = Env(config=config_env)
  File "/habitat-lab/habitat/core/env.py", line 79, in __init__
    id_dataset=config.DATASET.TYPE, config=config.DATASET
  File "/habitat-lab/habitat/datasets/registration.py", line 20, in make_dataset
    return _dataset(**kwargs)  # type: ignore
  File "/habitat-lab/habitat/datasets/object_nav/object_nav_dataset.py", line 74, in __init__
    super().__init__(config)
  File "/habitat-lab/habitat/datasets/pointnav/pointnav_dataset.py", line 93, in __init__
    with gzip.open(datasetfile_path, "rt") as f:
  File "/opt/conda/envs/habitat/lib/python3.6/gzip.py", line 53, in open
    binary_file = GzipFile(filename, gz_mode, compresslevel)
  File "/opt/conda/envs/habitat/lib/python3.6/gzip.py", line 163, in __init__
    fileobj = self.myfileobj = builtins.open(filename, mode or 'rb')
FileNotFoundError: [Errno 2] No such file or directory: 'habitat-challenge-data/objectnav_mp3d/val_mini/val_mini.json.gz'

Thanks for your help with this!

Intermittent Failing/Finished Status of Identical EvalAI Submissions

I submitted to ObjectNav Test-Challenge. On EvalAI, the submission turned up as Cancelled almost immediately after submitting. The submission appeared again (about 10 min later) as a second submission on the site and was Running for 229350 sec. This job ultimately ended with the status Failed with blank result, std out and std err files.

I submitted exactly the same container to the minival phase 1 time and the Test-Std phase 2 times. The minival submission completed successfully with results. The first Test-Std submission ended with a Failed status and blank result, std out and std err files. The second Test-Std submission completed successfully with results.

  1. Do you have any ideas why these submissions could be intermittently Failing?
  2. Is there a way I can try to get my submission to the Test-Challenge phase running without using my 6 total submissions since I have successfully run this submission on both of the other phases?

Clarifications about the challenge submission

Hi,

I have submitted a model to the test-challenge track.

  • The status is "Submitted". What exactly does that mean?
  • How do we know that the test challenge evaluation succeeded, and that things went as expected?
  • Given that we have up to 5 submissions, how will the best results be selected? Is it automatically based on the maximum SPL score, or do we have to somehow choose the final model?

Note: I am seeing the same "Submitted" status for the minival and test standard submissions. I am a bit concerned because I submitted one of the prior docker submissions using the aws link. I want to make sure this works as intended.

Socket timeout issue while doing evalai push

I ran the following command for evalai push
evalai push gp_nav_rgbd:latest --phase habitat19-rgbd-minival

and got the following error:

Traceback (most recent call last):
  File "/home/dchaplot/anaconda3/lib/python3.6/site-packages/urllib3/response.py", line 302, in _error_catcher
    yield
  File "/home/dchaplot/anaconda3/lib/python3.6/site-packages/urllib3/response.py", line 384, in read
    data = self._fp.read(amt)
  File "/home/dchaplot/anaconda3/lib/python3.6/http/client.py", line 449, in read
    n = self.readinto(b)
  File "/home/dchaplot/anaconda3/lib/python3.6/http/client.py", line 483, in readinto
    return self._readinto_chunked(b)
  File "/home/dchaplot/anaconda3/lib/python3.6/http/client.py", line 578, in _readinto_chunked
    chunk_left = self._get_chunk_left()
  File "/home/dchaplot/anaconda3/lib/python3.6/http/client.py", line 546, in _get_chunk_left
    chunk_left = self._read_next_chunk_size()
  File "/home/dchaplot/anaconda3/lib/python3.6/http/client.py", line 506, in _read_next_chunk_size
    line = self.fp.readline(_MAXLINE + 1)
  File "/home/dchaplot/anaconda3/lib/python3.6/socket.py", line 586, in readinto
    return self._sock.recv_into(b)
socket.timeout: timed out

My docker image is 7.67GB.

What might be the issue here?

PPO model training with habitat 2020 challenge config

@mathfac @dhruvbatra Hi!
Another issue on my side that I am struggling with is training the Habitat baseline PPO model with the 2020 challenge configuration for the PointNav task using habitat-api.

As the PPO agent configuration I use the following ppo_pointnav.yaml file:

BASE_TASK_CONFIG_PATH: "configs/tasks/pointnav_gib_rgbd_2020.yaml"
TRAINER_NAME: "ppo"
ENV_NAME: "NavRLEnv"
SIMULATOR_GPU_ID: 1
TORCH_GPU_ID: 1

VIDEO_OPTION: ["disk", "tensorboard"]
TENSORBOARD_DIR: "tb"
VIDEO_DIR: "video_dir"
TEST_EPISODE_COUNT: 994
EVAL_CKPT_PATH_DIR: "data/ppo_2020_checkpoints"

NUM_PROCESSES: 4

SENSORS: ["DEPTH_SENSOR"]
CHECKPOINT_FOLDER: "data/ppo_2020_checkpoints"
NUM_UPDATES: 270000
LOG_INTERVAL: 25
CHECKPOINT_INTERVAL: 2000

RL:
  PPO:
    clip_param: 0.1
    ppo_epoch: 4
    num_mini_batch: 2
    value_loss_coef: 0.5
    entropy_coef: 0.01
    lr: 2.5e-4
    eps: 1e-5
    max_grad_norm: 0.5
    num_steps: 128
    hidden_size: 512
    use_gae: True
    gamma: 0.99
    tau: 0.95
    use_linear_clip_decay: True
    use_linear_lr_decay: True
    reward_window_size: 50

And for task configuration I used the same parameters as in challenge_pointnav2020.local.rgbd.yaml file:

ENVIRONMENT:
  MAX_EPISODE_STEPS: 500
SIMULATOR:
  AGENT_0:
    SENSORS: ['RGB_SENSOR', 'DEPTH_SENSOR']
    HEIGHT: 0.88
    RADIUS: 0.18
  HABITAT_SIM_V0:
    GPU_DEVICE_ID: 0
    ALLOW_SLIDING: False
  RGB_SENSOR:
    WIDTH: 640
    HEIGHT: 360
    HFOV: 70
    POSITION: [0, 0.88, 0]
    NOISE_MODEL: "GaussianNoiseModel"
    NOISE_MODEL_KWARGS:
      intensity_constant: 0.1

  DEPTH_SENSOR:
    WIDTH: 640
    HEIGHT: 360
    HFOV: 70
    MIN_DEPTH: 0.1
    MAX_DEPTH: 10.0
    POSITION: [0, 0.88, 0]
    NOISE_MODEL: "RedwoodDepthNoiseModel"

  ACTION_SPACE_CONFIG: 'pyrobotnoisy'
  NOISE_MODEL:
    ROBOT: "LoCoBot"
    CONTROLLER: 'Proportional'
    NOISE_MULTIPLIER: 0.5

TASK:
  TYPE: Nav-v0
  SUCCESS_DISTANCE: 0.36
  SENSORS: ['POINTGOAL_SENSOR']
  POINTGOAL_SENSOR:
    GOAL_FORMAT: POLAR
    DIMENSIONALITY: 2
  GOAL_SENSOR_UUID: pointgoal
  MEASUREMENTS: ['DISTANCE_TO_GOAL', "SUCCESS", 'SPL']
  SUCCESS:
    SUCCESS_DISTANCE: 0.36

Just changed path to the train dataset (Habitat Challenge Data for Gibson (1.5 GB)):

DATASET:
  TYPE: PointNav-v1
  SPLIT: train
  DATA_PATH: data/datasets/pointnav/gibson/v1/{split}/{split}.json.gz

After running the command python -u habitat_baselines/run.py --exp-config habitat_baselines/config/pointnav/ppo_pointnav.yaml --run-type I got the following error:

---
 The active scene does not contain semantic annotations. 
---
I0325 20:17:08.559100 8915 simulator.py:143] Loaded navmesh data/scene_datasets/gibson/Monson.navmesh
I0325 20:17:08.559392 8915 simulator.py:155] Recomputing navmesh for agent's height 0.88 and radius 0.18.
I0325 20:17:08.567361  8915 PathFinder.cpp:338] Building navmesh with 275x112 cells
I0325 20:17:08.655342  8915 PathFinder.cpp:606] Created navmesh with 137 vertices 61 polygons
I0325 20:17:08.655371  8915 Simulator.cpp:403] reconstruct navmesh successful
2020-03-25 20:17:08,720 Initializing task Nav-v0
2020-03-25 20:17:11,725 agent number of parameters: 52694149
/home/pryhoda/anaconda3/envs/habitat/lib/python3.6/site-packages/torch-1.4.0-py3.6-linux-x86_64.egg/torch/optim/lr_scheduler.py:122: UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()`. In PyTorch 1.1.0 and later, you should call them in the opposite order: `optimizer.step()` before `lr_scheduler.step()`.  Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
  "https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate", UserWarning)
/pytorch/aten/src/ATen/native/cuda/MultinomialKernel.cu:243: void at::native::<unnamed>::sampleMultinomialOnce(long *, long, int, scalar_t *, scalar_t *, int, int) [with scalar_t = float, accscalar_t = float]: block: [3,0,0], thread: [0,0,0] Assertion `val >= zero` failed.
/pytorch/aten/src/ATen/native/cuda/MultinomialKernel.cu:243: void at::native::<unnamed>::sampleMultinomialOnce(long *, long, int, scalar_t *, scalar_t *, int, int) [with scalar_t = float, accscalar_t = float]: block: [3,0,0], thread: [1,0,0] Assertion `val >= zero` failed.
/pytorch/aten/src/ATen/native/cuda/MultinomialKernel.cu:243: void at::native::<unnamed>::sampleMultinomialOnce(long *, long, int, scalar_t *, scalar_t *, int, int) [with scalar_t = float, accscalar_t = float]: block: [3,0,0], thread: [2,0,0] Assertion `val >= zero` failed.
/pytorch/aten/src/ATen/native/cuda/MultinomialKernel.cu:243: void at::native::<unnamed>::sampleMultinomialOnce(long *, long, int, scalar_t *, scalar_t *, int, int) [with scalar_t = float, accscalar_t = float]: block: [3,0,0], thread: [3,0,0] Assertion `val >= zero` failed.
/pytorch/aten/src/ATen/native/cuda/MultinomialKernel.cu:243: void at::native::<unnamed>::sampleMultinomialOnce(long *, long, int, scalar_t *, scalar_t *, int, int) [with scalar_t = float, accscalar_t = float]: block: [0,0,0], thread: [0,0,0] Assertion `val >= zero` failed.
/pytorch/aten/src/ATen/native/cuda/MultinomialKernel.cu:243: void at::native::<unnamed>::sampleMultinomialOnce(long *, long, int, scalar_t *, scalar_t *, int, int) [with scalar_t = float, accscalar_t = float]: block: [0,0,0], thread: [1,0,0] Assertion `val >= zero` failed.
/pytorch/aten/src/ATen/native/cuda/MultinomialKernel.cu:243: void at::native::<unnamed>::sampleMultinomialOnce(long *, long, int, scalar_t *, scalar_t *, int, int) [with scalar_t = float, accscalar_t = float]: block: [0,0,0], thread: [2,0,0] Assertion `val >= zero` failed.
/pytorch/aten/src/ATen/native/cuda/MultinomialKernel.cu:243: void at::native::<unnamed>::sampleMultinomialOnce(long *, long, int, scalar_t *, scalar_t *, int, int) [with scalar_t = float, accscalar_t = float]: block: [0,0,0], thread: [3,0,0] Assertion `val >= zero` failed.
/pytorch/aten/src/ATen/native/cuda/MultinomialKernel.cu:243: void at::native::<unnamed>::sampleMultinomialOnce(long *, long, int, scalar_t *, scalar_t *, int, int) [with scalar_t = float, accscalar_t = float]: block: [2,0,0], thread: [0,0,0] Assertion `val >= zero` failed.
/pytorch/aten/src/ATen/native/cuda/MultinomialKernel.cu:243: void at::native::<unnamed>::sampleMultinomialOnce(long *, long, int, scalar_t *, scalar_t *, int, int) [with scalar_t = float, accscalar_t = float]: block: [2,0,0], thread: [1,0,0] Assertion `val >= zero` failed.
/pytorch/aten/src/ATen/native/cuda/MultinomialKernel.cu:243: void at::native::<unnamed>::sampleMultinomialOnce(long *, long, int, scalar_t *, scalar_t *, int, int) [with scalar_t = float, accscalar_t = float]: block: [2,0,0], thread: [2,0,0] Assertion `val >= zero` failed.
/pytorch/aten/src/ATen/native/cuda/MultinomialKernel.cu:243: void at::native::<unnamed>::sampleMultinomialOnce(long *, long, int, scalar_t *, scalar_t *, int, int) [with scalar_t = float, accscalar_t = float]: block: [2,0,0], thread: [3,0,0] Assertion `val >= zero` failed.
/pytorch/aten/src/ATen/native/cuda/MultinomialKernel.cu:243: void at::native::<unnamed>::sampleMultinomialOnce(long *, long, int, scalar_t *, scalar_t *, int, int) [with scalar_t = float, accscalar_t = float]: block: [1,0,0], thread: [0,0,0] Assertion `val >= zero` failed.
/pytorch/aten/src/ATen/native/cuda/MultinomialKernel.cu:243: void at::native::<unnamed>::sampleMultinomialOnce(long *, long, int, scalar_t *, scalar_t *, int, int) [with scalar_t = float, accscalar_t = float]: block: [1,0,0], thread: [1,0,0] Assertion `val >= zero` failed.
/pytorch/aten/src/ATen/native/cuda/MultinomialKernel.cu:243: void at::native::<unnamed>::sampleMultinomialOnce(long *, long, int, scalar_t *, scalar_t *, int, int) [with scalar_t = float, accscalar_t = float]: block: [1,0,0], thread: [2,0,0] Assertion `val >= zero` failed.
/pytorch/aten/src/ATen/native/cuda/MultinomialKernel.cu:243: void at::native::<unnamed>::sampleMultinomialOnce(long *, long, int, scalar_t *, scalar_t *, int, int) [with scalar_t = float, accscalar_t = float]: block: [1,0,0], thread: [3,0,0] Assertion `val >= zero` failed.
Traceback (most recent call last):
  File "habitat_baselines/run.py", line 70, in <module>
    main()
  File "habitat_baselines/run.py", line 40, in main
    run_exp(**vars(args))
  File "habitat_baselines/run.py", line 64, in run_exp
    trainer.train()
  File "/home/pryhoda/HabitatProject/habitat-api/habitat_baselines/rl/ppo/ppo_trainer.py", line 346, in train
    rollouts, current_episode_reward, running_episode_stats
  File "/home/pryhoda/HabitatProject/habitat-api/habitat_baselines/rl/ppo/ppo_trainer.py", line 181, in _collect_rollout_step
    outputs = self.envs.step([a[0].item() for a in actions])
  File "/home/pryhoda/HabitatProject/habitat-api/habitat_baselines/rl/ppo/ppo_trainer.py", line 181, in <listcomp>
    outputs = self.envs.step([a[0].item() for a in actions])
RuntimeError: CUDA error: device-side assert triggered
Exception ignored in: <bound method VectorEnv.__del__ of <habitat.core.vector_env.VectorEnv object at 0x7f8dfa79ea58>>
Traceback (most recent call last):
  File "/home/pryhoda/HabitatProject/habitat-api/habitat/core/vector_env.py", line 468, in __del__
    self.close()
  File "/home/pryhoda/HabitatProject/habitat-api/habitat/core/vector_env.py", line 350, in close
    write_fn((CLOSE_COMMAND, None))
  File "/home/pryhoda/anaconda3/envs/habitat/lib/python3.6/multiprocessing/connection.py", line 206, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "/home/pryhoda/anaconda3/envs/habitat/lib/python3.6/multiprocessing/connection.py", line 404, in _send_bytes
    self._send(header + buf)
  File "/home/pryhoda/anaconda3/envs/habitat/lib/python3.6/multiprocessing/connection.py", line 368, in _send
    n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe

I am wondering if the PPO agent is not adapted to train with the 2020 challenge config (it ran OK for me with the 2019 challenge config, pointnav_gibson_rgbd.yaml), or if it is some issue on my side? Thanks in advance!

Configuration clarifications

I have a couple of questions regarding the configurations:

  • A new file has been added: configs/challenge_pointnav2020_v2.local.rgbd.yaml which has a different sensor orientation. What is this for? Should we re-train our agents with this new configuration?
  • What is the config file configs/challenge_pointnav2020.local.rgbd_test_scene.yaml for? It is using v1 data. Should it not be v2?

API version

Hi,

I found that the API of habitat-challenge is different from the current version of habitat-lab. Can you specify which branch or version of habitat-lab I should install to use habitat-challenge?

Thanks,

Unable to make new submissions on the EvalAI site

For the past two days I have been unable to make new submissions to the EvalAI site. I even re-cloned the repo, cleared my docker caches and tried to resubmit the random baseline agent with no success.

Previously I was able to make submissions, so I am wondering if the issue is on the EvalAI platform?

I created a topic on their forum but I have had no response yet, if you are able to confirm that the platform is still working that would be much appreciated. (for example by submitting a random agent)

Remote evaluations fail

Hi, I have made a couple of remote submissions recently (pointnav test-std) but they all fail. I have seen two types of failure cases, both happened multiple times.

1, the script fails after a few hours, and it produces stderr output with the following error:

Traceback (most recent call last):
  File "agent.py", line 172, in <module>
    main()
  File "agent.py", line 168, in main
    challenge.submit(agent)
  File "/habitat-api/habitat/core/challenge.py", line 19, in submit
    metrics = super().evaluate(agent)
  File "/habitat-api/habitat/core/benchmark.py", line 171, in evaluate
    return self.remote_evaluate(agent, num_episodes)
  File "/habitat-api/habitat/core/benchmark.py", line 103, in remote_evaluate
    SerializedEntity=pack_for_grpc(action)
  File "/opt/conda/envs/habitat/lib/python3.6/site-packages/grpc/_channel.py", line 826, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "/opt/conda/envs/habitat/lib/python3.6/site-packages/grpc/_channel.py", line 729, in _end_unary_response_blocking
    raise _InactiveRpcError(state)
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "Socket closed"
	debug_error_string = "{"created":"@1599858112.185563913","description":"Error received from peer ipv4:127.0.0.1:8085","file":"src/core/lib/surface/call.cc","file_line":1056,"grpc_message":"Socket closed","grpc_status":14}"

2. The script fails after 42 hours and produces empty stdout and stderr files.

Submission failed on test standard

Hi,
Our submission keeps failing on test standard with "standard_error", which I assume is because of the time limit. The same Docker image takes 14 minutes for remote evaluation on val_mini. Assuming test standard is 33-66 times the size of val_mini (1000-2000 episodes vs 30 episodes), our submission should take 8-16 hours on test standard. But the submission ran for 42 hours and failed. Is there any way to get more information, such as how many episodes were completed and how many remained, the average episode length, or the intermediate scores when the submission timed out?
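For completeness, the estimate above works out as follows (a quick, purely illustrative sanity check using the numbers quoted in this post):

# Back-of-the-envelope runtime estimate; inputs are the figures mentioned above.
val_mini_minutes = 14
val_mini_episodes = 30
per_episode_minutes = val_mini_minutes / val_mini_episodes

for test_std_episodes in (1000, 2000):
    hours = per_episode_minutes * test_std_episodes / 60
    print(f"{test_std_episodes} episodes -> ~{hours:.1f} hours")
# prints roughly 7.8 and 15.6 hours, i.e. the 8-16 hour estimate above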

Any other suggestions on how to figure out the problem would be helpful.

Thanks

PPO model evaluation with docker

Hi!

I trained a PPO agent from the habitat_baselines folder on the PointNav task with the Gibson dataset. I wanted to evaluate the trained model locally using Docker, so I created the following PPO agent submission script:

import argparse
import habitat
import random
import numpy
import os

from habitat.config import Config
from habitat.config.default import get_config
from habitat_baselines.agents.ppo_agents import PPOAgent


def get_default_config():
    c = Config()
    c.INPUT_TYPE = "rgbd" 
    c.MODEL_PATH = "models/ckpt.199.pth"  # my trained model 
    c.RESOLUTION = 256
    c.HIDDEN_SIZE = 512
    c.RANDOM_SEED = 7
    c.PTH_GPU_ID = 1
    c.GOAL_SENSOR_UUID = "pointgoal"
    return c


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--evaluation", type=str, required=True, choices=["local", "remote"])
    args = parser.parse_args()

    agent_config = get_default_config()
    agent = PPOAgent(agent_config)

    if args.evaluation == "local":
        challenge = habitat.Challenge(eval_remote=False)
    else:
        challenge = habitat.Challenge(eval_remote=True)

    challenge.submit(agent)


if __name__ == "__main__":
    main()

I built the Docker image, and after running "sudo ./test_locally_pointnav_rgbd.sh --docker-name ppo_submission" I got the following error:

2020-03-11 18:50:08,104 Initializing dataset PointNav-v1
2020-03-11 18:50:08,108 initializing sim Sim-v0
2020-03-11 18:50:09,020 Initializing task Nav-v0
Traceback (most recent call last):
  File "ppo_agent.py", line 41, in <module>
    main()
  File "ppo_agent.py", line 37, in main
    challenge.submit(agent)
  File "/habitat-api/habitat/core/challenge.py", line 19, in submit
    metrics = super().evaluate(agent)
  File "/habitat-api/habitat/core/benchmark.py", line 159, in evaluate
    return self.local_evaluate(agent, num_episodes)
  File "/habitat-api/habitat/core/benchmark.py", line 133, in local_evaluate
    action = agent.act(observations)
  File "/habitat-api/habitat_baselines/agents/ppo_agents.py", line 134, in act
    deterministic=False,
  File "/habitat-api/habitat_baselines/rl/ppo/policy.py", line 40, in act
    observations, rnn_hidden_states, prev_actions, masks
  File "/opt/conda/envs/habitat/lib/python3.6/site-packages/torch-1.4.0-py3.6-linux-x86_64.egg/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/habitat-api/habitat_baselines/rl/ppo/policy.py", line 167, in forward
    perception_embed = self.visual_encoder(observations)
  File "/opt/conda/envs/habitat/lib/python3.6/site-packages/torch-1.4.0-py3.6-linux-x86_64.egg/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/habitat-api/habitat_baselines/rl/models/simple_cnn.py", line 147, in forward
    return self.cnn(cnn_input)
  File "/opt/conda/envs/habitat/lib/python3.6/site-packages/torch-1.4.0-py3.6-linux-x86_64.egg/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/conda/envs/habitat/lib/python3.6/site-packages/torch-1.4.0-py3.6-linux-x86_64.egg/torch/nn/modules/container.py", line 100, in forward
    input = module(input)
  File "/opt/conda/envs/habitat/lib/python3.6/site-packages/torch-1.4.0-py3.6-linux-x86_64.egg/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/conda/envs/habitat/lib/python3.6/site-packages/torch-1.4.0-py3.6-linux-x86_64.egg/torch/nn/modules/linear.py", line 87, in forward
    return F.linear(input, self.weight, self.bias)
  File "/opt/conda/envs/habitat/lib/python3.6/site-packages/torch-1.4.0-py3.6-linux-x86_64.egg/torch/nn/functional.py", line 1370, in linear
    ret = torch.addmm(bias, input, weight.t())
RuntimeError: size mismatch, m1: [1 x 99712], m2: [25088 x 512] at /pytorch/aten/src/THC/generic/THCTensorMathBlas.cu:290

I am wondering what could cause such an error? Any help would be appreciated!
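One plausible cause, based on the sizes in the error: 25088 = 32×28×28 matches the flattened output of the baseline SimpleCNN for a 256×256 input (the RESOLUTION set in the script above), while 99712 = 32×41×76 is what the same network produces for a 640×360 frame, which suggests the policy is receiving observations at the benchmark's sensor resolution while its first linear layer was sized for 256×256. A minimal, hypothetical workaround, assuming the checkpoint really was trained at 256×256, is to resize observations before they reach the agent:

import cv2  # assumed to be available in the submission image; any resize routine works
import numpy as np

import habitat

TRAIN_RESOLUTION = (256, 256)  # (width, height) the checkpoint was trained with (an assumption)


class ResizeObservationsAgent(habitat.Agent):
    """Hypothetical wrapper that downsamples rgb/depth observations to the
    resolution the wrapped PPOAgent expects before calling act()."""

    def __init__(self, agent):
        self._agent = agent

    def reset(self):
        self._agent.reset()

    def act(self, observations):
        obs = dict(observations)
        if "rgb" in obs:
            obs["rgb"] = cv2.resize(obs["rgb"], TRAIN_RESOLUTION)
        if "depth" in obs:
            depth = cv2.resize(obs["depth"], TRAIN_RESOLUTION)
            obs["depth"] = depth[..., np.newaxis]  # cv2 drops the channel axis for 1-channel images
        return self._agent.act(obs)

# In main(), one would then submit the wrapped agent: challenge.submit(ResizeObservationsAgent(agent))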

Cancelling submission to evalai

Hi,
Is there any way to cancel a submission to EvalAI? I am not sure whether our submission was uploaded correctly and would like to cancel and resubmit to make sure.

Is it possible to submit for test standard or minival now?

Hi, I tried to submit my result for minival; the status shows "submitted", but the execution time and the result are None. I guessed there might be some mistake in my Docker image, so I followed the README to submit the original agent and the PPO agent. The status is still "submitted" and everything else shows "None".

My question is: is it currently possible to submit for test or minival? If so, do you have any suggestions for my problem?

Thanks a lot.

Evaluation time-limits

How strictly will the model evaluation time be taken into consideration? I can imagine some models taking several hours rather than 30 minutes to complete 1000+ episodes.

Challenge resolution

Hello, thank you for creating this years challenge!

Regarding the resolution, for object-nav the config file has a resolution of WIDTH: 640, HEIGHT: 480. Will it be possible to use lower resolutions for the challenge? I believe this was available last year but wish to confirm.

Thank you.

test-std Phase Submission Error (code runs successfully on minival)

I submitted my code successfully to the minival challenge phase:

evalai push [my docker image] --phase habitat21-objectnav-minival --private

My submission completed successfully, and I could view my results.

However, when I submit to the test-std challenge phase with the below command, I get the error copied at the bottom of my post, and the status of my submission is "Failed".

evalai push [my docker image] --phase habitat21-objectnav-test-std --private

I saw this same error on the minival phase the first time I submitted. I thought that the error meant that my code ran for too long, and the job was killed because I timed out. I sped up my code, resubmitted, and the error went away. Then, my submission successfully completed on the minival phase.

This error happens after my code runs for 30 minutes on the test-std phase. I previously saw this error after 30 minutes running on the minival phase. My understanding is that our submissions on the test-std phase have 48 hours to complete. There are no submissions visible yet on the test-std public leaderboard, so I do not know if anyone else ran their code successfully on this phase.

In summary, my questions are:

  1. Does this error indicate a timeout for my job?
  2. If so, how long does our code have to run on the test-std phase?
  3. If not, do you have any insight on what this error indicates?

Thank you for any help you can provide!

Error message:

Traceback (most recent call last):
  File "challenge_agent.py", line 308, in <module>
    main()
  File "challenge_agent.py", line 304, in main
    challenge.submit(agent)
  File "/habitat-lab/habitat/core/challenge.py", line 19, in submit
    metrics = super().evaluate(agent)
  File "/habitat-lab/habitat/core/benchmark.py", line 163, in evaluate
    return self.remote_evaluate(agent, num_episodes)
  File "/habitat-lab/habitat/core/benchmark.py", line 93, in remote_evaluate
    SerializedEntity=pack_for_grpc(action)
  File "/opt/conda/envs/habitat/lib/python3.6/site-packages/grpcio-1.36.0rc1-py3.6-linux-x86_64.egg/grpc/_channel.py", line 923, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "/opt/conda/envs/habitat/lib/python3.6/site-packages/grpcio-1.36.0rc1-py3.6-linux-x86_64.egg/grpc/_channel.py", line 826, in _end_unary_response_blocking
    raise _InactiveRpcError(state)
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "Socket closed"
	debug_error_string = "{"created":"@1620152076.442809572","description":"Error received from peer ipv4:127.0.0.1:8085","file":"src/core/lib/surface/call.cc","file_line":1067,"grpc_message":"Socket closed","grpc_status":14}"

EGL errors in docker

I followed the instructions to build the Docker image locally.

docker build . --file Pointnav_DDPPO_baseline.Dockerfile -t pointnav_submission_debug

It built successfully, but local testing via ./test_locally_pointnav_rgbd.sh resulted in the following error:

Neither `ifconfig` (`ifconfig -a`) nor `ip` (`ip address show`) commands are available, listing network interfaces is likely to fail
2020-05-05 07:03:09,730 Overwriting CNN input size of depth: (256, 256)
2020-05-05 07:03:09,731 Overwriting CNN input size of rgb: (256, 256)
2020-05-05 07:03:12,762 Model checkpoint wasn't loaded, evaluating a random model.
2020-05-05 07:03:12,777 Initializing dataset PointNav-v1
2020-05-05 07:03:12,779 initializing sim Sim-v0
WARNING: Logging before InitGoogleLogging() is written to STDERR
F0505 07:03:12.791268    16 WindowlessContext.cpp:114] Check failed: eglDevId < numDevices [EGL] Could not find an EGL device for CUDA device 0
*** Check failure stack trace: ***
submission.sh: line 3:    16 Aborted                 (core dumped) python agent.py --evaluation $AGENT_EVALUATION_TYPE $@

I created an interactive session inside the container via:

docker run -v /tmp/habitat-challenge-data:/habitat-challenge-data --runtime=nvidia -it pointnav_submission_debug /bin/bash

nvidia-smi worked:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.116.00   Driver Version: 418.116.00   CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro GP100        Off  | 00000000:81:00.0 Off |                    0 |
| 26%   37C    P0    31W / 235W |      0MiB / 16278MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Quadro GP100        Off  | 00000000:82:00.0 Off |                    0 |
| 26%   37C    P0    29W / 235W |      0MiB / 16278MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Running a simple PyTorch script on the GPU also worked:

$ python -c "import torch, torch.nn as nn; device = torch.device('cuda:0'); model = nn.Linear(4, 2); model.to(device);  x = torch.randn(1, 4).to(device); y = model(x); print(y)"

tensor([[0.0405, 0.1198]], device='cuda:0', grad_fn=<AddmmBackward>)

turn angle 30 degrees

The configuration file provided at configs/challenge_pointnav2021.local.rgbd.yaml has
TURN_ANGLE: 30
but previous baselines such as the DD-PPO paper use 10 degrees. I assume that for this challenge we have to use 30, right? And the evaluation is performed with the same parameters as in configs/challenge_pointnav2021.local.rgbd.yaml?

thanks

Docker installation with data softlink bug fixed?

Hi there,

I found that in the current README's local evaluation step 5, if a soft link is used and the suggested Docker command is run, an error message like this appears:

 2021-04-13 21:51:15,680 Initializing dataset ObjectNav-v1
 2021-04-13 21:51:15,710 initializing sim Sim-v0
 WARNING: Logging before InitGoogleLogging() is written to STDERR
 E0413 21:51:15.723701    17 StageAttributesManager.cpp:90] StageAttributesManager::registerObjectFinalize : Render asset template handle : habitat-challenge-data/data/scene_datasets/mp3d/x8F5xyUWy9e/x8F5xyUWy9e.glb specified in stage template with handle : habitat-challenge-data/data/scene_datasets/mp3d/x8F5xyUWy9e/x8F5xyUWy9e.glb does not correspond to any existing file or primitive render asset.  Aborting. 
  submission.sh: line 3:    17 Segmentation fault      (core dumped) python agent.py --evaluation $AGENT_EVALUATION_TYPE $@

My fix was to change this line:

 -v $(realpath habitat-challenge-data/data/scene_datasets/mp3d) \

to

  -v $(realpath habitat-challenge-data/data/scene_datasets/mp3d):/habitat-challenge-data/data/scene_datasets/mp3d \

I am not that familiar with Docker, so I am not sure whether this is a version problem or a bug; I just wanted to post one solution!

Best,
Yiqing

Potential issue with shortest path or success criteria in ObjectNav

Hi,
I was evaluating the shortest path for the ObjectNav task. The second and third episodes on val_mini seem to fail even when following the shortest path.

Looking at the path, it seems correct: the agent navigates towards an instance of the target category. Maybe there is some issue with the success criteria?

Are the 2021 competition fairembodied/habitat-challenge docker images all published?

I was troubleshooting why I could not run and try your 2021 Docker images for the habitat-challenge today, and it seems like they might not have been published on Docker Hub yet. Or am I missing something? I am able to docker pull fairembodied/habitat-challenge:latest, but trying to build from the Dockerfiles with that tag causes errors.

docker pull fairembodied/habitat-challenge:testing_2021_habitat_base_docker

Error response from daemon: manifest for fairembodied/habitat-challenge:testing_2021_habitat_base_docker not found: manifest unknown: manifest unknown

thanks.

Time limit clarifications

Hi,

A few clarification questions about runtime limits for the PointNav task:

  • What is the time limit for the test-standard and test-challenge phases? The README mentions 24 hours, but I have noticed in the logs: "Timelimit for test-std, test-challenge track is 151200 seconds" (42 hours).

  • What is the number of evaluation episodes?

  • Unfortunately my solution is currently rather slow, taking around 2 minutes per episode on average. 2000 episodes would take around 66 hours. Is it possible to extend the time limit?

  • I have also noticed that if my script does not terminate within the time limit, no output is generated and the submission is marked as failed. Would it be possible to generate results anyway, perhaps treating any remaining episodes as failed? I have done something like this in my own script (always take the stop action once total execution time exceeds a limit; a rough sketch follows this list), but this caps my runtime at a fixed amount, and I cannot be sure how much extra time will be spent loading the remaining episodes.
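For reference, the guard described in the last bullet could look roughly like this (a hypothetical sketch; the STOP index and the dict-style action format are assumptions modeled on the baseline PPO agent, not a prescribed interface):

import time

import habitat


class TimeBudgetedAgent(habitat.Agent):
    """Wrap an agent and emit STOP once a wall-clock budget is spent, so the
    remaining episodes end quickly instead of the whole job timing out."""

    STOP_ACTION = 0  # assumed index of STOP in the discrete PointNav action space

    def __init__(self, agent, budget_seconds):
        self._agent = agent
        self._deadline = time.time() + budget_seconds

    def reset(self):
        self._agent.reset()

    def act(self, observations):
        if time.time() > self._deadline:
            return {"action": self.STOP_ACTION}
        return self._agent.act(observations)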

thank you

CUDA out of memory error

I am getting the following error on both my minival and test-std submissions:

RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 15.75 GiB total capacity; 146.35 MiB already allocated; 22.00 MiB free; 11.65 MiB cached)

Just to give some context, my local runs take up less than 4 GB of memory for evaluation. The above error also seems to suggest that only 146.35 + 22 + 11.65 MiB of memory was available to begin with. Is it possible that multiple users are assigned to the same GPUs?

About online evaluations

I submitted a Docker image to the val_mini phase and the test_standard phase a few hours ago.

It ran well locally on val_mini, but the server does not have any result yet for either phase. What might be causing the problem?

By the way, do submissions wait in a pending queue for evaluation, or are they evaluated immediately?

Thanks

Path error

Our instructions say:

a) PointNav: Download the Gibson scenes used for the Habitat Challenge. Accept the terms here and select the download corresponding to “Habitat Challenge Data for Gibson (1.5 GB)“. Place this data in: habitat-challenge/habitat-challenge-data/gibson

b) ObjectNav: Download Matterport3D scenes used for Habitat Challenge here. Place this data in: habitat-challenge/habitat-challenge-data/mp3d

But our scripts are looking for habitat-challenge/habitat-challenge-data/scene_datasets/mp3d, as you can see from these errors:

(base) dbatra@ubuntu-bionic-1:~/habitat-challenge$ ./test_locally_objectnav_rgbd.sh --docker-name my_submission
2020-03-05 03:15:30,744 Initializing dataset ObjectNav-v1
2020-03-05 03:15:30,776 initializing sim Sim-v0
WARNING: Logging before InitGoogleLogging() is written to STDERR
E0305 03:15:31.245062    16 ResourceManager.cpp:84] Cannot load from file habitat-challenge-data/scene_datasets/mp3d/pLe4wQe7qrG/pLe4wQe7qrG.glb
E0305 03:15:31.249548    16 Simulator.cpp:106] cannot load habitat-challenge-data/scene_datasets/mp3d/pLe4wQe7qrG/pLe4wQe7qrG.glb

If I create appropriate symlinks, this error goes away:

(base) dbatra@ubuntu-bionic-1:~/habitat-challenge$ ./test_locally_objectnav_rgbd.sh --docker-name my_submission
2020-03-05 03:36:44,704 Initializing dataset ObjectNav-v1
2020-03-05 03:36:44,733 initializing sim Sim-v0
2020-03-05 03:36:48,449 Initializing task ObjectNav-v1
2020-03-05 03:36:49,552 distance_to_goal: 5.235198982556661
2020-03-05 03:36:49,552 spl: 0.0

But we should fix this by checking for the dataset in the path we say it should be in.
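The check suggested above could be as simple as the following sketch (the paths are the ones quoted from the instructions, relative to the habitat-challenge repo root, and the .glb glob is an illustrative heuristic rather than the loader's actual logic):

import glob
import os
import sys

# Locations the instructions tell users to populate.
EXPECTED_DIRS = [
    "habitat-challenge-data/gibson",  # PointNav scenes
    "habitat-challenge-data/mp3d",    # ObjectNav scenes
]


def check_scene_data():
    """Fail fast with a clear message if no scene files are found."""
    for path in EXPECTED_DIRS:
        if not glob.glob(os.path.join(path, "**", "*.glb"), recursive=True):
            sys.exit(
                f"No .glb scene files found under '{path}'. "
                "Place the downloaded data exactly where the instructions say, "
                "or create the matching symlinks."
            )


if __name__ == "__main__":
    check_scene_data()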

one question about baseline

I know this challenge is over, but I want to give it a try.
I ran the script for local evaluation with rgb.pth, following the instructions without changes. The result I get is 0.4626, but the result on the leaderboard is 0.47. Is the model in the repo the same as the one on the leaderboard?

Remote evaluation stuck on "submitted" for 5 days

Hi, I've made a submission for the ObjectNav task in the test-std phase. It has now been in the "submitted" state for about 5 days. Is this due to the switch from the 2020 Challenge to the 2021 Challenge? Or are there some problems with the remote server?

Thanks,
Tommaso

EvalAI Submission Instructions

In the submission instructions it states that valid phases are: habitat21-{pointnav, objectnav}-{minival, test-std, test-ch}. However, on the eval.ai submission website, the submission phases are listed in the format: habitat-objectnav-minival-2021-802.

When trying to submit with the github instructions, I get the error:

Error: Challenge phase with slug habitat21-objectnav-minival does not exist

When trying to submit with the phases listed in the eval.ai website, I get the error:

Error: Sorry, cannot accept submissions since challenge phase is not active

Note that I can successfully make 2020 minival submissions on eval.ai.

Are the correct phases activated? Also, what is the correct phase naming convention?

Thank you!

Cannot run tests, wrong checkpoint? torch.Size([128, 1024, 3, 3]) from checkpoint, the shape in current model is torch.Size([82, 1024, 3, 3]).

I followed these instructions on Ubuntu:
https://colab.research.google.com/gist/mathfac/8c9b97d7afef36e377f17d587c903ede#scrollTo=Qo_z277BAueV

But when I run the test I get:

(base) ggg:~/habitat/habitat-challenge$ bash ./test_locally_pointnav_rgbd.sh --docker-name ddppo_pointnav_submission
Neither `ifconfig` (`ifconfig -a`) nor `ip` (`ip address show`) commands are available, listing network interfaces is likely to fail
Checkpoint loaded: demo.ckpt.pth
/habitat-lab/habitat_baselines/config/default.py:214: UserWarning: NUM_PROCESSES is depricated and will be removed in a future version.  Use NUM_ENVIRONMENTS instead.  Overwriting NUM_ENVIRONMENTS with NUM_PROCESSES for backwards compatibility.
  "NUM_PROCESSES is depricated and will be removed in a future version."
Traceback (most recent call last):
  File "agent.py", line 200, in <module>
    main()
  File "agent.py", line 189, in main
    agent = DDPPOAgent(config)
  File "agent.py", line 124, in __init__
    for k, v in ckpt["state_dict"].items()
  File "/opt/conda/envs/habitat/lib/python3.6/site-packages/torch/nn/modules/module.py", line 847, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for PointNavResNetPolicy:
	Missing key(s) in state_dict: "net.pointgoal_embedding.weight", "net.pointgoal_embedding.bias". 
	Unexpected key(s) in state_dict: "net.tgt_embeding.weight", "net.tgt_embeding.bias". 
	size mismatch for net.visual_encoder.compression.0.weight: copying a param with shape torch.Size([128, 1024, 3, 3]) from checkpoint, the shape in current model is torch.Size([82, 1024, 3, 3]).
	size mismatch for net.visual_encoder.compression.1.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([82]).
	size mismatch for net.visual_encoder.compression.1.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([82]).
	size mismatch for net.visual_fc.1.weight: copying a param with shape torch.Size([512, 2048]) from checkpoint, the shape in current model is torch.Size([512, 2050]).

Any suggestion?

"standard_error" on submission

Hi,
My submission to the EvalAI portal failed, and it shows just "standard_error" in the Stderr file.
Could you help me figure out the error?

Unable to run PointNav agent locally

While running the "test_locally_pointnav_rgbd.sh" script, I get the following error:

2020-03-24 10:15:01,170 Initializing dataset PointNav-v1
2020-03-24 10:15:01,171 initializing sim Sim-v0
2020-03-24 10:15:02,310 Initializing task Nav-v0
Traceback (most recent call last):
  File "agent.py", line 26, in <module>
    main()
  File "agent.py", line 21, in main
    challenge = habitat.Challenge()
  File "/habitat-api/habitat/core/challenge.py", line 16, in __init__
    super().__init__(config_paths)
  File "/habitat-api/habitat/core/benchmark.py", line 30, in __init__
    self._env = Env(config=config_env)
  File "/habitat-api/habitat/core/env.py", line 102, in __init__
    dataset=self._dataset,
  File "/habitat-api/habitat/tasks/registration.py", line 21, in make_task
    return _task(**kwargs)
  File "/habitat-api/habitat/tasks/nav/nav.py", line 968, in __init__
    super().__init__(config=config, sim=sim, dataset=dataset)
  File "/habitat-api/habitat/core/embodied_task.py", line 241, in __init__
    entities_config=config,
  File "/habitat-api/habitat/core/embodied_task.py", line 269, in _init_entities
    entity_type = register_func(entity_cfg.TYPE)
  File "/opt/conda/envs/habitat/lib/python3.6/site-packages/yacs/config.py", line 141, in __getattr__
    raise AttributeError(name)
AttributeError: TYPE

When I traced the error, I found that the task type specified in "challenge_pointnav2020.local.rgbd.yaml" file is not being read by yacs. Please let me know how I can fix this problem.

Last minute cancellation request

Could you please kill my ongoing submission for the test-challenge phase, so I can submit an updated policy?
User karkus, Team: DAN, track: pointgoal; phase=test-ch.
Thank you!

Submission status

Hi @abhiskk!
I made submissions for the PointNav minival and test-standard phases about 24 hours ago (UCULab team), and the status for those submissions is still "Running". What could cause such a long evaluation?

Inconsistent scores between Local and Remote minival for PointNav

I thought it was better to raise a separate issue for this. I am observing large inconsistencies between local and remote submissions. I made 3 submissions of the same model on the minival track, and I also evaluated the Docker image on a local server. These were the results I got:

Local docker evaluation

          SPL      Success   Distance to goal
Trial 1   0.170    0.244     2.630
Trial 2   0.150    0.206     2.491
Trial 3   0.172    0.234     2.486

Remote docker evaluation

          SPL      Success   Distance to goal
Trial 1   0.260    0.366     0.895
Trial 2   0.258    0.333     0.846
Trial 3   0.336    0.433     0.870

I additionally evaluated locally with a ppo_trainer evaluation script I wrote based on habitat-baselines. I have tried to keep things as consistent as possible between my Docker submission and the ppo_trainer script.

Local non-docker evaluation

          SPL      Success   Distance to goal
Trial 1   0.327    0.455     1.504
Trial 2   0.350    0.492     1.503
Trial 3   0.333    0.455     1.622

Could these inconsistencies be due to some random seed issues? I've set them to 123 as follows:

random.seed(config.PYT_RANDOM_SEED)
np.random.seed(config.PYT_RANDOM_SEED)
torch.random.manual_seed(config.PYT_RANDOM_SEED)
if torch.cuda.is_available():
   torch.backends.cudnn.deterministic = True
   torch.backends.cudnn.benchmark = False
