
nice-slam's Introduction

NICE-SLAM: Neural Implicit Scalable Encoding for SLAM

Zihan Zhu* · Songyou Peng* · Viktor Larsson · Weiwei Xu · Hujun Bao
Zhaopeng Cui · Martin R. Oswald · Marc Pollefeys

(* Equal Contribution)

CVPR 2022


NICE-SLAM produces accurate dense geometry and camera tracking on large-scale indoor scenes.

(The black / red lines are the ground truth / predicted camera trajectory)



Table of Contents
  1. Installation
  2. Visualization
  3. Demo
  4. Run
  5. iMAP*
  6. Evaluation
  7. Acknowledgement
  8. Citation
  9. Contact

Installation

First you have to make sure that you have all dependencies in place. The simplest way to do so is to use Anaconda.

You can create an Anaconda environment called nice-slam. On Linux, you need to install libopenexr-dev before creating the environment.

sudo apt-get install libopenexr-dev
    
conda env create -f environment.yaml
conda activate nice-slam

Visualizing NICE-SLAM Results

We provide the results of NICE-SLAM ready for download. You can run our interactive visualizer as follows.

Self-captured Apartment

To visualize our results on the self-captured apartment, as shown in the teaser:

bash scripts/download_vis_apartment.sh
python visualizer.py configs/Apartment/apartment.yaml --output output/vis/Apartment

Note for users from China: If you encounter slow download speeds, check the scripts/download_*.sh scripts, where we also provide 和彩云 links for manual download.

ScanNet

bash scripts/download_vis_scene0000.sh
python visualizer.py configs/ScanNet/scene0000.yaml --output output/vis/scannet/scans/scene0000_00

You can find the results of NICE-SLAM on other scenes in ScanNet here.

Replica

bash scripts/download_vis_room1.sh
python visualizer.py configs/Replica/room1.yaml --output output/vis/Replica/room1

You can find the results of NICE-SLAM on other scenes in Replica here.

Interactive Visualizer Usage

The black trajectory indicates the ground-truth trajectory, and the red one is the trajectory of NICE-SLAM.

  • Press Ctrl+0 for grey mesh rendering.
  • Press Ctrl+1 for textured mesh rendering.
  • Press Ctrl+9 for normal rendering.
  • Press L to turn off/on lighting.

Command line arguments

  • --output $OUTPUT_FOLDER output folder (overwrite the output folder in the config file)
  • --input_folder $INPUT_FOLDER input folder (overwrite the input folder in the config file)
  • --save_rendering save rendering video to vis.mp4 in the output folder
  • --no_gt_traj do not show ground truth trajectory
  • --imap visualize results of iMAP*
  • --vis_input_frame opens up a viewer to show input frames. Note: you need to download the dataset first. See the Run section below.

Demo

Here you can run NICE-SLAM yourself on a short ScanNet sequence with 500 frames.

First, download the demo data as below; it will be saved into the ./Datasets/Demo folder.

bash scripts/download_demo.sh

Next, run NICE-SLAM. It takes a few minutes with ~5G GPU memory.

python -W ignore run.py configs/Demo/demo.yaml

Finally, run the following command to visualize.

python visualizer.py configs/Demo/demo.yaml 

NOTE: This is for demonstration only; its configuration/performance may differ from our paper.

Run

Self-captured Apartment

Download the data as below; it will be saved into the ./Datasets/Apartment folder.

bash scripts/download_apartment.sh

Next, run NICE-SLAM:

python -W ignore run.py configs/Apartment/apartment.yaml

ScanNet

Please follow the data downloading procedure on the ScanNet website, and extract color/depth frames from the .sens file using this code.

[Directory structure of ScanNet (click to expand)]

DATAROOT is ./Datasets by default. If a sequence (sceneXXXX_XX) is stored in other places, please change the input_folder path in the config file or in the command line.

  DATAROOT
  └── scannet
      └── scans
          └── scene0000_00
              └── frames
                  ├── color
                  │   ├── 0.jpg
                  │   ├── 1.jpg
                  │   ├── ...
                  │   └── ...
                  ├── depth
                  │   ├── 0.png
                  │   ├── 1.png
                  │   ├── ...
                  │   └── ...
                  ├── intrinsic
                  └── pose
                      ├── 0.txt
                      ├── 1.txt
                      ├── ...
                      └── ...

Once the data is downloaded and set up properly, you can run NICE-SLAM:

python -W ignore run.py configs/ScanNet/scene0000.yaml

Replica

Download the data as below; it will be saved into the ./Datasets/Replica folder. Note that the Replica data is generated by the authors of iMAP, so please cite iMAP if you use the data.

bash scripts/download_replica.sh

Then you can run NICE-SLAM:

python -W ignore run.py configs/Replica/room0.yaml

The mesh for evaluation is saved as $OUTPUT_FOLDER/mesh/final_mesh_eval_rec.ply, where the unseen regions are culled using all frames.

TUM RGB-D

Download the data as below; it will be saved into the ./Datasets/TUM-RGBD folder.

bash scripts/download_tum.sh

Now run NICE-SLAM:

python -W ignore run.py configs/TUM_RGBD/freiburg1_desk.yaml

Co-Fusion

First, download the dataset. This script should download and unpack the data automatically into the ./Datasets/CoFusion folder.

bash scripts/download_cofusion.sh

Run NICE-SLAM:

python -W ignore run.py configs/CoFusion/room4.yaml

Use your own RGB-D sequence from Kinect Azure

[Details (click to expand)]
  1. Please first follow this guide to record a sequence and extract aligned color and depth images. (Remember to use --align_depth_to_color for azure_kinect_recorder.py)

DATAROOT is ./Datasets by default. If a sequence (sceneXX) is stored elsewhere, please change the "input_folder" path in the config file or on the command line.

      DATAROOT
      └── Own
          └── scene0
              ├── color
              │   ├── 00000.jpg
              │   ├── 00001.jpg
              │   ├── 00002.jpg
              │   ├── ...
              │   └── ...
              ├── config.json
              ├── depth
              │   ├── 00000.png
              │   ├── 00001.png
              │   ├── 00002.png
              │   ├── ...
              │   └── ...
              └── intrinsic.json
    
    
  2. Prepare a .yaml file based on configs/Own/sample.yaml. Change the camera intrinsics in the config file based on intrinsic.json (see the intrinsics-reading sketch after this list). You can also get the intrinsics of the depth camera via other tools such as MATLAB.

  3. Specify the bound of the scene. If no ground-truth camera pose is given, we construct the world coordinate frame on the first frame. The X-axis points from left to right, the Y-axis from down to up, and the Z-axis from front to back.

  4. Change the input_folder path and/or the output path in the config file or the command line.

  5. Run NICE-SLAM.

python -W ignore run.py configs/Own/sample.yaml
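For the intrinsics in step 2, here is a minimal sketch for reading intrinsic.json. It assumes the Open3D pinhole-intrinsic JSON layout (width, height, and a column-major intrinsic_matrix); the key names may differ depending on how your file was produced, so check it first.

import json

# Sketch only: assumes Open3D's PinholeCameraIntrinsic JSON layout with a
# column-major 3x3 "intrinsic_matrix" plus "width" and "height" fields.
with open("Datasets/Own/scene0/intrinsic.json") as f:
    meta = json.load(f)

m = meta["intrinsic_matrix"]  # column-major: [fx, 0, 0, 0, fy, 0, cx, cy, 1]
fx, fy, cx, cy = m[0], m[4], m[6], m[7]
print("H:", meta["height"], "W:", meta["width"])
print("fx:", fx, "fy:", fy, "cx:", cx, "cy:", cy)  # copy these into the cam section of the yaml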

(Optional but highly recommended) If you don't want to specify the bound of the scene or manually change the config file, you can first run the Redwood tool in Open3D and then run NICE-SLAM. Here we provide the steps for the whole pipeline, beginning from recording Azure Kinect videos. (Ubuntu 18.04 or above is recommended.)

  1. Download the Open3D repository.
bash scripts/download_open3d.sh
  2. Record and extract frames.
# specify scene ID
sceneid=0
cd 3rdparty/Open3D-0.13.0/examples/python/reconstruction_system/
# record and save to .mkv file
python sensors/azure_kinect_recorder.py --align_depth_to_color --output scene$sceneid.mkv
# extract frames
python sensors/azure_kinect_mkv_reader.py --input scene$sceneid.mkv --output dataset/scene$sceneid
  3. Run reconstruction.
python run_system.py dataset/scene$sceneid/config.json --make --register --refine --integrate
# back to main folder
cd ../../../../../
  4. Prepare the config file.
python src/tools/prep_own_data.py --scene_folder 3rdparty/Open3D-0.13.0/examples/python/reconstruction_system/dataset/scene$sceneid --ouput_config configs/Own/scene$sceneid.yaml
  5. Run NICE-SLAM.
python -W ignore run.py configs/Own/scene$sceneid.yaml

iMAP*

We also provide our re-implementation of iMAP (iMAP*) for use. If you use the code, please cite both the original iMAP paper and NICE-SLAM.

Usage

iMAP* shares the majority of its code with NICE-SLAM. To run iMAP*, simply use the corresponding *_imap.yaml config file and add the argument --imap on the command line. For example, to run iMAP* on Replica room0:

python -W ignore run.py configs/Replica/room0_imap.yaml --imap 

To use our interactive visualizer:

python visualizer.py configs/Replica/room0_imap.yaml --imap 

To evaluate ATE:

python src/tools/eval_ate.py configs/Replica/room0_imap.yaml --imap 
[Differences between iMAP* and the original iMAP (click to expand)]

Keyframe pose optimization during mapping

We do not optimize the selected keyframes' poses for iMAP*, because optimizing them usually leads to worse performance. One possible reason is that their keyframes are selected globally, so many of them do not have overlapping regions, especially when the scene gets larger. Overlap is a prerequisite for bundle adjustment (BA). For NICE-SLAM, we only select overlapping keyframes within a small window (local BA), which works well in all scenes. You can still turn on keyframe pose optimization during mapping for iMAP* by enabling BA in the config file.

Active sampling

We disable active sampling in iMAP*, because in our experiments we observe that it does not improve performance while bringing additional computational overhead.

For image active sampling, in each iteration the original iMAP uniformly samples 200 pixels across the entire image. Next, they divide the image into an 8x8 grid and calculate a probability distribution from the rendering losses. This means that if the resolution of an image is 1200x680 (Replica), only around 3 pixels are sampled to compute the distribution for each 150x85 grid patch, which is not very different from simple uniform sampling. Therefore, during mapping we use the same pixel sampling strategy as NICE-SLAM for iMAP*: uniform sampling, but with 4x more pixels than reported in the iMAP paper.
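For intuition, here is a minimal sketch (not the released code) of the grid-based image active sampling described above; all sizes and names are illustrative:

import numpy as np

# Sketch of iMAP-style image active sampling (illustrative, not the released code):
# uniformly probe a few pixels, bin their rendering losses into an 8x8 grid,
# then allocate the remaining samples proportionally to the per-cell loss.
H, W, n_probe, n_active, grid = 680, 1200, 200, 800, 8
rng = np.random.default_rng(0)

u = rng.integers(0, W, n_probe)
v = rng.integers(0, H, n_probe)
loss = rng.random(n_probe)  # stand-in for per-pixel rendering loss

cell = (v * grid // H) * grid + (u * grid // W)  # which 8x8 cell each probe falls into
cell_loss = np.bincount(cell, weights=loss, minlength=grid * grid)
cell_cnt = np.maximum(np.bincount(cell, minlength=grid * grid), 1)
prob = cell_loss / cell_cnt
prob /= prob.sum()

# With only ~3 probe pixels per 150x85 cell, `prob` is very noisy, which is why
# plain uniform sampling performs about the same in practice.
chosen = rng.choice(grid * grid, size=n_active, p=prob)
au = (chosen % grid) * (W // grid) + rng.integers(0, W // grid, n_active)
av = (chosen // grid) * (H // grid) + rng.integers(0, H // grid, n_active)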

For keyframe active sampling, the original iMAP requires rendering depth and color images for all keyframes to get the loss distribution, which is expensive, and again we did not find it very helpful. Instead, as done in NICE-SLAM, iMAP* randomly samples keyframes from the keyframe list. We also let iMAP* optimize for 4x more iterations than NICE-SLAM, but its performance is still inferior.

Keyframe selection

For fair comparison, we use the same keyframe selection method in iMAP* as in NICE-SLAM: add one keyframe to the keyframe list every 50 frames.

Evaluation

Average Trajectory Error

To evaluate the average trajectory error, run the command below with the corresponding config file:

python src/tools/eval_ate.py configs/Replica/room0.yaml

Reconstruction Error

To evaluate the reconstruction error, first download the ground-truth Replica meshes, where unseen regions have been culled.

bash scripts/download_cull_replica_mesh.sh

Then run the command below (the same for NICE-SLAM and iMAP*). The 2D metric requires rendering 1000 depth images, which takes some time (~9 minutes). Use -2d to enable the 2D metric and -3d to enable the 3D metric.

# assign any output_folder and gt mesh you like, here is just an example
OUTPUT_FOLDER=output/Replica/room0
GT_MESH=cull_replica_mesh/room0.ply
python src/tools/eval_recon.py --rec_mesh $OUTPUT_FOLDER/mesh/final_mesh_eval_rec.ply --gt_mesh $GT_MESH -2d -3d

We also provide code to cull the mesh given camera poses. Here we take culling the ground-truth mesh of Replica room0 as an example.

python src/tools/cull_mesh.py --input_mesh Datasets/Replica/room0_mesh.ply --traj Datasets/Replica/room0/traj.txt --output_mesh cull_replica_mesh/room0.ply
[For iMAP* evaluation (click to expand)]

As discussed in many recent papers, e.g. UNISURF/VolSDF/NeuS, manually thresholding the volume density during marching cubes might be needed. Moreover, we find that there exist scaling differences, possibly for the reason discussed in NeuS. Therefore, ICP with scale is needed. You can use the ICP tool in CloudCompare with the default configuration and scaling enabled.
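If you prefer a scripted alternative to CloudCompare, Open3D's point-to-point ICP can also estimate a similarity transform (rotation, translation, and scale); a minimal sketch, with placeholder mesh paths:

import numpy as np
import open3d as o3d

# Minimal sketch: align the reconstructed mesh to the GT mesh with ICP that also
# solves for scale (with_scaling=True); the mesh paths below are placeholders.
rec = o3d.io.read_triangle_mesh("rec_mesh.ply").sample_points_uniformly(200000)
gt = o3d.io.read_triangle_mesh("gt_mesh.ply").sample_points_uniformly(200000)

result = o3d.pipelines.registration.registration_icp(
    rec, gt, max_correspondence_distance=0.1, init=np.eye(4),
    estimation_method=o3d.pipelines.registration.TransformationEstimationPointToPoint(
        with_scaling=True))
print(result.transformation)  # 4x4 similarity transform to apply to the reconstruction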

Acknowledgement

We adapted some code from awesome repositories including convolutional_occupancy_networks, nerf-pytorch, lietorch, and DIST-Renderer. Thanks for making the code publicly available. We also thank Edgar Sucar for allowing us to make the Replica Dataset available.

Citation

If you find our code or paper useful, please cite

@inproceedings{Zhu2022CVPR,
  author    = {Zhu, Zihan and Peng, Songyou and Larsson, Viktor and Xu, Weiwei and Bao, Hujun and Cui, Zhaopeng and Oswald, Martin R. and Pollefeys, Marc},
  title     = {NICE-SLAM: Neural Implicit Scalable Encoding for SLAM},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2022}
}

Contact

Contact Zihan Zhu and Songyou Peng for questions, comments and reporting bugs.


nice-slam's Issues

marching_cubes error. Possibly no surface extracted from the level set.

Hi, thanks for your great work!

When testing the demo data, I met the marching_cubes error, and nothing was saved in output/Demo/mesh. I'd like to know why I met this problem. Looking forward to your reply.

By the way, when I test the Demo scan, it takes more than 1 hour on a single 2080Ti GPU rather than the several minutes mentioned in the README. Is this runtime normal?

about camera pose

hi, can you explain why the camera pose[:3, 1:2] should be multiplied by -1?
e.g. in src/utils/datasets.py
c2w[:3, 1] *= -1.0
c2w[:3, 2] *= -1.0
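(Not an answer from the authors, but for context: negating the second and third columns of a camera-to-world matrix is the common way to switch between the OpenCV camera convention (x right, y down, z forward) and the OpenGL/NeRF convention (x right, y up, z backward). A minimal sketch of that flip:)

import numpy as np

# Sketch (an assumption about the motivation, not the authors' answer): flipping
# the sign of the y and z camera axes converts a camera-to-world pose between the
# OpenCV convention (x right, y down, z forward) and the OpenGL/NeRF convention
# (x right, y up, z backward).
def flip_yz(c2w):
    out = c2w.copy()
    out[:3, 1] *= -1.0  # y axis of the camera frame
    out[:3, 2] *= -1.0  # z axis of the camera frame
    return out

print(flip_yz(flip_yz(np.eye(4))))  # flipping twice restores the original pose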

No coarse decoder is used in the tracker

Hi, thank you for such excellent work.
The paper says that, thanks to the coarse feature grid, "This extrapolated geometry provides a meaningful signal for the tracking as the camera moves into previously unobserved areas. Making it more robust to sudden frame loss or fast camera movement."
But when I read the code, I find that in Tracker.py only the color-stage decoder is used to compute the loss and update the camera pose. No coarse grid or decoder is involved.
I would appreciate it if you could answer this question.

Images in mapping_vis show the model learned nothing

Hi @pengsongyou ,

Thank you for your great work! I've followed the instructions to download the apartment data with the command line
bash scripts/download_apartment.sh
then run NICE-SLAM with:
python -W ignore run.py configs/Apartment/apartment.yaml

after several iterations, I checked the output images in output/Apartment/mapping_vis/ and found something weird, please see the image below:

(attached image: 01700_0030)

It seems to me that the model learned nothing, any idea?

Looking forward to your reply!

Is there a need for large CUDA memory?

Hi, I tried to run NICE-SLAM on the TUM_RGBD freiburg1_desk sequence according to the readme file, but got a CUDA out-of-memory error.
RuntimeError: CUDA out of memory. Tried to allocate 86.00 MiB (GPU 0; 5.80 GiB total capacity; 686.18 MiB already allocated; 98.94 MiB free; 780.00 MiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
My device is a 3060 laptop GPU with 6 GB of memory
Driver Version: 470.63.01
CUDA Version: 11.4

[Announcements] Code release

If you wish to be notified of the code release, hit the subscribe button on the right of this issue. The code will be released before April 30, 2022.

Training nice-slam on rgb videos

Hi,

Thanks for providing the code of nice-slam. I am a complete newbie to SLAM methods, but from my perspective I think your method could be trained without using depth supervision. Have you ever tried that?

The reason I ask is that your method seems somehow related to NeRF, which can reconstruct quite accurate depth maps.

About projection.

In Mapper.py,
in get_mask_from_c2w:
cam_cord[:, 0] *= -1
why does the x of cam_cord need to be multiplied by -1?
Besides,
mask = mask & (0 <= -z[:, :, 0]) & (-z[:, :, 0] <= depths+0.5)
why do you use the -z[:, :, 0] rather than z[:, :, 0]?

in keyframe_selection_overlap:
mask = mask & (z[:, :, 0] < 0)
why (z[:, :, 0] < 0)?

Pose Initializing

Hi~
Congrats on your great work! I want to ask how to initialize the pose at the very first step, i.e., how can I get the initial pose for the first mapping step (since mapping runs ahead of tracking)?

Error: global /io/opencv/modules/core/src/parallel_impl.cpp (240) WorkerThread 30: Can't spawn new thread

Hi,

congrats on your excellent work and thanks a lot for releasing the code. I just successfully tested the program on all the datasets provided except the self-captured apartment scene, which throws the error message below:

(nice-slam) root@ubuntu:~/NeRF_SLAM/nice-slam# python -W ignore run.py configs/Apartment/apartment.yaml
INFO: The output folder is output/Apartment
INFO: The GT, generated and residual depth/color images can be found under output/Apartment/tracking_vis/ and output/Apartment/mapping_vis/
INFO: The mesh can be found under output/Apartment/mesh/
INFO: The checkpoint can be found under output/Apartment/ckpt/
[ERROR:[email protected]] global /io/opencv/modules/core/src/parallel_impl.cpp (240) WorkerThread 29: Can't spawn new thread: res = 11
[ERROR:[email protected]] global /io/opencv/modules/core/src/parallel_impl.cpp (240) WorkerThread 30: Can't spawn new thread: res = 11
cannot allocate memory for thread-local data: ABORT
[... the same "Can't spawn new thread: res = 11" error repeats for WorkerThread 30 through 59 ...]
terminate called after throwing an instance of 'std::bad_alloc'
[ERROR:[email protected]] global /io/opencv/modules/core/src/parallel_impl.cpp (240) WorkerThread 60: Can't spawn new thread: res = 11
terminate called recursively
what(): std::bad_alloc

I'm testing the program on a remote server with Ubuntu 20.04 and an NVIDIA A100 80GB PCIe GPU, so it's certainly not a memory-related issue.
Any hints or suggestions would be helpful!

About the pretraining of the decoders

Hi, I notice that the decoders are pretrained as part of ConvONet, as mentioned in the paper. Which dataset are the decoders pretrained on, and what are the training settings? Thanks.

IndexError: list index out of range

Hi, thank you for such excellent work.

When I try to run the demo, I get an error like this:

Process Process-3:
Traceback (most recent call last):
  File "/home/chenguangyan/anaconda3/envs/nice-slam/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/home/chenguangyan/anaconda3/envs/nice-slam/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/home/chenguangyan/dengyinan/Nice-SLAM/src/NICE_SLAM.py", line 285, in coarse_mapping
    self.coarse_mapper.run()
  File "/home/chenguangyan/dengyinan/Nice-SLAM/src/Mapper.py", line 544, in run
    idx, gt_color, gt_depth, gt_c2w = self.frame_reader[0]
  File "/home/chenguangyan/dengyinan/Nice-SLAM/src/utils/datasets.py", line 78, in __getitem__
    color_path = self.color_paths[index]
IndexError: list index out of range

I don't think I changed any code. If you have any suggestions, I would be very grateful.

failed to run demo on Windows

(torch_1_x) D:\work\code\3d_reconstruction_ai\nice-slam-master>python -W ignore run.py configs/Demo/demo.yaml
INFO: The output folder is output/Demo
INFO: The GT, generated and residual depth/color images can be found under output/Demo/vis/
INFO: The mesh can be found under output/Demo/mesh/
INFO: The checkpoint can be found under output/Demo/ckpt/
Tracking Frame 0: 0%|▎ | 1/500 [00:07<58:53, 7.08s/it]
Tracking Frame 1: 0%|▎ | 1/500 [00:07<58:53, 7.08s/it]
Process Process-3:
Traceback (most recent call last):
File "E:\ProgramData\Anaconda3\envs\torch_1_x\lib\multiprocessing\process.py", line 297, in _bootstrap
self.run()
File "E:\ProgramData\Anaconda3\envs\torch_1_x\lib\multiprocessing\process.py", line 99, in run
self._target(*self._args, **self._kwargs)
File "D:\work\code\3d_reconstruction_ai\nice-slam-master\src\NICE_SLAM.py", line 286, in coarse_mapping
self.coarse_mapper.run()
File "D:\work\code\3d_reconstruction_ai\nice-slam-master\src\Mapper.py", line 606, in run
gt_c2w, self.keyframe_dict, self.keyframe_list, cur_c2w=cur_c2w)
File "D:\work\code\3d_reconstruction_ai\nice-slam-master\src\Mapper.py", line 316, in optimize_map
mask_c2w, key, val.shape[2:], gt_depth_np)
File "D:\work\code\3d_reconstruction_ai\nice-slam-master\src\Mapper.py", line 118, in get_mask_from_c2w
w2c = np.linalg.inv(c2w)
File "<array_function internals>", line 6, in inv
File "E:\ProgramData\Anaconda3\envs\torch_1_x\lib\site-packages\numpy\linalg\linalg.py", line 545, in inv
ainv = _umath_linalg.inv(a, signature=signature, extobj=extobj)
File "E:\ProgramData\Anaconda3\envs\torch_1_x\lib\site-packages\numpy\linalg\linalg.py", line 88, in _raise_linalgerror_singular
raise LinAlgError("Singular matrix")
numpy.linalg.LinAlgError: Singular matrix
Process Process-2:
Traceback (most recent call last):
File "E:\ProgramData\Anaconda3\envs\torch_1_x\lib\multiprocessing\process.py", line 297, in _bootstrap
self.run()
File "E:\ProgramData\Anaconda3\envs\torch_1_x\lib\multiprocessing\process.py", line 99, in run
self._target(*self._args, **self._kwargs)
File "D:\work\code\3d_reconstruction_ai\nice-slam-master\src\NICE_SLAM.py", line 276, in mapping
self.mapper.run()
File "D:\work\code\3d_reconstruction_ai\nice-slam-master\src\Mapper.py", line 606, in run
gt_c2w, self.keyframe_dict, self.keyframe_list, cur_c2w=cur_c2w)
File "D:\work\code\3d_reconstruction_ai\nice-slam-master\src\Mapper.py", line 316, in optimize_map
mask_c2w, key, val.shape[2:], gt_depth_np)
File "D:\work\code\3d_reconstruction_ai\nice-slam-master\src\Mapper.py", line 118, in get_mask_from_c2w
w2c = np.linalg.inv(c2w)
File "<array_function internals>", line 6, in inv
File "E:\ProgramData\Anaconda3\envs\torch_1_x\lib\site-packages\numpy\linalg\linalg.py", line 545, in inv
ainv = _umath_linalg.inv(a, signature=signature, extobj=extobj)
File "E:\ProgramData\Anaconda3\envs\torch_1_x\lib\site-packages\numpy\linalg\linalg.py", line 88, in _raise_linalgerror_singular
raise LinAlgError("Singular matrix")
numpy.linalg.LinAlgError: Singular matrix
Tracking Frame 1: 0%|▋ | 2/500 [00:08<31:21, 3.78s/it]
Tracking Frame 2: 0%|▋ | 2/500 [00:08<31:21, 3.78s/it]
Tracking Frame 2: 0%|▋ | 2/500 [00:09<41:07, 4.95s/it]

Process Process-1:
Traceback (most recent call last):
File "E:\ProgramData\Anaconda3\envs\torch_1_x\lib\multiprocessing\process.py", line 297, in _bootstrap
self.run()
File "E:\ProgramData\Anaconda3\envs\torch_1_x\lib\multiprocessing\process.py", line 99, in run
self._target(*self._args, **self._kwargs)
File "D:\work\code\3d_reconstruction_ai\nice-slam-master\src\NICE_SLAM.py", line 266, in tracking
self.tracker.run()
File "D:\work\code\3d_reconstruction_ai\nice-slam-master\src\Tracker.py", line 195, in run
device).float().inverse()
RuntimeError: inverse_cuda: For batch 0: U(1,1) is zero, singular U.

Sharing pretrained weights on Hugging Face

Hello there!

First of all, thank you for open-sourcing your work! I saw that the weights are currently stored in /pretrained – would you be interested in sharing your model weights on the Hugging Face Hub?

The Hub makes it easy to freely download and upload models, and it can make models more accessible and visible to the rest of the ML community. It's a good way to share useful metadata and metrics, and we also support features like TensorBoard visualizations and PapersWithCode integrations. Since models are hosted as Git repos, they're also automatically versioned with a commit history and diffs. We could even help you set up an "ETH CV and Geometry Lab" organization (e.g. see the Facebook AI or Stanford NLP organizations).

We have a step-by-step guide that explains the process for uploading the model to the Hub, in case you're interested. We also have a library for programmatic access to models which includes features like caching for downloaded models.

Please let us know if you have any questions, and we'd be happy to guide you through the process!

Nima and the Hugging Face team

cc @osanseviero @lhoestq

environment error

Hi, thank you for such excellent work.

When I try to install the environment, it gives an error like ResolvePackageNotFound or "Found conflicts!".

How can I solve these problems, or are there other ways to install the environment?

Question on grids initialization

First I want to say thanks for your work!

It confuses me that during grid initialization the order of the X and Z axes is swapped, as the following line does:

coarse_val_shape[0], coarse_val_shape[2] = coarse_val_shape[2], coarse_val_shape[0]

Can you help explain why the axes need to be swapped here? Thanks in advance!

how to use 3070ti to run nice-slam

I ran NICE-SLAM on a short ScanNet sequence with 500 frames, following the readme.md, which notes that it takes a few minutes with ~5G GPU memory. Although I have 8 GB of GPU memory, I got the error

RuntimeError: CUDA error: out of memory

so how can I adapt the hyperparameters to fit my configuration?
I have seen other issues, where you note that your classmates have run it successfully on a 1080 Ti. I think my configuration should be enough, right?

out of memory using V100

RuntimeError: CUDA out of memory. Tried to allocate 30.00 MiB (GPU 0; 15.78 GiB total capacity; 1.22 GiB already allocated; 8.04 GiB free; 1.32 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Process Process-3:

How do you associate the coarse mapping with mapping and tracking?

I read your code, and I can't find the association between the coarse mapping and the other two threads. However, in your paper, the coarse mapping is said to be helpful to the tracking thread.
Also, in 3.3 Mapping, the loss = coarse geometric loss + fine geometric loss + photometric loss. However, I feel that your code only considers the coarse geometric loss in coarse mapping, and only the mid & fine geometric losses in mapping, so is formula (10) correct?

Finally, my biggest confusion is how you use the information from the coarse mapping in the tracking and mapping threads. In the tracking thread, although you update the parameters from mapping and update the feature grid, the tracking stage is "color", so I think tracking cannot use the coarse mapping information. I would appreciate it if you could tell me about these questions.

Naming issue

Dear author,

the name is nice-slam,

How should a follow-up of this work be named? Better-slam, or very nice-slam?

Render quality drops when gt_depth is not given

Thanks for the great work and for releasing the code.

Here I am testing the inference (rendering) quality of NICE-SLAM, i.e. loading pretrained decoders and feature grids from the provided checkpoints and then calling visualizer.vis with a camera pose as input. I noticed that gt_depth is an optional parameter of the method render_batch_ray. I obtained the following two results, one with gt_depth passed to render_batch_ray and one without, respectively.

(two attached rendering results)

The latter has a non-smooth depth prediction and thus a non-smooth color prediction as well. When gt_depth is None, a chunk of code in render_batch_ray at Renderer.py:112-150 is skipped, which I assume might have caused this (I don't have a good background in NeRF/rendering, so correct me if I am wrong). Are those results expected? If ground-truth depth is required at inference time to get good-quality predictions, then what's the point of predicting depth?
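For reference, a minimal sketch of the difference being described. This is an assumption about what the skipped block does, not the actual Renderer.py code: with a ground-truth depth, samples along a ray can be concentrated in a narrow band around the surface, while without it the whole near/far range must be covered.

import torch

# Sketch (assumption, not the actual Renderer.py code): depth-guided vs. uniform
# placement of samples along a ray.
def sample_depths(gt_depth=None, n_samples=32, near=0.0, far=6.0, band=0.3):
    t = torch.linspace(0.0, 1.0, n_samples)
    if gt_depth is not None:
        lo = (gt_depth - band).clamp(min=near)
        hi = (gt_depth + band).clamp(max=far)
        return lo + (hi - lo) * t  # samples packed around the observed surface
    return near + (far - near) * t  # plain coverage of the whole ray

print(sample_depths(torch.tensor(2.5))[:4])  # tight band around depth 2.5
print(sample_depths()[:4])                   # coarse samples over [near, far]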

Qt issue when test Apartment data

Hi:

I have successfully tested all the other data provided in this repo.
However, only when testing Apartment did I meet the following error:

Coarse Mapping Frame  50

This application failed to start because it could not find or load the Qt platform plugin "xcb"
in "/home/dlr/anaconda3/envs/nice-slam/lib/python3.7/site-packages/cv2/qt/plugins".

Available platform plugins are: eglfs, minimal, minimalegl, offscreen, vnc, xcb.

Reinstalling the application may fix this problem.
QStandardPaths: wrong ownership on runtime directory /usr/lib/, 0 instead of 1000
Process Process-3:
Traceback (most recent call last):
  File "/home/dlr/anaconda3/envs/nice-slam/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/home/dlr/anaconda3/envs/nice-slam/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/home/dlr/Project/nice-slam/src/NICE_SLAM.py", line 283, in coarse_mapping
    self.coarse_mapper.run()
  File "/home/dlr/Project/nice-slam/src/Mapper.py", line 606, in run
    gt_c2w, self.keyframe_dict, self.keyframe_list, cur_c2w=cur_c2w)
  File "/home/dlr/Project/nice-slam/src/Mapper.py", line 428, in optimize_map
    idx, joint_iter, cur_gt_depth, cur_gt_color, cur_c2w, self.c, self.decoders)
  File "/home/dlr/Project/nice-slam/src/utils/Visualizer.py", line 85, in vis
    gt_color_np = np.clip(gt_color_np, 0, 1)
  File "<__array_function__ internals>", line 6, in clip
  File "/home/dlr/anaconda3/envs/nice-slam/lib/python3.7/site-packages/numpy/core/fromnumeric.py", line 2115, in clip
    return _wrapfunc(a, 'clip', a_min, a_max, out=out, **kwargs)
  File "/home/dlr/anaconda3/envs/nice-slam/lib/python3.7/site-packages/numpy/core/fromnumeric.py", line 57, in _wrapfunc
    return bound(*args, **kwds)
  File "/home/dlr/anaconda3/envs/nice-slam/lib/python3.7/site-packages/numpy/core/_methods.py", line 160, in _clip
    um.clip, a, min, max, out=out, casting=casting, **kwargs)
  File "/home/dlr/anaconda3/envs/nice-slam/lib/python3.7/site-packages/numpy/core/_methods.py", line 113, in _clip_dep_invoke_with_casting
    return ufunc(*args, out=out, **kwargs)
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 21.1 MiB for an array with shape (720, 1280, 3) and data type float64

then the process just stops...

I have not changed the yaml in configs/, so this is strange.

Grid Normalization: potential bug at src.utils.Mesher

Hi,
Thanks for sharing this great work and code. I found that the code in Mesher.py may contain a potential bug, please check:

gt_depth = keyframe['depth'].to(device).reshape(1, 1, H, W)          
vgrid = uv.reshape(1, 1, -1, 2)
depth_sample = F.grid_sample(gt_depth, vgrid, padding_mode='zeros', align_corners=True)
depth_sample = depth_sample.reshape(-1)
max_depth = torch.max(depth_sample)

Here, vgrid, i.e., uv, contains the projected pixel coordinates, while F.grid_sample requires the coordinates to be normalized to [-1, 1].
Therefore, I think it should look like this:

gt_depth = keyframe['depth'].to(device).reshape(1, 1, H, W)          
vgrid = uv.reshape(1, 1, -1, 2)
# normalized to [-1, 1]
vgrid[..., 0] = (vgrid[..., 0] / (W-1) * 2.0 - 1.0)
vgrid[..., 1] = (vgrid[..., 1] / (H-1) * 2.0 - 1.0)
depth_sample = F.grid_sample(gt_depth, vgrid, padding_mode='zeros', align_corners=True)
depth_sample = depth_sample.reshape(-1)
max_depth = torch.max(depth_sample)

Please correct me if I'm wrong. Thanks.
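The claim about the [-1, 1] range is standard PyTorch behaviour and easy to verify in isolation; a small self-contained check (independent of this repo):

import torch
import torch.nn.functional as F

# Standard PyTorch behaviour: F.grid_sample expects sampling locations in [-1, 1];
# raw pixel coordinates with padding_mode='zeros' are treated as out of bounds.
H, W = 4, 6
img = torch.arange(H * W, dtype=torch.float32).reshape(1, 1, H, W)
uv = torch.tensor([[5.0, 3.0]])  # pixel coordinates (u = x = col, v = y = row)

raw = F.grid_sample(img, uv.reshape(1, 1, -1, 2),
                    padding_mode='zeros', align_corners=True)
norm = uv.clone()
norm[..., 0] = norm[..., 0] / (W - 1) * 2.0 - 1.0  # normalize u to [-1, 1]
norm[..., 1] = norm[..., 1] / (H - 1) * 2.0 - 1.0  # normalize v to [-1, 1]
fixed = F.grid_sample(img, norm.reshape(1, 1, -1, 2),
                      padding_mode='zeros', align_corners=True)
print(raw.item(), fixed.item())  # 0.0 (out of bounds) vs. 23.0 (row 3, col 5)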

Download url may not working

Hi @Zzh2000 @pengsongyou, cheers to your great work!

However, when I tried to download some of the demos/inferences, some of the download scripts returned

HTTP request sent, awaiting response... 403 Forbidden
And it's hard for foreign users to use hecaiyun.

Would you mind updating the URLs for files such as cull_replica_mesh.zip, Demo.zip, etc.?

Thanks

RuntimeError: cuda out of memory and thread_monitor resource temporarily unavailable in pthread_create

It reports 'cuda out of memory' when I run the demo at around frame 350 with an RTX 2080Ti; running with an A10 is OK.
It reports 'cuda out of memory' or 'thread_monitor resource temporarily unavailable in pthread_create' when I run the apartment at around frame 155 with the A10; meanwhile, one of the five python processes forks 65 threads at frame 155. The env config is right.
RuntimeError: CUDA out of memory. Tried to allocate 178.00 MiB (GPU 0; 22.20 GiB total capacity; 402.81 MiB already allocated; 12.79 GiB free; 468.00 MiB reserved in total by PyTorch)

numpy.core._exceptions._ArrayMemoryError: when test Apartment dataset

Hi:

Thanks for your nice work!!!

I have successfully tested the Replica dataset in this repo.
My GPU is an RTX 3090 on a remote Linux server.
However, only when testing Apartment did I meet the following error:

Tracking Frame 48

Re-rendering loss: 467.40->344.46 camera tensor error: 0.0018->0.0083

Tracking Frame 49

Re-rendering loss: 456.63->417.85 camera tensor error: 0.0016->0.0093

Tracking Frame 50

Process Process-1:
Traceback (most recent call last):
File "/home/dlr/anaconda3/envs/nice-slam/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
self.run()
File "/home/dlr/anaconda3/envs/nice-slam/lib/python3.7/multiprocessing/process.py", line 99, in run
self._target(*self._args, **self._kwargs)
File "/home/dlr/Project/nice-slam/src/NICE_SLAM.py", line 283, in coarse_mapping
self.coarse_mapper.run()
File "/home/dlr/Project/nice-slam/src/Mapper.py", line 606, in run
gt_c2w, self.keyframe_dict, self.keyframe_list, cur_c2w=cur_c2w)
File "/home/dlr/Project/nice-slam/src/Mapper.py", line 428, in optimize_map
idx, joint_iter, cur_gt_depth, cur_gt_color, cur_c2w, self.c, self.decoders)
File "/home/dlr/Project/nice-slam/src/utils/Visualizer.py", line 85, in vis
gt_color_np = np.clip(gt_color_np, 0, 1)
File "<array_function internals>", line 6, in clip
File "/home/dlr/anaconda3/envs/nice-slam/lib/python3.7/site-packages/numpy/core/fromnumeric.py", line 2115, in clip
return _wrapfunc(a, 'clip', a_min, a_max, out=out, **kwargs)
File "/home/dlr/anaconda3/envs/nice-slam/lib/python3.7/site-packages/numpy/core/fromnumeric.py", line 57, in _wrapfunc
return bound(*args, **kwds)
File "/home/dlr/anaconda3/envs/nice-slam/lib/python3.7/site-packages/numpy/core/_methods.py", line 160, in _clip
um.clip, a, min, max, out=out, casting=casting, **kwargs)
File "/home/dlr/anaconda3/envs/nice-slam/lib/python3.7/site-packages/numpy/core/_methods.py", line 113, in _clip_dep_invoke_with_casting
return ufunc(*args, out=out, **kwargs)
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 21.1 MiB for an array with shape (720, 1280, 3) and data type float64

Maybe it's the same problem as #12, but following it I didn't solve the error. I'm sorry to create another issue under your project.

I notice there are some "imshow" calls in Visualizer.py. I run NICE-SLAM on a remote Linux server over SSH. Could this cause the issue?

Thanks for your reply!

Debug / breakpoint

This isn't an issue.
Just asking how you debug (e.g. set a breakpoint) in this multiprocessing setup?

how to set the scene bound when running an outdoor scene?

Hi~

Thanks for sharing this amazing work!

I am testing on an outdoor scene, vkitti2, but I haven't figured out how to set the bound for mapping.

For example, from the ground-truth poses in the world frame (the first frame), I can get the boundary of the whole trajectory, e.g. [-137, 2], [-1, 10], [0, 214],

so how do I properly decide the bound and marching_cubes_bound parameters in order to get a reasonable mesh result?
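Not an official recipe, but one simple way to get a starting bound is to take the min/max of the ground-truth camera positions and pad them by a margin that roughly covers the visible depth range; a minimal sketch (pose format assumed to be 4x4 camera-to-world matrices):

import numpy as np

# Sketch (assumption, not an official recipe): derive a scene bound from the
# trajectory by padding the min/max camera positions with a depth-range margin.
def bound_from_poses(c2w_list, margin=15.0):
    centers = np.stack([c2w[:3, 3] for c2w in c2w_list])  # camera positions
    lo = centers.min(axis=0) - margin
    hi = centers.max(axis=0) + margin
    return np.stack([lo, hi], axis=1)  # 3x2 array: [min, max] per axis

poses = [np.eye(4) for _ in range(10)]  # placeholder trajectory
print(bound_from_poses(poses))  # values to put into `bound` / `marching_cubes_bound`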

Quick question about training time

Hello authors, thank you for your great work. I have a very quick question: how long does it take to train a model (for example, on the Apartment scene)?

I am slightly confused because I thought the method was real-time (e.g. on an RTX 3090), so I was expecting the model to run extremely quickly. However, the code takes many hours to run.

For context, I am using an RTX 8000 (Python 3.8, CUDA 11.1, PyTorch 1.10.0), which is slightly slower than an RTX 3090 but still quite fast. I am running the following command:

python -W ignore run.py configs/Apartment/apartment.yaml

I think I must be misunderstanding something about the method, or else something must be wrong with my setup?

Thanks for your assistance.

Regarding - Decoders

Hello @Zzh2000 @pengsongyou,
First and foremost, thanks a lot for your NICE project :)

I have a question regarding the decoders,

  1. Could you provide any further instructions/code for the point cloud encoder-decoder network that is trained
    for obtaining the pretrained decoder part?

  2. I can't find/figure out the part of the code where you fix the weights of the mid- & fine-level decoders, as given in the paper (see the attached fixed_weight screenshot).
    Any pointers will be helpful.

I am sorry if these questions are naive or if the answer already appears on the git but I could not find a clear solution.

Thank you very much.

Much higher mapping time during test time.

Hey Authors,
Firstly, thanks for the ease of use of your code. Running it was almost seamless on multiple machines/datasets!

However, after timing the code on the Apartment dataset, the mapping step seems to be considerably slower on my machine (100 ms vs 500 ms).
The CPU is an i9-10900X @ 3.70GHz and the GPU an RTX 3090. (All 10 cores / 20 threads seem to be used during training/inference.)
The paper claims to use a 3.80GHz Intel i7-10700K CPU, which only has 8 cores of an older generation.

I doubt the small difference in clock frequency can cause a five-fold difference. What do you think it could be?

inversion could not be completed because the input matrix is singular

In Tracker.py, at line 195, pre_c2w is divided by the estimated_c2w; however, estimated_c2w has zero norm and is thus not invertible. Does anyone know how to fix it?

I tested nice-slam on a 3080 Ti laptop GPU, and on both the Apartment dataset and the Replica room0 dataset the problem remains.

If I set "const_speed_assumption" to "False", there is no need to do the inversion and the above problem disappears, but the depth-residual and RGB-residual outputs are just the same as the input image, as shown below.

(attached residual images)

So why does this happen, and how can it be fixed?
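For context, a minimal sketch of a constant-velocity pose initialization. This is an assumption about what const_speed_assumption refers to, not the actual Tracker.py code, and it shows why an all-zero previous estimate breaks the required matrix inverse:

import numpy as np

# Sketch (assumption, not the actual Tracker.py code): constant-velocity
# initialization applies the last frame-to-frame motion to the last pose,
# which requires inverting the previous pose; a zero (or NaN) estimate there
# makes the inverse fail.
def const_speed_init(pre_pre_c2w, pre_c2w):
    delta = pre_c2w @ np.linalg.inv(pre_pre_c2w)  # last inter-frame motion
    return delta @ pre_c2w                        # extrapolate one step forward

prev2, prev1 = np.eye(4), np.eye(4)
prev1[:3, 3] = [0.1, 0.0, 0.0]                    # moved 0.1 along x last frame
print(const_speed_init(prev2, prev1)[:3, 3])      # -> [0.2, 0.0, 0.0]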

running on dataset that does not contain gt_camera_poses

Hi, thanks for publishing this code.
I want to use only color images and depth maps for scene reconstruction, and I have set the 'gt_camera' parameter in the config file to False. Unfortunately, it seems that the code has to read the pose information for some calculations. How do I solve this problem?

mesh has no attribute vertices

when I run the demo, it shows the problem as follows:

File "nice-slam-master/src/Mapper.py", line 646, in run
clean_mesh=self.clean_mesh, get_mask_use_all_frames=False)
File "nice-slam-master/src/utils/Mesher.py", line 509, in get_mesh
vertices = mesh.vertices
AttributeError: 'list' object has no attribute 'vertices'

Why is mesh treated as a list?

Matrix Inverse Error

Great work.

When I tried this code, I met the same error in several scenarios:

Process Process-1:   
Traceback (most recent call last):                                                                                                                                                                                  
   File "/home/zzl/miniconda3/envs/nice-slam/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap                                                                                               
      self.run()                                                                                                                                                                                                      
   File "/home/zzl/miniconda3/envs/nice-slam/lib/python3.7/multiprocessing/process.py", line 99, in run                                                                                                        
      self._target(*self._args, **self._kwargs)                                                                                                                                                                      
   File "/home/zzl/workspace/nice-slam/src/NICE_SLAM.py", line 263, in tracking                                                                                                                                
      self.tracker.run()                                                                                                                                                                                              
   File "/home/zzl/workspace/nice-slam/src/Tracker.py", line 195, in run                                                                                                                                       
     device).float().inverse()         
torch._C._LinAlgError: cusolver error: CUSOLVER_STATUS_EXECUTION_FAILED, when calling `cusolverDnSgetrf( handle, m, n, dA, ldda, static_cast<float*>(dataPtr.get()), ipiv, info)`. This error may appear if the input matrix contains NaN.

So far, this happens for imap in Apartment and nice-slam in TUM_RGBD/freiburg1 (nice-slam in Apartment and imap in TUM_RGBD/freiburg1 are OK). I haven't tried other scenarios yet.

This error usually happens in the first five frames of optimization and is reproducible. I haven't changed any code yet, so I wonder why a matrix that can't be inverted emerges.

How did you get the memory results in the paper?

In Reconstruction Results for the Replica Dataset table, I see the memory results for all methods.
How did you get the memory results? I cannot reproduce them from the project.
Could you tell me how I can compute the memory results?

Performance on outdoor scenes

Hi, thanks for publishing the code.

I work mainly with outdoor scenes and would like to know if you have tested nice-slam in outdoor environments before I run my own tests. Thanks a lot!

about MLP decoders.

In decoder.py, when initializing the fine-level decoder:
self.fine_decoder = MLP(name='fine', dim=dim, c_dim=c_dim*2, color=False, skips=[2], n_blocks=5, hidden_size=hidden_size, grid_len=fine_grid_len, concat_feature=True, pos_embedding_method=pos_embedding_method)
Why is the feature dimension, i.e. c_dim, set to c_dim*2 = 64 here?
The paper says: "For all MLP decoders, we use a hidden feature dimension of 32 and 5 fully-connected blocks."
According to the paper, shouldn't the feature dimension be 32 everywhere?

how can I run a real-time RGB-D camera

Hello,

Can we use an RGB-D camera (RealSense D435i) in real time? If so, how do we need to run it in real time?

Thank you in advance for your answer.

How did you get the FLOPs results?

How did you get the FLOPs results?
Especially for iMAP: it is not open source and the result is not shown in their paper.
So how did you get iMAP's FLOPs?
Also, can you tell me how you got the FLOPs result for your method? I can't get the result using the thop.profile() function.
