Neuralangelo with directional finite difference (DFD)

This is a modified implementation of Neuralangelo: High-Fidelity Neural Surface Reconstruction that demonstrates the use of DFD in NeuS-like multi-view reconstruction. We accelerate training by using DFD and patch-based sampling to compute the gradients of the signed distance field (SDF) during forward rendering.

In short, we use the SDF samples on a patch of rays for both SDF gradient computation and volume rendering. This avoids the redundant SDF samples that Neuralangelo uses in forward rendering. This acceleration strategy is orthogonal to multi-resolution hash encoding, fast ray marching, and CUDA implementations. More details can be found in our paper SuperNormal: Neural Surface Reconstruction via Multi-View Normal Integration.
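
To illustrate the idea of a directional finite difference, here is a minimal NumPy sketch. It is not the code in this repository: the function name `dfd_gradient`, the step size `h`, and the toy plane SDF are hypothetical. Given SDF values at a point and at small offsets along three linearly independent directions (for example, along a ray and toward neighboring rays in a patch), the three directional derivatives form a linear system whose solution is the gradient.

```python
import numpy as np

def dfd_gradient(sdf, x, directions, h=1e-3):
    """Estimate the gradient of `sdf` at `x` from directional finite differences.

    directions: (3, 3) array whose rows are linearly independent unit vectors.
    """
    directions = np.asarray(directions, dtype=np.float64)
    f0 = sdf(x)
    # A forward difference along each direction d_i approximates d_i . grad f.
    deriv = np.array([(sdf(x + h * d) - f0) / h for d in directions])
    # Solve the 3x3 system D @ grad = deriv for the gradient.
    return np.linalg.solve(directions, deriv)

# Toy SDF: a plane with unit normal n, so the true gradient is n everywhere.
n = np.array([0.0, 0.6, 0.8])
plane_sdf = lambda p: float(p @ n) - 0.5

dirs = np.array([[1.0, 0.0, 0.0],
                 [0.6, 0.8, 0.0],
                 [0.0, 0.6, 0.8]])
grad = dfd_gradient(plane_sdf, np.array([0.1, 0.2, 0.3]), dirs)
```

In this toy setting the SDF is queried explicitly at the offset points. The point of the patch-based approach described above is that samples already taken for volume rendering can play that role.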

Usage

Requirements and usage are the same as for Neuralangelo. You can switch between the original pixel-based rendering and our patch-based rendering by setting model.render.render_mode in the configuration file projects/neuralangelo/configs/base.yaml.

Major code modifications
  • We add patch-based random sampling in projects/neuralangelo/data.py.

  • We add patch-based volume rendering in projects/neuralangelo/model.py.

  • We add DFD computation in projects/neuralangelo/utils/modules.py.
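
As a rough picture of what patch-based random sampling means, the sketch below draws randomly placed square patches of pixels instead of independent random pixels. It is an illustrative assumption, not the sampler in projects/neuralangelo/data.py: the function name `sample_patches` and all of its parameters are hypothetical.

```python
import numpy as np

def sample_patches(height, width, num_patches, patch_size, rng):
    """Return integer pixel coordinates of randomly placed square patches.

    Output shape: (num_patches, patch_size, patch_size, 2), ordered as (row, col).
    """
    # Random top-left corners chosen so that every patch lies inside the image.
    top = rng.integers(0, height - patch_size + 1, size=num_patches)
    left = rng.integers(0, width - patch_size + 1, size=num_patches)
    # Offsets of the pixels inside one patch.
    dy, dx = np.meshgrid(np.arange(patch_size), np.arange(patch_size), indexing="ij")
    rows = top[:, None, None] + dy
    cols = left[:, None, None] + dx
    return np.stack([rows, cols], axis=-1)

rng = np.random.default_rng(0)
pixels = sample_patches(height=480, width=640, num_patches=4, patch_size=3, rng=rng)
```

Because the pixels in each patch are adjacent, their rays are close together, which is what lets neighboring samples be reused for finite differences.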

Note: This is a quick and dirty implementation written during the review process. Please refrain from asking for further feature updates.

Note 2: Even with DFD acceleration, training is still slow because

  • Deep MLPs are used for the SDF and color networks, and
  • Coarse-to-fine ray sampling is used, which requires multiple feedforward passes of the SDF network in each iteration.

The original README from Neuralangelo follows.


This is the official implementation of Neuralangelo: High-Fidelity Neural Surface Reconstruction.

Zhaoshuo Li, Thomas Müller, Alex Evans, Russell H. Taylor, Mathias Unberath, Ming-Yu Liu, Chen-Hsuan Lin
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023

The code is built upon the Imaginaire library from the Deep Imagination Research Group at NVIDIA.
For business inquiries, please submit the NVIDIA research licensing form.

Installation

We offer two ways to set up the environment:

  1. We provide prebuilt Docker images, where

    • docker.io/chenhsuanlin/colmap:3.8 is for running COLMAP and the data preprocessing scripts. This includes the prebuilt COLMAP library (CUDA-supported).
    • docker.io/chenhsuanlin/neuralangelo:23.04-py3 is for running the main Neuralangelo pipeline.

    The corresponding Dockerfiles can be found in the docker directory.

  2. We provide a conda environment for Neuralangelo. Install the dependencies and activate the environment neuralangelo with

    conda env create --file neuralangelo.yaml
    conda activate neuralangelo

For COLMAP, alternative installation options are also available on the COLMAP website.


Data preparation

Please refer to Data Preparation for step-by-step instructions.
We assume known camera poses for each extracted frame from the video. The code uses the same json format as Instant NGP.
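
For orientation, the sketch below reads camera poses from an Instant NGP-style JSON file. It assumes only the fields that are standard in that convention (a `frames` list whose entries carry `file_path` and a 4x4 `transform_matrix`); the helper name `load_poses` and the example frame are made up, and intrinsics and any fields specific to this codebase are omitted. See Data Preparation for the exact fields expected here.

```python
import json
import numpy as np

def load_poses(json_text):
    """Parse an Instant NGP-style transforms file into file paths and 4x4 pose matrices."""
    meta = json.loads(json_text)
    paths = [frame["file_path"] for frame in meta["frames"]]
    poses = np.array([frame["transform_matrix"] for frame in meta["frames"]],
                     dtype=np.float64)
    return paths, poses

# Made-up single-frame example with an identity camera-to-world matrix.
example = json.dumps({
    "frames": [
        {"file_path": "images/000001.jpg",
         "transform_matrix": [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]}
    ]
})
paths, poses = load_poses(example)
```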


Run Neuralangelo!

EXPERIMENT=toy_example
GROUP=example_group
NAME=example_name
CONFIG=projects/neuralangelo/configs/custom/${EXPERIMENT}.yaml
GPUS=1  # use >1 for multi-GPU training!
torchrun --nproc_per_node=${GPUS} train.py \
    --logdir=logs/${GROUP}/${NAME} \
    --config=${CONFIG} \
    --show_pbar

Some useful notes:

  • This codebase supports logging with Weights & Biases. You should have a W&B account for this.
    • Add --wandb to the command line arguments to enable W&B logging.
    • Add --wandb_name to specify the W&B project name.
    • More detailed control can be found in the init_wandb() function in imaginaire/trainers/base.py.
  • Configs can be overridden through the command line (e.g. --optim.params.lr=1e-2).
  • Set --checkpoint={CHECKPOINT_PATH} to initialize with a certain checkpoint; set --resume to resume training.
  • If appearance embeddings are enabled, make sure data.num_images is set to the number of training images.
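
The dotted override syntax mentioned above maps a key path onto the nested configuration. The following sketch shows the idea on a plain dictionary. It is an illustration only, not the parser used by Imaginaire: the function name `apply_override` and the literal-parsing behavior are assumptions.

```python
import ast

def apply_override(config, argument):
    """Apply one '--a.b.c=value' style override to a nested dict, in place."""
    key, _, raw = argument.lstrip("-").partition("=")
    try:
        value = ast.literal_eval(raw)   # numbers, booleans, lists, ...
    except (ValueError, SyntaxError):
        value = raw                     # fall back to a plain string
    *parents, leaf = key.split(".")
    node = config
    for name in parents:
        # Walk down the key path, creating intermediate levels if needed.
        node = node.setdefault(name, {})
    node[leaf] = value
    return config

config = {"optim": {"params": {"lr": 1e-3}}}
apply_override(config, "--optim.params.lr=1e-2")
```

After the call, `config["optim"]["params"]["lr"]` holds the overridden learning rate while the rest of the configuration is untouched.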

Isosurface extraction

Use the following command to run isosurface mesh extraction:

CHECKPOINT=logs/${GROUP}/${NAME}/xxx.pt
OUTPUT_MESH=xxx.ply
CONFIG=logs/${GROUP}/${NAME}/config.yaml
RESOLUTION=2048
BLOCK_RES=128
GPUS=1  # use >1 for multi-GPU mesh extraction
torchrun --nproc_per_node=${GPUS} projects/neuralangelo/scripts/extract_mesh.py \
    --config=${CONFIG} \
    --checkpoint=${CHECKPOINT} \
    --output_file=${OUTPUT_MESH} \
    --resolution=${RESOLUTION} \
    --block_res=${BLOCK_RES}

Some useful notes:

  • Add --textured to extract meshes with textures.
  • Add --keep_lcc to remove noise by keeping only the largest connected component. This may also remove thin structures.
  • Lower BLOCK_RES to reduce GPU memory usage.
  • Lower RESOLUTION to reduce mesh size.
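
To get a feel for how RESOLUTION and BLOCK_RES interact, the sketch below does the arithmetic for a cubic grid that is processed block by block. This is a simplifying assumption for illustration (the actual grid follows the scene's bounding region), and the function name `extraction_grid_stats` is hypothetical.

```python
import math

def extraction_grid_stats(resolution, block_res):
    """Blocks per axis, total blocks, and SDF queries per block for a cubic grid."""
    blocks_per_axis = math.ceil(resolution / block_res)
    return blocks_per_axis, blocks_per_axis ** 3, block_res ** 3

per_axis, total_blocks, queries_per_block = extraction_grid_stats(2048, 128)
```

Under this assumption, halving BLOCK_RES cuts the number of SDF queries per block by a factor of eight at the cost of eight times as many blocks, while lowering RESOLUTION shrinks the whole grid and therefore the output mesh.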

Frequently asked questions (FAQ)

  1. Q: CUDA out of memory. How do I decrease the memory footprint?
    A: Neuralangelo requires at least 24GB GPU memory with our default configuration. If you run out of memory, consider adjusting the following hyperparameters under model.object.sdf.encoding.hashgrid (with suggested values):

    GPU VRAM    Hyperparameters
    8GB         dict_size=20, dim=4
    12GB        dict_size=21, dim=4
    16GB        dict_size=21, dim=8

    Please note that the above hyperparameter adjustment may sacrifice the reconstruction quality.

    If Neuralangelo runs fine during training but runs out of CUDA memory during evaluation, consider adjusting the evaluation parameters under data.val, including setting a smaller image_size (e.g., maximum resolution 200x200) and setting batch_size=1, subset=1.

  2. Q: The reconstruction of my custom dataset is bad. What can I do?
    A: It is worth looking into the following:

    • The camera poses recovered by COLMAP may be off. We have implemented tools (using Blender or Jupyter notebook) to inspect the COLMAP results.
    • The computed bounding regions may be off and/or too small/large. Please refer to data preprocessing on how to adjust the bounding regions manually.
    • The video capture sequence may contain significant motion blur or out-of-focus frames. A faster shutter speed (reducing motion blur) and a smaller aperture (increasing the focus range) are very helpful.
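
For the hash-grid hyperparameters suggested in the first FAQ answer, the following back-of-the-envelope sketch shows why lowering dict_size or dim reduces memory. It assumes that dict_size is the base-2 logarithm of the per-level table size (as with log2_hashmap_size in Instant NGP) and uses a placeholder of 16 levels; both are assumptions, and the function name `hashgrid_param_bound` is hypothetical. The result is only an upper bound on the number of feature parameters and ignores activations and optimizer state.

```python
def hashgrid_param_bound(dict_size, dim, num_levels=16):
    """Upper bound on the number of hash-grid feature parameters.

    Assumes each level stores at most 2**dict_size entries of `dim` features.
    """
    return num_levels * (2 ** dict_size) * dim

for vram, dict_size, dim in [("8GB", 20, 4), ("12GB", 21, 4), ("16GB", 21, 8)]:
    params = hashgrid_param_bound(dict_size, dim)
    print(f"{vram}: at most {params / 1e6:.0f}M hash-grid parameters")
```

Under these assumptions, each decrement of dict_size halves the bound, and halving dim halves it again.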

Citation

If you find our code useful for your research, please cite

@inproceedings{li2023neuralangelo,
  title={Neuralangelo: High-Fidelity Neural Surface Reconstruction},
  author={Li, Zhaoshuo and M\"uller, Thomas and Evans, Alex and Taylor, Russell H and Unberath, Mathias and Liu, Ming-Yu and Lin, Chen-Hsuan},
  booktitle={IEEE Conference on Computer Vision and Pattern Recognition ({CVPR})},
  year={2023}
}
