Giter Club home page Giter Club logo

sdfusion's Introduction

SDFusion: Multimodal 3D Shape Completion, Reconstruction, and Generation

[arXiv] [Project Page] [BibTex]

Code release for the CVPR 2023 paper "SDFusion: Multimodal 3D Shape Completion, Reconstruction, and Generation".

1-teaser-v3-out-1 SDFusion is a diffusion-based 3D shape generator. It enables various applications. (left) SDFusion can generate 3D shapes conditioned on different input modalities, including partial shapes, images, and text. SDFusion can even jointly handle multiple conditioning modalities while controlling the strength for each of them. (right) We showcase an application where we leverage pretrained 2D models to texture 3D shapes generated by SDFusion.

We also use a 3D-printer to print out the generated shapes of SDFusion.

3d_print.mp4

Installation

We recommend using conda to install the required python packages. You might need to change the cudatoolkit version to match with your GPU driver.

conda create -n sdfusion python=3.8 -y && conda activate sdfusion
conda install pytorch==1.9.0 torchvision==0.10.0 torchaudio==0.9.0 cudatoolkit=11.3 -c pytorch -c conda-forge -y
conda install -c fvcore -c iopath -c conda-forge fvcore iopath -y
conda install pytorch3d -c pytorch3d

pip install h5py joblib termcolor scipy einops tqdm matplotlib opencv-python PyMCubes imageio trimesh omegaconf tensorboard notebook

Usage

Download the pretrained weight

First create a foler to save the pre-trained weights. Here we assume the folder is ./saved_ckpt. Then download the pre-trained weights from the provided links and put them in the ./saved_ckpt folder.

mkdir saved_ckpt

# VQVAE's checkpoint
wget https://uofi.box.com/shared/static/zdb9pm9wmxaupzclc7m8gzluj20ja0b6.pth -O saved_ckpt/vqvae-snet-all.pth

# SDFusion
wget https://uofi.box.com/shared/static/ueo01ctnlzobp2dmvd8iexy1bdsquuc1.pth -O saved_ckpt/sdfusion-snet-all.pth

# SDFusion: single-view reconstruction (img2shape)
wget https://uofi.box.com/shared/static/01hnf7pbewft4115qkvv9zhh22v4d8ma.pth -O saved_ckpt/sdfusion-img2shape.pth

# SDFusion: text-guided shape generation (txt2shape)
wget https://uofi.box.com/shared/static/vyqs6aex3rwbgxweyl3qh21c8p6vu33f.pth -O saved_ckpt/sdfusion-txt2shape.pth

# SDFusion: multi-modal conditional shape generation (partial shape + [ img {and/or} txt] -> shape)
wget https://uofi.box.com/shared/static/d95l3465arc0ffley5vwmz8bscaubmhc.pth -O saved_ckpt/sdfusion-mm2shape.pth

Demo

Please check the provided jupyter notebooks for how to use the code. First open the jupyter notebook server.

jupyter notebook

Then, open one of the following notebooks for the task you want to perform.

  1. Unconditional generation and shape completion: demo_uncond_shape_comp.ipynb
  2. Single-view reconstruction (img2shape): demo_img2shape.ipynb
  3. Text-guided shape generation (txt2shape): demo_txt2shape.ipynb
  4. Multi-modal conditional shape generation (partial shape + [ img | txt ] ): demo_mm2shape.ipynb
  5. (coming soon!) Text-guided Texturization: demo_txt2tex.ipynb

Note that the notebooks will automatically save the generated shapes in the ./demo_results folder.

How to train the SDFusion

Preparing the data

  • First, depending on your OS, you might need to install the required packages/binaries via brew or apt-get for computing the SDF given a mesh. If you cannot run the preprocessing files, please ctrl+c & ctrl+v the error message and search it on Google (usually there will be a one-line solution), or open an issue on this repo. We will try to update the README with the reported issues and their solutions under the Issues and FAQ section.

  • ShapeNet

    1. Download the ShapeNetV1 dataset from the official website. Then, extract the downloaded file and put the extracted folder in the ./data folder. Here we assume the extracted folder is at ./data/ShapeNet/ShapeNetCore.v1.
    2. Run the following command for preprocessing the SDF from mesh.
mkdir -p data/ShapeNet && cd data/ShapeNet
wget [url for downloading ShapeNetV1]
unzip ShapeNetCore.v1.zip
./launchers/unzip_snet_zipfiles.sh # unzip the zip files
cd preprocess
./launchers/launch_create_sdf_shapenet.sh
  • BuildingNet
    1. Download the BuildingNet dataset from the official website. After you fill out the form, please download the v0 version of the dataset and uncompress it under ./data. Here we assume the extracted folder is ./data/BuildingNet_dataset_v0_1.
    2. Run the following command for preprocessing the SDF from mesh.
cd preprocess
./launchers/launch_create_sdf_building.sh
cd ../
wget http://pix3d.csail.mit.edu/data/pix3d.zip -P data
cd data
unzip pix3d.zip
cd ../
- Then, run the following command for preprocessing the SDF from mesh.
cd preprocess
./launchers/launch_create_sdf_pix3d.sh
cd ../
  • ShapeNetRendering
    • Run the following command for getting the rendering images, which is provided by the 3D-R2N2 paper.
wget ftp://cs.stanford.edu/cs/cvgl/ShapeNetRendering.tgz -P data/ShapeNet
cd data/ShapeNet && tar -xvf ShapeNetRendering.tgz
cd ../../
  • text2shape
    • Run the following command for setting up the text2shape dataset.
mkdir -p data/ShapeNet/text2shape
wget http://text2shape.stanford.edu/dataset/captions.tablechair.csv -P data/ShapeNet/text2shape
cd preprocess
./launchers/create_snet-text_splits.sh

Training

  1. Train VQVAE
# ShapeNet
./launchers/train_vqvae_snet.sh

# BuildingNet
./launchers/train_vqvae-bnet.sh

After training, copy the trained VQVAE checkpoint to the ./saved_ckpt folder. Let's say the name of the checkpoints are vqvae-snet-all.ckpt or vqvae-bnet-all.ckpt. This is necessary for training the Diffusion model. For SDFusion on various tasks, please see 2.~5. below.

  1. Train SDFusion on ShapeNet and BuildingNet
# ShapeNet
./launchers/train_sdfusion_snet.sh

# BuildingNet
./launchers/train_sdfusion_bnet.sh
  1. Train SDFusion for single-view reconstruction
./launchers/train_sdfusion_img2shape.sh
  1. Train SDFusion for text-guided shape generation
# text2shape
./launchers/train_sdfusion_txt2shape.sh
  1. Train SDFusion for multi-modality shape generation
./launchers/train_sdfusion_mm2shape.sh
  1. Train the text-guided texturization
coming soon!

Citation

If you find this code helpful, please consider citing:

  1. Conference version
@inproceedings{cheng2023sdfusion,
  author={Cheng, Yen-Chi and Lee, Hsin-Ying and Tuyakov, Sergey and Schwing, Alex and Gui, Liangyan},
  title={{SDFusion}: Multimodal 3D Shape Completion, Reconstruction, and Generation},
  booktitle={CVPR},
  year={2023},
}
  1. arxiv version
@article{cheng2022sdfusion,
  author = {Cheng, Yen-Chi and Lee, Hsin-Ying and Tuyakov, Sergey and Schwing, Alex and Gui, Liangyan},
  title = {{SDFusion}: Multimodal 3D Shape Completion, Reconstruction, and Generation},
  journal = {arXiv},
  year = {2022},
}

Issues and FAQ

Coming soon!

Acknowledgement

This code borrows heavely from LDM, AutoSDF, CycleGAN, stable dreamfusion, DISN. We thank the authors for their great work. The followings packages are required to compute the SDF: freeglut3, tbb.

This work is supported in part by NSF under Grants 2008387, 2045586, 2106825, MRI 1725729, and NIFA award 2020-67021-32799. Thanks to NVIDIA for providing a GPU for debugging.

sdfusion's People

Contributors

yccyenchicheng avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.