Training-free Acceleration
for 3D Generation 🏎️💨

Introduction

This repository contains the official implementation for our paper

Hash3D: Training-free Acceleration for 3D Generation

🥯[Project Page] 📝[Paper] </>[code]

Xingyi Yang, Xinchao Wang

National University of Singapore

[Figure: Hash3D pipeline]

We present Hash3D, a universal solution to accelerate score distillation sampling (SDS) based 3D generation. By hashing and reusing feature maps across neighboring timesteps and camera angles, Hash3D avoids much of the redundant computation, thus accelerating the diffusion model's inference in 3D generation tasks.

What we offer:

  • ⭐ Compatible with any 3D generation method that uses SDS.
  • ⭐ In-place acceleration of 1.3X - 4X.
  • ⭐ Training-Free.

Results Visualizations

Image-to-3D Results

| Input Image | Zero-1-to-3 | Hash3D + Zero-1-to-3 (Speed X4.0) |
|---|---|---|
| baby_phoenix_on_ice (1) | phoenix_zero123.mp4 | phoenix_hash_zero123.mp4 |
| grootplant_rgba (1) | grootplant_zero123.mp4 | grootplant_hash_zero123.mp4 |

Text-to-3D Results

| Prompt | Gaussian-Dreamer | Hash3D + Gaussian-Dreamer (Speed X1.5) |
|---|---|---|
| A bear dressed as a lumberjack | a.bear.dressed.as.a.lumberjack.mp4 | a.bear.dressed.as.a.lumberjack_hash.mp4 |
| A train engine made out of clay | a.train.engine.made.out.of.clay.mp4 | a.train.engine.made.out.of.clay_hash.mp4 |

Project Structure

The repository is organized into three main directories, each applying Hash3D to a different codebase:

  1. threestudio-hash3d: Contains the implementation of Hash3D tailored for use with threestudio.
  2. dreamgaussian-hash3d: Integrates Hash3D with DreamGaussian for image-to-3D generation.
  3. gaussian-dreamer-hash3d: Applies Hash3D to GaussianDreamer for faster text-to-3D generation.

What We Add

The core implementation is in the guidance_loss for each SDS loss computation.

See hash3D/threestudio-hash3d/threestudio/models/guidance/zero123_unified_guidance_cache.py for an example. The hash table implementation is in hash3D/threestudio-hash3d/threestudio/utils/hash_table.py.
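
As a rough illustration of the idea described in the introduction, the sketch below caches a value under a key built from the quantized timestep and camera angles, so that nearby queries fall into the same grid cell and reuse the cached entry. It is a minimal sketch and not the code in hash_table.py: the class name `GridHashCache`, the bin sizes, and the string standing in for a feature map are all assumptions made for illustration.

```python
from typing import Any, Callable, Dict, Tuple

Key = Tuple[int, int, int]


class GridHashCache:
    """Hypothetical cache keyed by a quantized (timestep, elevation, azimuth)."""

    def __init__(self, t_bin: float = 50.0, angle_bin: float = 10.0) -> None:
        # Bin sizes are illustrative assumptions, not values from the paper.
        self.t_bin = t_bin
        self.angle_bin = angle_bin
        self.table: Dict[Key, Any] = {}
        self.hits = 0
        self.misses = 0

    def key(self, timestep: float, elevation: float, azimuth: float) -> Key:
        # Floor-divide each coordinate by its bin size so that neighboring
        # timesteps and camera angles map to the same grid cell.
        return (
            int(timestep // self.t_bin),
            int(elevation // self.angle_bin),
            int((azimuth % 360.0) // self.angle_bin),
        )

    def get_or_compute(
        self,
        timestep: float,
        elevation: float,
        azimuth: float,
        compute: Callable[[], Any],
    ) -> Any:
        k = self.key(timestep, elevation, azimuth)
        if k in self.table:
            # Reuse the cached value instead of recomputing it.
            self.hits += 1
            return self.table[k]
        self.misses += 1
        value = compute()
        self.table[k] = value
        return value


cache = GridHashCache()
calls = []


def expensive_features() -> str:
    # Stand-in for an expensive diffusion-model forward pass.
    calls.append(1)
    return "features"


cache.get_or_compute(510, 12.0, 95.0, expensive_features)   # first query: miss
cache.get_or_compute(530, 14.0, 98.0, expensive_features)   # same cell: hit
cache.get_or_compute(510, 12.0, 185.0, expensive_features)  # other azimuth cell: miss
print(cache.hits, cache.misses)
```

Coarser bins would raise the hit rate at the cost of reusing features from less similar views and timesteps; the values here only make the example easy to follow.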

Getting Started

Installation

Navigate to each of the specific directories for environment-specific installation instructions.

Usage

Refer to the README within each directory for detailed usage instructions tailored to each environment.

For example, to run Zero123 + SDS with Hash3D:

cd threestudio-hash3d
python launch.py --config configs/stable-zero123_hash3d.yaml --train --gpu 0 data.image_path=load/images/dog1_rgba.png

Evaluation

  1. Image-to-3D: GSO dataset GT meshes and renderings can be found online. Given the renderings of the reconstructed 3D objects in pred_dir and the ground-truth renderings in gt_dir, run
python eval_nvs.py --gt $gt_dir --pr $pred_dir
  2. Text-to-3D: Run all the prompts in assets/prompt.txt, then compute the CLIP score between the text and the rendered images with
python eval_clip_sim.py "$gt_prompt" $pred_dir --mode text
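
For the text-to-3D metric, a CLIP score is commonly computed as the cosine similarity between a text embedding and an image embedding, averaged over the rendered views. The sketch below shows only that arithmetic on toy vectors. It does not reproduce what eval_clip_sim.py does internally, and the function names and the 3-dimensional embeddings are hypothetical; a real run would obtain the embeddings from a CLIP model.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mean_clip_score(
    text_emb: Sequence[float], view_embs: Sequence[Sequence[float]]
) -> float:
    """Average text-to-image similarity over all rendered views."""
    return sum(cosine_similarity(text_emb, v) for v in view_embs) / len(view_embs)


# Toy embeddings (assumed): one view matches the text, one is orthogonal to it.
text = [1.0, 0.0, 0.0]
views = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
score = mean_clip_score(text, views)
print(score)
```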

Acknowledgement

We borrow part of the code from DeepCache for feature extraction from diffusion models. We also thank the authors of threestudio, DreamGaussian, and Gaussian-Dreamer for their implementations, and @FlorinShum and @Horseee for valuable discussions.

Citation

@misc{yang2024hash3d,
      title={Hash3D: Training-free Acceleration for 3D Generation}, 
      author={Xingyi Yang and Xinchao Wang},
      year={2024},
      eprint={2404.06091},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}


hash3d's Issues

KeyError: 'zero123-unified-guidance-cache'

I ran this command.

python launch.py --config configs/stable-zero123_hash3d.yaml --train --gpu 0 data.image_path=load/images/dog1_rgba.png

I got this error.

$ python launch.py --config configs/stable-zero123_hash3d.yaml --train --gpu 0 data.image_path=load/images/dog1_rgba.png
Global seed set to 0
find:  single-image-datamodule
find:  zero123-system
find:  implicit-volume
find:  diffuse-with-point-light-material
find:  solid-color-background
find:  nerf-volume-renderer
[INFO] ModelCheckpoint(save_last=True, save_top_k=-1, monitor=None) will duplicate the last checkpoint saved.
[INFO] GPU available: True (cuda), used: True
[INFO] TPU available: False, using: 0 TPU cores
[INFO] IPU available: False, using: 0 IPUs
[INFO] HPU available: False, using: 0 HPUs
[INFO] You are using a CUDA device ('NVIDIA GeForce RTX 4090') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
[INFO] single image dataset: load image load/images/dog1_rgba.png torch.Size([1, 128, 128, 3])
[INFO] single image dataset: load image load/images/dog1_rgba.png torch.Size([1, 128, 128, 3])
[INFO] LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
[INFO] 
  | Name       | Type                          | Params
-------------------------------------------------------------
0 | geometry   | ImplicitVolume                | 12.6 M
1 | material   | DiffuseWithPointLightMaterial | 0     
2 | background | SolidColorBackground          | 0     
3 | renderer   | NeRFVolumeRenderer            | 0     
-------------------------------------------------------------
12.6 M    Trainable params
0         Non-trainable params
12.6 M    Total params
50.450    Total estimated model params size (MB)
[INFO] Validation results will be saved to outputs/zero123-sai-hash3d/[64, 128, 256]_dog1_rgba.png@20240418-085259/save
find:  zero123-unified-guidance-cache
Traceback (most recent call last):
  File "/home/dreamer/threestudio/custom/hash3D/threestudio-hash3d/launch.py", line 301, in <module>
    main(args, extras)
  File "/home/dreamer/threestudio/custom/hash3D/threestudio-hash3d/launch.py", line 244, in main
    trainer.fit(system, datamodule=dm, ckpt_path=cfg.resume)
  File "/home/dreamer/.local/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 532, in fit
    call._call_and_handle_interrupt(
  File "/home/dreamer/.local/lib/python3.10/site-packages/pytorch_lightning/trainer/call.py", line 43, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/home/dreamer/.local/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 571, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "/home/dreamer/.local/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 961, in _run
    call._call_lightning_module_hook(self, "on_fit_start")
  File "/home/dreamer/.local/lib/python3.10/site-packages/pytorch_lightning/trainer/call.py", line 146, in _call_lightning_module_hook
    output = fn(*args, **kwargs)
  File "/home/dreamer/threestudio/custom/hash3D/threestudio-hash3d/threestudio/systems/zero123.py", line 40, in on_fit_start
    self.guidance = threestudio.find(self.cfg.guidance_type)(self.cfg.guidance)
  File "/home/dreamer/threestudio/custom/hash3D/threestudio-hash3d/threestudio/__init__.py", line 33, in find
    return __modules__[name]
KeyError: 'zero123-unified-guidance-cache'

What's the problem?
