vita-group / lightgaussian

"LightGaussian: Unbounded 3D Gaussian Compression with 15x Reduction and 200+ FPS", Zhiwen Fan, Kevin Wang, Kairun Wen, Zehao Zhu, Dejia Xu, Zhangyang Wang

Home Page: https://lightgaussian.github.io/

License: Other

Python 92.26% Shell 4.53% C++ 0.34% Cuda 2.73% C 0.15%

3d-reconstruction efficient-inference gaussian-splatting

lightgaussian's Introduction

LightGaussian: Unbounded 3D Gaussian Compression with 15x Reduction and 200+ FPS

User Guidance

Gaussian Prune Ratio, Vector Quantization Ratio vs. FPS, SSIM

Mild Compression Ratio, with Minimum Accuracy Degradation

Setup

Local Setup

The codebase is based on gaussian-splatting.

The datasets used, MipNeRF360 and Tanks and Temples, are hosted by the paper authors here.

For installation:

conda env create --file environment.yml
conda activate lightgaussian

Note: we modified the "diff-gaussian-rasterization" submodule to compute the Global Significance Score.

Compress to Compact Representation

LightGaussian includes three ways to make the 3D Gaussians compact:

Option 1 Prune & Recovery

Users can directly prune a trained 3D-GS checkpoint using the following command (default setting):

bash scripts/run_prune_finetune.sh

Users can also train from scratch and jointly prune redundant Gaussians in training using the following command (different setting from the paper):

bash scripts/run_train_densify_prune.sh

Note: 3D-GS is trained for 20,000 iterations and then pruned. The resulting ply file is approximately 35% of the size of the original 3D-GS while keeping a comparable quality level.
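As a rough illustration of what pruning by a ratio means, the sketch below keeps only the Gaussians whose score lies above a chosen quantile. The function name prune_mask, the toy scores, and the plain quantile threshold are assumptions made for illustration; the repository's scripts obtain their scores from the modified rasterizer and may select Gaussians differently.

```python
import numpy as np

def prune_mask(scores, prune_percent):
    """Return a boolean mask that keeps entries scoring above the given quantile."""
    threshold = np.quantile(scores, prune_percent)
    return scores > threshold

# Toy per-Gaussian scores (hypothetical values, not produced by the repository).
scores = np.array([0.1, 0.9, 0.4, 0.7, 0.2, 0.05, 0.8, 0.3, 0.6, 0.5])
keep = prune_mask(scores, prune_percent=0.6)
print("kept", int(keep.sum()), "of", len(scores), "Gaussians")
```

In a real pipeline the mask would be applied to every per-Gaussian attribute (positions, scales, rotations, opacities, SH coefficients) before fine-tuning.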

Option 2 SH distillation

Users can distill a 3D-GS checkpoint using the following command (default setting):

bash scripts/run_distill_finetune.sh

Option 3 VecTree Quantization

Users can quantize a pruned and distilled 3D-GS checkpoint using the following command (default setting):

bash scripts/run_vectree_quantize.sh
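To make the idea of codebook-based quantization concrete, here is a minimal NumPy sketch: rows with low importance are replaced by their nearest codebook entry, and the remaining rows stay at full precision. Everything here (function names, the plain k-means update, the tiny codebook, the random features) is an assumption for illustration and is not taken from the repository's vectree code.

```python
import numpy as np

def nearest_code(data, codebook):
    """Index of the closest codebook row for each data row (squared L2 distance)."""
    dists = ((data[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return dists.argmin(axis=1)

def vector_quantize(feats, importance, vq_ratio=0.6, codebook_size=8, iters=10, seed=0):
    """Toy VQ: quantize the least important rows, keep the rest untouched."""
    rng = np.random.default_rng(seed)
    n = feats.shape[0]
    # Assumption: the rows with the lowest importance are the ones quantized.
    vq_rows = np.argsort(importance)[: int(n * vq_ratio)]
    vq_mask = np.zeros(n, dtype=bool)
    vq_mask[vq_rows] = True
    data = feats[vq_mask]
    # Initialise the codebook from random rows, then refine with plain k-means.
    codebook = data[rng.choice(len(data), size=codebook_size, replace=False)].copy()
    for _ in range(iters):
        codes = nearest_code(data, codebook)
        for k in range(codebook_size):
            members = data[codes == k]
            if len(members) > 0:
                codebook[k] = members.mean(axis=0)
    return vq_mask, codebook, nearest_code(data, codebook)

def dequantize(feats, vq_mask, codebook, codes):
    """Rebuild a full feature matrix from the codebook and the untouched rows."""
    out = feats.copy()
    out[vq_mask] = codebook[codes]
    return out

rng = np.random.default_rng(0)
feats = rng.normal(size=(100, 6)).astype(np.float32)  # stand-in for SH features
importance = rng.random(100)                          # stand-in importance scores
vq_mask, codebook, codes = vector_quantize(feats, importance)
restored = dequantize(feats, vq_mask, codebook, codes)
```

Storage then consists of the codebook, one integer index per quantized row, and the full-precision rows that were left out of quantization.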

Render

Render along a camera trajectory. The default is an ellipse; you can switch to a spiral or another trajectory by calling the corresponding function instead.

python render_video.py --source_path PATH/TO/DATASET --model_path PATH/TO/MODEL --skip_train --skip_test --video

To render after the VecTree Quantization stage, use:

python render_video.py --load_vq

Example

An example checkpoint for the room scene can be downloaded here. It mainly includes the following parts:

point_cloud.ply: Pruned, distilled and quantized 3D-GS checkpoint.
extreme_saving: Relevant files obtained after VecTree quantization.
imp_score.npz: Global significance used in VecTree quantization.

TODO List

Upload module 1: Prune & recovery
Upload module 2: SH distillation
Upload module 3: Vectree Quantization
Upload docker image

Acknowledgements

We would like to express our gratitude to Yueyu Hu from NYU for the invaluable discussion on our project.

BibTeX

If you find our work useful for your project, please consider citing the following paper.

@misc{fan2023lightgaussian, 
title={LightGaussian: Unbounded 3D Gaussian Compression with 15x Reduction and 200+ FPS}, 
author={Zhiwen Fan and Kevin Wang and Kairun Wen and Zehao Zhu and Dejia Xu and Zhangyang Wang}, 
year={2023},
eprint={2311.17245},
archivePrefix={arXiv},
primaryClass={cs.CV} }

lightgaussian's Issues

Gaussian co-adaptation

Congrats on this great paper, many interesting ideas.

Could you please elaborate on what you mean by Gaussian co-adaptation? The paper only says that it is essential and that it involves a joint adjustment of the Gaussians' attributes. What exactly do you do?

multithread add operation in kernel 'renderCUDA_count'

Hi guys, I am wondering whether the following accumulation in the renderCUDA_count kernel should use an atomic operation:

As far as I know, collected_id[j] may take identical values in different parallel threads. Thanks.
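The concern can be illustrated with a NumPy analogy (this is not the CUDA kernel): when the same index appears several times, a buffered "+=" applies only one update per index, while an unbuffered accumulation counts every occurrence. The array names below are made up for the example.

```python
import numpy as np

# Four "threads" write to two Gaussians; Gaussian 0 is hit three times.
ids = np.array([0, 0, 1, 0])

buffered = np.zeros(2)
buffered[ids] += 1.0              # duplicate indices collapse into a single write

accumulated = np.zeros(2)
np.add.at(accumulated, ids, 1.0)  # unbuffered: every occurrence is added

print(buffered, accumulated)
```

In CUDA, the usual way to get the second behaviour when several threads may target the same address is atomicAdd; whether the kernel in question needs it depends on how its threads map to Gaussian ids.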

Request an example

Hi there, a newbie here.

I am so impressed with your excellent work.

I was wondering if I could get an example model compressed with your method, so I can test it in Unity and see what FPS I get with a Unity runtime renderer. That would be appreciated.

example ckpt for room scene

Hello, I would like to inquire whether the example checkpoint you provided for the room scene has undergone pruning, SH distillation, and Vectree Quantization processes.
I noticed that the point cloud size in the file is 77MB, but Table 2 in the paper indicates that the model size after Vectree Quantization is 20MB.
Could you kindly clarify if there might be a misunderstanding on my part?
Thank you for your assistance.

nvrtc: error: invalid value for --gpu-architecture (-arch)

When I try to train on my dataset with train_densify_prune.py, I encounter this error when processing reaches 60%.
My machine runs Ubuntu 20.04 with an RTX 4090 GPU.
The command I executed is: python train_densify_prune.py -s /root/autodl-tmp/script/test_light --prune_percent 0.6 --prune_decay 0.6 --prune_iterations 20000 --v_pow 0.1 --eval

I saw a similar issue, but I think they are different.
Could you help me?

Pass existing gaussian splatting checkpoint into prune script

Hello, how do we provide the argument to pass our existing gaussian splatting checkpoint in ply to the pruning script?

How to get the Global Significant Score?

I want to modify diff-gaussian-rasterization. How do I get the Global Significant Score?

How to calculate metrics for the quantized model?

Thank you for your work.

After quantization, the model folder contains only one folder named extreme_saving and a point_cloud.ply file. I tried to use this folder directly with the command

python render_video.py --load_vq

But it gives an error saying it can't find the cfg_args folder.

How can I resolve this problem?

new

imp_score.npz ?

Where is this file saved?

Question about importance score calculation

Hi, thanks for sharing your great work. I have a question regarding the calculation of the Gaussian importance score. In Eq. 4 of your paper, what is sigma_j? From your code, it appears to be the opacity of the Gaussian:

gaussian_count[collected_id[j]]++;
important_score[collected_id[j]] += con_o.w; //opacity

May I ask why you used only the opacity instead of alpha (opacity * probability from the splatted Gaussian), since the latter is the Gaussian's real contribution to a specific pixel?
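A small numeric sketch of the two quantities being compared, for a single pixel and two Gaussians with equal opacity. It assumes the usual 3D-GS definition alpha = opacity * exp(-0.5 * d^T Sigma^-1 d) and ignores transmittance; the variable names and values are made up for the example.

```python
import numpy as np

def falloff(pixel, mean, cov_inv):
    """2D Gaussian falloff exp(-0.5 * d^T Sigma^-1 d) evaluated at a pixel."""
    d = pixel - mean
    return float(np.exp(-0.5 * d @ cov_inv @ d))

pixel = np.array([0.0, 0.0])
means = np.array([[0.0, 0.0],   # Gaussian centred on the pixel
                  [2.0, 0.0]])  # Gaussian two units away
cov_inv = np.eye(2)             # unit covariance for both, for simplicity
opacities = np.array([0.5, 0.5])

# Per-hit score if only the opacity is accumulated (as in the snippet above).
opacity_score = opacities.copy()

# Per-hit score if alpha is accumulated instead.
alpha_score = np.array(
    [o * falloff(pixel, m, cov_inv) for o, m in zip(opacities, means)]
)

print(opacity_score, alpha_score)
```

With opacity only, both Gaussians receive the same credit for this pixel; with alpha, the distant one receives much less.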

BTW, there might be a mistake in the paper at the beginning of page 4:

Here, ci, αi represents the color and opacity of this point computed by a Gaussian with covariance Σ multiplied by an optimizable per-point opacity and SH color coefficients.

This sentence sounds odd because αi is not the opacity but the weight in alpha blending, and it is not related to SH.

Regarding the size of the shared room example

Dear Authors,

First, I would like to thank you for the amazing work that you have done and the impressive ideas that you proposed to compress the 3D-GS model while maintaining reasonably good quality.

I was exploring the point_cloud.ply file that you shared here for the room model. It seems that the file size is about 77 MB. While in the current version (v4) of your paper, in Table 2, it is mentioned that the same model has a size of about 20 MB.

Similar information is also mentioned in Table 4:

My understanding from the tables is that the model you shared is only optimized via Pruning & Recovery and SH Distillation, but is not VecTree quantized. Could you please verify whether that is correct? If so, could you please share the model to which all three optimization steps were applied?

Error about quantizing 3DGS checkpoint

Hi, I get an error when I run vectree.py.

================== Print Info ==================
Input_feats_shape: torch.Size([1554770, 62])
VQ_feats_shape: torch.Size([1554770, 27])
SH_degree: 2
Quantization_ratio: 0.6
Add_important_score: True
Codebook_size: 8192
================================================
IS_percent: tensor(0.7985)
100%|██████████| 1000/1000 [01:00<00:00, 16.41it/s]

=============== Start vector quantize ===============
100%|██████████| 190/190 [00:01<00:00, 186.61it/s]
updating: ../vectree/output/bicycle/extreme_saving/ (stored 0%)
updating: ../vectree/output/bicycle/extreme_saving/metadata.npz (deflated 12%)
updating: ../vectree/output/bicycle/extreme_saving/non_vq_feats.npz (deflated 0%)
updating: ../vectree/output/bicycle/extreme_saving/xyz.npz (deflated 0%)
updating: ../vectree/output/bicycle/extreme_saving/non_vq_mask.npz (deflated 0%)
updating: ../vectree/output/bicycle/extreme_saving/other_attribute.npz (deflated 0%)
updating: ../vectree/output/bicycle/extreme_saving/codebook.npz (deflated 0%)
updating: ../vectree/output/bicycle/extreme_saving/vq_indexs.npz (deflated 0%)
Size = 70.69165706634521 MB

==================== Load saved data & Dequantize ====================
Traceback (most recent call last):
  File "/home/zxq/MachineLearning/SLAM/3DGS/LightGaussian/vectree/vectree.py", line 224, in <module>
    vq.dequantize()
  File "/home/zxq/MachineLearning/SLAM/3DGS/LightGaussian/vectree/vectree.py", line 215, in dequantize
    write_ply_data(dequantized_feats.cpu().numpy(), self.ply_path, self.sh_dim)
  File "/home/zxq/MachineLearning/SLAM/3DGS/LightGaussian/vectree/utils.py", line 101, in write_ply_data
    elements[:] = list(map(tuple, feats))
ValueError: could not assign tuple of length 62 to structure with 41 fields.

I found that feats has shape [1554770, 62] while each record of elements has only 41 fields, which caused the error. The dtype of elements is defined by dtype_full:

dtype_full = [(attribute, 'f4') for attribute in construct_list_of_attributes()]

So, I'd like to know why the dimensions of dtype_full and feats are inconsistent.
Thanks in advance!
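One possible reading of the two numbers, assuming the standard 3D-GS PLY layout (x, y, z, normals, f_dc, f_rest, opacity, scale, rotation): 62 fields correspond to SH degree 3 and 41 fields to SH degree 2. That would be consistent with a PLY saved at degree 3 being processed with SH_degree: 2, but this is an inference from the counts, not a confirmed diagnosis. The helper name below is hypothetical.

```python
def count_ply_attributes(sh_degree):
    """Number of float fields per Gaussian in the (assumed) standard 3D-GS PLY layout."""
    xyz, normals, f_dc = 3, 3, 3
    # Higher-order SH coefficients: ((degree + 1)^2 - 1) per colour channel.
    f_rest = 3 * ((sh_degree + 1) ** 2 - 1)
    opacity, scale, rotation = 1, 3, 4
    return xyz + normals + f_dc + f_rest + opacity + scale + rotation

for degree in range(4):
    print(degree, count_ply_attributes(degree))
```

Checking that the SH degree passed to the quantization script matches the degree stored in the input PLY would be a reasonable first step.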

about Octree-based Compression

Dear Authors,

I'm curious to know whether the Octree-based Compression G-PCC mentioned in the paper is included in this GitHub repository?

If it is, could you kindly direct me to the specific path where I can locate it?

If not, would it be possible for you to provide it in the future?
Alternatively, could you offer some guidance on how to access or utilize it?

Thank you very much for your assistance.

20,000 iterations and then prune?

In README:
note: 3D-GS is trained for 20,000 iterations and then prune it.

Where can I change 20,000 to a smaller number, like 12,000?

Instructions for pruning pre-trained "checkpoint" model

Hi thanks for the great code.

Please could you provide clear instructions for pruning a pre-trained INRIA "checkpoint" model .ply

First off, checkpoints and trained models seem to be referenced differently in LightGaussian compared to the INRIA code.
Pre-trained INRIA models .ply are being referred to as a checkpoint, this is not a checkpoint in the INRIA code.
INRIA checkpoints are saved with --checkpoint_iterations 1000 and produce chkpnt1000.pth

To run pruning on a pre-trained INRIA "checkpoint" .ply the instructions for LightGaussian say;

Users can directly prune a trained 3D-GS checkpoint using the following command (default setting):

bash scripts/run_prune_finetune.sh

Should I add the arguments;

-s path-to-model-folder/

(full path to trained model folder e.g. -s datasets/big which contains /point_cloud/iteration_30000/point_cloud.ply)

And -m path-to-output-folder/
e.g.
bash scripts/run_prune_finetune.sh -s datasets/big -m datasets/small ?

In the scripts run_prune_finetune.sh and run_prune__pt_finetune.sh they reference run args for datasets.
In run_prune__pt_finetune.sh a comment says:
# This is an example script to load from ply file.
So should I use this to point directly to .ply file?

bash scripts/run_prune_pt_finetune.sh -datasets/big --start_pointcloud datasets/big/point_cloud/iteration_30000/point_cloud.ply -m datasets/small

I also tried adding "big" as an argument to the script.

All these tests failed.

Any help is much appreciated!

When will SH distillation be open sourced?

Question about rendering

Hi, thanks for sharing your great work.
About rendering, I'd like to ask about the generalizability of your work.
According to code in render_video.py

If I use command 'python render_video.py' without --video, then the operation of rasterization is the same as that of 3DGS, right?
Is it correct that if I only do inference I can render it offline?
In other words, for inference, calculating importance scores is unnecessary, and there is no additional computation compared with 3DGS.

Hardware Requirements

Hi, I'm curious about the hardware requirements for the experiment.
I've been using a 4090 GPU with 128GB of VRAM to run the prune_finetune.py script on the Garden dataset.
However, I'm encountering out-of-memory errors when setting the iteration to 5000.
What sort of hardware specifications would be necessary to achieve training quality comparable to the standards outlined in the paper?
Thanks!

Getting 'tensor object is not callable'

I'm running train_densify_prune on a scene of my own. After about 15k iterations I get:

Traceback (most recent call last):
  File "/root/ma-ws23_24-berzan-yildiz-remote-computer-vision/LightGaussian/train_densify_prune.py", line 266, in <module>
    training(
  File "/root/ma-ws23_24-berzan-yildiz-remote-computer-vision/LightGaussian/train_densify_prune.py", line 202, in training
    v_list = calculate_v_imp_score(gaussians, imp_list, args.v_pow)
  File "/root/ma-ws23_24-berzan-yildiz-remote-computer-vision/LightGaussian/prune.py", line 120, in calculate_v_imp_score
    volume = torch.prod(gaussians.get_scaling(), dim=1)
TypeError: 'Tensor' object is not callable

I run it with python train_densify_prune.py -s dataset on a dataset converted with convert.py.

Any ideas what could be going wrong?
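This error message is what Python raises when a property is called like a method. The sketch below reproduces it with a toy class, assuming get_scaling is declared as a property (as it is in the upstream gaussian-splatting GaussianModel); ToyGaussians is a hypothetical stand-in, and a plain list is used instead of a tensor so the example runs without PyTorch.

```python
class ToyGaussians:
    """Minimal stand-in with a property named like the one in the traceback."""

    def __init__(self, scaling):
        self._scaling = scaling

    @property
    def get_scaling(self):
        return self._scaling

g = ToyGaussians([[1.0, 2.0, 3.0]])

# Accessing the property without parentheses returns the stored value.
values = g.get_scaling

# Adding parentheses tries to call the returned object, which fails.
try:
    g.get_scaling()
except TypeError as err:
    message = str(err)
    print(message)
```

If that assumption holds for the installed code, dropping the parentheses in the failing line (gaussians.get_scaling instead of gaussians.get_scaling()) would be the corresponding change.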

questions about interactive reviewer

hi, thanks for your great work, how to do interactive review with your compressed model? @Kevin-2017
Looking forward to your reply, thanks!

Errors that occur when training on the room dataset.

Great job! But I encountered a problem while retraining on the room dataset. Are you also using the dataset downloaded from the link you provided?
python train_densify_prune.py -s /home/wxs/3D_Recon/LightGaussian/datasets/room -m /home/wxs/3D_Recon/LightGaussian/output/room
Optimizing /home/wxs/3D_Recon/LightGaussian/output/room
Output folder: /home/wxs/3D_Recon/LightGaussian/output/room [18/01 10:09:28]
Tensorboard not available: not logging progress [18/01 10:09:28]
/home/wxs/3D_Recon/LightGaussian/datasets/room [18/01 10:09:28]
Reading camera 1/30
Traceback (most recent call last):
  File "/home/wxs/3D_Recon/LightGaussian/train_densify_prune.py", line 268, in <module>
    training(
  File "/home/wxs/3D_Recon/LightGaussian/train_densify_prune.py", line 57, in training
    scene = Scene(dataset, gaussians)
  File "/home/wxs/3D_Recon/LightGaussian/scene/__init__.py", line 55, in __init__
    scene_info = sceneLoadTypeCallbacks["Colmap"](
  File "/home/wxs/3D_Recon/LightGaussian/scene/dataset_readers.py", line 179, in readColmapSceneInfo
    cam_infos_unsorted = readColmapCameras(
  File "/home/wxs/3D_Recon/LightGaussian/scene/dataset_readers.py", line 106, in readColmapCameras
    assert (
AssertionError: Colmap camera model not handled: only undistorted datasets (PINHOLE or SIMPLE_PINHOLE cameras) supported!

Running with custom dataset

Hey, do you have pointers on how to run this with a custom dataset?

Experiment Settings For the FPS Measurement

Dear LightGaussian Authors,

I am interested in understanding the specifics of your FPS measurements.
Could you please specify the device used for these measurements?

Thank you!