grm's People

Contributors

justimyhxu

grm's Issues

Real Cases

I compared almost all of the SOTA algorithms. GRM and TripoSR (LRM++) (and maybe One2345++) are better than the others. But there is still a long way to go ...

Wizard: [image]

DashaTaran: [image]

Game: [image]

SavannaBlade: [image]

Demo not working anymore

Hi, nice work! But the demo isn't working anymore.
I get this error: upstream connect error or disconnect/reset before headers. retried and the latest reset reason: connection termination

Pretrained model release

Thanks for the great work! Is there an estimated date for the release of the inference models used on Hugging Face?

Training data elevation

Hey,
Thanks for opening the weights!!

I am trying to run inference with 4 input images that I rendered myself. The renders use elevations other than 20°, such as 7°, 5°, 12°, and 25°, and the Gaussians I get are not 100% aligned! Do you think it will work with 4 images at 4 different elevations? All the examples from the diffusion models use roughly the same elevation of about 20°.

In the Experiments section you wrote:
"Following [46], we filter 100k high-quality objects, and render 32 images at random viewpoints with a fixed 50° field of view under ambient lighting."

So does "random viewpoints" include random elevation? Do you train with 4 different elevations in the same sample?

Thanks a lot.
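
For anyone experimenting with mixed elevations, a minimal pose-builder sketch (the helper and its conventions are my own assumptions; GRM's actual camera convention, up axis, and radius may differ, so verify against the repo's example poses before trusting renders from it):

import numpy as np

def c2w_from_spherical(elev_deg, azim_deg, radius=2.0):
    # Hypothetical helper: camera-to-world pose on a sphere around the origin,
    # looking at the origin, +y as world up, OpenGL-style camera (-z forward).
    elev, azim = np.deg2rad(elev_deg), np.deg2rad(azim_deg)
    cam_pos = radius * np.array([np.cos(elev) * np.sin(azim),
                                 np.sin(elev),
                                 np.cos(elev) * np.cos(azim)])
    forward = -cam_pos / np.linalg.norm(cam_pos)          # toward the origin
    right = np.cross(forward, np.array([0.0, 1.0, 0.0]))
    right /= np.linalg.norm(right)
    up = np.cross(right, forward)
    c2w = np.eye(4)
    c2w[:3, 0], c2w[:3, 1], c2w[:3, 2], c2w[:3, 3] = right, up, -forward, cam_pos
    return c2w

# four views at the elevations from the question, spread around the object
c2ws = np.stack([c2w_from_spherical(e, a)
                 for e, a in zip([7, 5, 12, 25], [0, 90, 180, 270])])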

Code of Ray generation in plucker_embedder

Hi, thanks for your great work!

I have a question about the ray generation here. Since fxfycxcy = [0.5/tan_half_fov, 0.5/tan_half_fov, 0.5, 0.5], shouldn't the code be like this?

x = (x+0.5) / w
y = (y+0.5) / h
x = (x - fxfycxcy[:, 2:3]) / fxfycxcy[:, 0:1]
y = (y - fxfycxcy[:, 3:4]) / fxfycxcy[:, 1:2]

instead of

x = x / w
y = y / h
x = (x + 0.5 - fxfycxcy[:, 2:3]) / fxfycxcy[:, 0:1]
y = (y + 0.5 - fxfycxcy[:, 3:4]) / fxfycxcy[:, 1:2]
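
For reference, here is a minimal self-contained sketch of the pixel-center version being proposed (the pixel_to_ray_dirs helper and its shapes are my own framing, not the repo's code). The +0.5 is a half-pixel offset in pixel units, so applying it after dividing by w/h, as the current code does, shifts every ray by a constant 0.5 - 0.5/w in normalized coordinates instead of sampling at pixel centers:

import torch
import torch.nn.functional as F

def pixel_to_ray_dirs(h, w, fxfycxcy):
    # fxfycxcy: (B, 4) normalized intrinsics [fx, fy, cx, cy],
    # e.g. [0.5/tan_half_fov, 0.5/tan_half_fov, 0.5, 0.5] as above
    y, x = torch.meshgrid(torch.arange(h), torch.arange(w), indexing='ij')
    x = x.reshape(1, -1).float()                   # (1, h*w)
    y = y.reshape(1, -1).float()
    x = (x + 0.5) / w                              # pixel centers, normalized to [0, 1]
    y = (y + 0.5) / h
    x = (x - fxfycxcy[:, 2:3]) / fxfycxcy[:, 0:1]  # unproject to the z = 1 plane
    y = (y - fxfycxcy[:, 3:4]) / fxfycxcy[:, 1:2]
    dirs = torch.stack([x, y, torch.ones_like(x)], dim=-1)  # (B, h*w, 3)
    return F.normalize(dirs, dim=-1)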

How many objects were used? How many views were rendered for every object?

algorithms: LRM, TripoSR, GRM, One2345++, MVDream, ...

datasets: Objaverse, OmniObject3D, ... (even 2D images and videos)

For example:
Objaverse: 10M objects
cameras: 100 views
every image: 512x512 RGBA
Huuuge!!!
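
For scale, a rough uncompressed estimate of that example (my own back-of-envelope arithmetic, assuming 1 byte per RGBA channel and no compression):

objects = 10_000_000
views = 100
bytes_per_image = 512 * 512 * 4          # RGBA, 1 byte per channel, uncompressed
total_bytes = objects * views * bytes_per_image
print(total_bytes / 1024**5)             # ~0.93 PiB of raw pixels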

How many objects/views were used by these authors?

Is there any way to use 4 real photos with RMBG?

I tried replacing the 4 images produced via zero123 etc. with real photos with removed backgrounds, and got a very bad Gaussian. I suppose the reason is the camera pose definition, but maybe there is a fast way to do this.

Pics and video below:
[4 input images: image_ingested_0 through image_ingested_3]

eleph.1.mp4

Training dataloader script?

Thank you for your amazing work!

I am trying to fine-tune the GRM model with more data, but I cannot find any information regarding the training dataloader. Can you provide some info about the input data to the model, specifically the camera extrinsics and intrinsics?
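
While waiting for an official answer, here is what the camera inputs appear to look like, inferred from the images2gaussian(images, c2ws, fxfycxcy, ...) call in test.py and the normalized intrinsics discussed in the ray-generation issue. The exact shapes are my guess, not a confirmed spec:

import math
import torch

# Hypothetical batch layout -- not an official spec.
B, V, H, W = 1, 4, 512, 512
images = torch.rand(B, V, 3, H, W)               # V input views, RGB in [0, 1]
c2ws = torch.eye(4).expand(B, V, 4, 4).clone()   # camera-to-world extrinsics per view

f = 0.5 / math.tan(math.radians(50.0) / 2)       # paper renders with a fixed 50-degree FoV
fxfycxcy = torch.tensor([f, f, 0.5, 0.5]).expand(B, V, 4)  # normalized [fx, fy, cx, cy]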

CUDA OOM on A100 with SV3D

Tried allocating 67371012.07 GiB, which seemed a bit excessive :)

Traceback (most recent call last):
File "Dev/GRM/test.py", line 585, in
main(args)
File "Dev/GRM/test.py", line 508, in main
sv3d_gs(
File "Dev/GRM/test.py", line 472, in sv3d_gs
images2gaussian(images, c2ws, fxfycxcy, grm_model, f'./{cache_dir}/{name}_gs.ply', f'{cache_dir}/{name}.mp4', f'{cache_dir}/{name}_mesh.ply', fuse_mesh=fuse_mesh)
File "Dev/GRM/test.py", line 242, in images2gaussian
gs_rendering = model.gs_renderer.render(latent=gs,
File ".local/lib/python3.10/site-packages/torch/cuda/amp/autocast_mode.py", line 121, in decorate_fwd
return fwd(*_cast(args, cast_inputs), **_cast(kwargs, cast_inputs))
File "Dev/GRM/model/render/gaussian_renderer.py", line 40, in render
renderings, depths, alphas = deferred_bp(xyz, features, scaling, rotation,
File "Dev/GRM/model/render/deferred_bp.py", line 163, in deferred_bp
return DeferredBP.apply(
File ".local/lib/python3.10/site-packages/torch/autograd/function.py", line 553, in apply
return super().apply(*args, **kwargs) # type: ignore[misc]
File "Dev/GRM/model/render/deferred_bp.py", line 72, in forward
render_results = render(pc, patch_size, patch_size, C2W[i, j], new_fxfycxcy)
File "Dev/GRM/model/render/gaussian_utils.py", line 653, in render
rendered_image, radii, rendered_depth, rendered_alpha = rasterizer(
File ".local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File ".local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File ".local/lib/python3.10/site-packages/diff_gaussian_rasterization/init.py", line 210, in forward
return rasterize_gaussians(
File ".local/lib/python3.10/site-packages/diff_gaussian_rasterization/init.py", line 32, in rasterize_gaussians
return _RasterizeGaussians.apply(
File ".local/lib/python3.10/site-packages/torch/autograd/function.py", line 553, in apply
return super().apply(*args, **kwargs) # type: ignore[misc]
File ".local/lib/python3.10/site-packages/diff_gaussian_rasterization/init.py", line 92, in forward
num_rendered, color, radii, geomBuffer, binningBuffer, imgBuffer = _C.rasterize_gaussians(*args)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 67371012.07 GiB. GPU 0 has a total capacity of 39.39 GiB of which 34.88 GiB is free. Including non-PyTorch memory, this process has 4.50 GiB memory in use. Of the allocated memory 3.68 GiB is allocated by PyTorch, and 269.31 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)

Sparse-view reconstruction

Thank you for your excellent work! I see from your paper that there is a section on sparse-view reconstruction. Where is the code for that part?

Code and models release

Thank you for your amazing work!

I'd like to know why the released code and weights were withdrawn, and when they will be released again.

Mesh Extraction

I would like to ask about some details of mesh extraction with TSDF fusion: what is the range of per-pixel depth values in the depth maps you use, i.e., the depth range used in TSDF fusion?
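
Not an answer, but for reference, a generic TSDF fusion sketch with Open3D (a typical setup I am assuming, not the authors' pipeline). depth_trunc is exactly the depth-range knob this question is about, and frames is a hypothetical container of per-view data:

import open3d as o3d

volume = o3d.pipelines.integration.ScalableTSDFVolume(
    voxel_length=2.0 / 256,   # assumes the object fits in a ~2-unit cube
    sdf_trunc=0.02,
    color_type=o3d.pipelines.integration.TSDFVolumeColorType.RGB8)

# hypothetical: (color, depth, o3d.camera.PinholeCameraIntrinsic, 4x4 world-to-camera)
frames = []

for color, depth, intrinsic, extrinsic in frames:
    rgbd = o3d.geometry.RGBDImage.create_from_color_and_depth(
        color, depth, depth_scale=1.0,
        depth_trunc=4.0,                    # depths beyond this range are ignored
        convert_rgb_to_intensity=False)
    volume.integrate(rgbd, intrinsic, extrinsic)

mesh = volume.extract_triangle_mesh()
mesh.compute_vertex_normals()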

Code and Models

Hi,

Thanks for your great work! Do you plan to release the model and the training code (and the training data)?

Test code for text-to-3D

Awesome work! It's great to have an open-sourced version of Instant-3D with Gaussians. I'm currently also working on text-to-3D and would like to compare with your results and other SOTA methods. Could you share the code or implementation details for evaluating text-to-3D with CLIP-R, CLIP score, and AP, such as the number of views and the CLIP model version? I believe your test protocol is different from Shap-E's... Thank you again for this ground-breaking open-source project!
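
In the meantime, a generic sketch of how CLIP score and top-1 CLIP R-Precision are often computed. The CLIP variant, view count, and aggregation here are my assumptions; the official protocol is exactly what this issue asks for:

import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def clip_score(prompt, views):
    # views: list of PIL images rendered from the generated 3D asset
    inputs = processor(text=[prompt], images=views,
                       return_tensors="pt", padding=True)
    with torch.no_grad():
        out = model(**inputs)
    img = out.image_embeds / out.image_embeds.norm(dim=-1, keepdim=True)
    txt = out.text_embeds / out.text_embeds.norm(dim=-1, keepdim=True)
    return (img @ txt.T).mean().item()  # cosine similarity averaged over views

def clip_r_precision_top1(true_prompt, distractor_prompts, views):
    # counts a hit when the true prompt scores above every distractor
    scores = [clip_score(p, views) for p in [true_prompt] + list(distractor_prompts)]
    return float(scores[0] == max(scores))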
