grm's People

Contributors

justimyhxu

grm's Issues

Real Cases

I compared almost all of the SOTA algorithms. GRM and TripoSR (LRM++) (and maybe One2345++) are better than the others. But there is still a long way to go ...

Wizard: [image]

DashaTaran: [image]

Game: [image]

SavannaBlade: [image]

Demo not working anymore

Hi, nice work! But the demo isn't working anymore.
I get this error: upstream connect error or disconnect/reset before headers. retried and the latest reset reason: connection termination

Pretrained model release

Thanks for the great work! Is there an estimated date for the release of the inference models used on Hugging Face?

Training data elevation

Hey,
Thanks for opening the weights!!

I am trying to run inference with 4 input images that I rendered myself. The renders use elevations other than 20°, such as 7°, 5°, 12°, and 25°, and the Gaussians I get are not 100% aligned! Do you think it will work with 4 images at 4 different elevations? All the examples from the diffusion models use roughly the same elevation of about 20°.

In the Experiments section you wrote:
"Following [46], we filter 100k high-quality objects, and render 32 images at random viewpoints with a fixed 50° field of view under ambient lighting."

So does "random viewpoints" include random elevation? Do you train with 4 different elevations in the same sample?

Thanks a lot.
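
For anyone experimenting with mixed elevations, a minimal pose-builder sketch (the helper and its conventions are my own assumptions; GRM's actual camera convention, up axis, and radius may differ, so verify against the repo's example poses before trusting renders from it):

import numpy as np

def c2w_from_spherical(elev_deg, azim_deg, radius=2.0):
    # Hypothetical helper: camera-to-world pose on a sphere around the origin,
    # looking at the origin, +y as world up, OpenGL-style camera (-z forward).
    elev, azim = np.deg2rad(elev_deg), np.deg2rad(azim_deg)
    cam_pos = radius * np.array([np.cos(elev) * np.sin(azim),
                                 np.sin(elev),
                                 np.cos(elev) * np.cos(azim)])
    forward = -cam_pos / np.linalg.norm(cam_pos)          # toward the origin
    right = np.cross(forward, np.array([0.0, 1.0, 0.0]))
    right /= np.linalg.norm(right)
    up = np.cross(right, forward)
    c2w = np.eye(4)
    c2w[:3, 0], c2w[:3, 1], c2w[:3, 2], c2w[:3, 3] = right, up, -forward, cam_pos
    return c2w

# four views at the elevations from the question, spread around the object
c2ws = np.stack([c2w_from_spherical(e, a)
                 for e, a in zip([7, 5, 12, 25], [0, 90, 180, 270])])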

Code of Ray generation in plucker_embedder

Hi, thanks for your great work!

I have a question about the ray generation here. Since fxfycxcy = [0.5/tan_half_fov, 0.5/tan_half_fov, 0.5, 0.5], shouldn't the code be like this?

x = (x+0.5) / w
y = (y+0.5) / h
x = (x - fxfycxcy[:, 2:3]) / fxfycxcy[:, 0:1]
y = (y - fxfycxcy[:, 3:4]) / fxfycxcy[:, 1:2]

instead of

x = x / w
y = y / h
x = (x + 0.5 - fxfycxcy[:, 2:3]) / fxfycxcy[:, 0:1]
y = (y + 0.5 - fxfycxcy[:, 3:4]) / fxfycxcy[:, 1:2]
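
For reference, here is a minimal self-contained sketch of the pixel-center version being proposed (the pixel_to_ray_dirs helper and its shapes are my own framing, not the repo's code). The +0.5 is a half-pixel offset in pixel units, so applying it after dividing by w/h, as the current code does, shifts every ray by a constant 0.5 - 0.5/w in normalized coordinates instead of sampling at pixel centers:

import torch
import torch.nn.functional as F

def pixel_to_ray_dirs(h, w, fxfycxcy):
    # fxfycxcy: (B, 4) normalized intrinsics [fx, fy, cx, cy],
    # e.g. [0.5/tan_half_fov, 0.5/tan_half_fov, 0.5, 0.5] as above
    y, x = torch.meshgrid(torch.arange(h), torch.arange(w), indexing='ij')
    x = x.reshape(1, -1).float()                   # (1, h*w)
    y = y.reshape(1, -1).float()
    x = (x + 0.5) / w                              # pixel centers, normalized to [0, 1]
    y = (y + 0.5) / h
    x = (x - fxfycxcy[:, 2:3]) / fxfycxcy[:, 0:1]  # unproject to the z = 1 plane
    y = (y - fxfycxcy[:, 3:4]) / fxfycxcy[:, 1:2]
    dirs = torch.stack([x, y, torch.ones_like(x)], dim=-1)  # (B, h*w, 3)
    return F.normalize(dirs, dim=-1)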

How many objects were used? How many views were rendered for every object?

algorithms: LRM, TripoSR, GRM, One2345++, MVDream, ...

datasets: Objaverse, OmniObject3D, ... (even 2D images and videos)

For example:
Objaverse: 10M objects
cameras: 100 views
every image: 512x512 RGBA
Huuuge!!!
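
For scale, a rough uncompressed estimate of that example (my own back-of-envelope arithmetic, assuming 1 byte per RGBA channel and no compression):

objects = 10_000_000
views = 100
bytes_per_image = 512 * 512 * 4          # RGBA, 1 byte per channel, uncompressed
total_bytes = objects * views * bytes_per_image
print(total_bytes / 1024**5)             # ~0.93 PiB of raw pixels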

How many objects/views were used by these authors?

Is there any way to use 4 real photos with RMBG?

I tried replacing the 4 images produced via zero123 etc. with real photos with removed backgrounds, and got a very bad Gaussian. I suppose the reason is the camera pose definition, but maybe there is a fast way to do this.

Pics and video below:
[4 input images: image_ingested_0 through image_ingested_3]

eleph.1.mp4

Training dataloader script?

Thank you for your amazing work!

I am trying to fine-tune the GRM model with more data, but I cannot find any information regarding the training dataloader. Can you provide some info about the input data to the model, specifically the camera extrinsics and intrinsics?
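
While waiting for an official answer, here is what the camera inputs appear to look like, inferred from the images2gaussian(images, c2ws, fxfycxcy, ...) call in test.py and the normalized intrinsics discussed in the ray-generation issue. The exact shapes are my guess, not a confirmed spec:

import math
import torch

# Hypothetical batch layout -- not an official spec.
B, V, H, W = 1, 4, 512, 512
images = torch.rand(B, V, 3, H, W)               # V input views, RGB in [0, 1]
c2ws = torch.eye(4).expand(B, V, 4, 4).clone()   # camera-to-world extrinsics per view

f = 0.5 / math.tan(math.radians(50.0) / 2)       # paper renders with a fixed 50-degree FoV
fxfycxcy = torch.tensor([f, f, 0.5, 0.5]).expand(B, V, 4)  # normalized [fx, fy, cx, cy]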

CUDA OOM on A100 with SV3D

Tried allocating 67371012.07 GiB, which seemed a bit excessive :)

Traceback (most recent call last):
File "Dev/GRM/test.py", line 585, in
main(args)
File "Dev/GRM/test.py", line 508, in main
sv3d_gs(
File "Dev/GRM/test.py", line 472, in sv3d_gs
images2gaussian(images, c2ws, fxfycxcy, grm_model, f'./{cache_dir}/{name}_gs.ply', f'{cache_dir}/{name}.mp4', f'{cache_dir}/{name}_mesh.ply', fuse_mesh=fuse_mesh)
File "Dev/GRM/test.py", line 242, in images2gaussian
gs_rendering = model.gs_renderer.render(latent=gs,
File ".local/lib/python3.10/site-packages/torch/cuda/amp/autocast_mode.py", line 121, in decorate_fwd
return fwd(*_cast(args, cast_inputs), **_cast(kwargs, cast_inputs))
File "Dev/GRM/model/render/gaussian_renderer.py", line 40, in render
renderings, depths, alphas = deferred_bp(xyz, features, scaling, rotation,
File "Dev/GRM/model/render/deferred_bp.py", line 163, in deferred_bp
return DeferredBP.apply(
File ".local/lib/python3.10/site-packages/torch/autograd/function.py", line 553, in apply
return super().apply(*args, **kwargs) # type: ignore[misc]
File "Dev/GRM/model/render/deferred_bp.py", line 72, in forward
render_results = render(pc, patch_size, patch_size, C2W[i, j], new_fxfycxcy)
File "Dev/GRM/model/render/gaussian_utils.py", line 653, in render
rendered_image, radii, rendered_depth, rendered_alpha = rasterizer(
File ".local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File ".local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File ".local/lib/python3.10/site-packages/diff_gaussian_rasterization/init.py", line 210, in forward
return rasterize_gaussians(
File ".local/lib/python3.10/site-packages/diff_gaussian_rasterization/init.py", line 32, in rasterize_gaussians
return _RasterizeGaussians.apply(
File ".local/lib/python3.10/site-packages/torch/autograd/function.py", line 553, in apply
return super().apply(*args, **kwargs) # type: ignore[misc]
File ".local/lib/python3.10/site-packages/diff_gaussian_rasterization/init.py", line 92, in forward
num_rendered, color, radii, geomBuffer, binningBuffer, imgBuffer = _C.rasterize_gaussians(*args)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 67371012.07 GiB. GPU 0 has a total capacity of 39.39 GiB of which 34.88 GiB is free. Including non-PyTorch memory, this process has 4.50 GiB memory in use. Of the allocated memory 3.68 GiB is allocated by PyTorch, and 269.31 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)

Sparse-view reconstruction

Thank you for your excellent work! I see from your paper that there is a section on sparse-view reconstruction. Where is the code for that part?

Code and models release

Thank you for your amazing work!

I'd like to know why the released code and weights were withdrawn, and when they will be released again.

Mesh Extraction

I would like to ask about some details of mesh extraction with TSDF fusion: what is the range of per-pixel depth values in the depth maps you use, i.e., the depth range used in TSDF fusion?
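
Not an answer, but for reference, a generic TSDF fusion sketch with Open3D (a typical setup I am assuming, not the authors' pipeline). depth_trunc is exactly the depth-range knob this question is about, and frames is a hypothetical container of per-view data:

import open3d as o3d

volume = o3d.pipelines.integration.ScalableTSDFVolume(
    voxel_length=2.0 / 256,   # assumes the object fits in a ~2-unit cube
    sdf_trunc=0.02,
    color_type=o3d.pipelines.integration.TSDFVolumeColorType.RGB8)

# hypothetical: (color, depth, o3d.camera.PinholeCameraIntrinsic, 4x4 world-to-camera)
frames = []

for color, depth, intrinsic, extrinsic in frames:
    rgbd = o3d.geometry.RGBDImage.create_from_color_and_depth(
        color, depth, depth_scale=1.0,
        depth_trunc=4.0,                    # depths beyond this range are ignored
        convert_rgb_to_intensity=False)
    volume.integrate(rgbd, intrinsic, extrinsic)

mesh = volume.extract_triangle_mesh()
mesh.compute_vertex_normals()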

Code and Models

Hi,

Thanks for your great work! Do you plan to release the model and the training code (and the training data)?

Test code for text-to-3D

Awesome work! It's great to have an open-sourced version of Instant-3D with Gaussians. I'm currently also working on text-to-3D and would like to compare with your results and other SOTA methods. Could you share the code or implementation details for evaluating text-to-3D with CLIP-R, CLIP score, and AP, such as the number of views and the CLIP model version? I believe your test protocol is different from Shap-E's... Thank you again for this ground-breaking open-source project!
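
In the meantime, a generic sketch of how CLIP score and top-1 CLIP R-Precision are often computed. The CLIP variant, view count, and aggregation here are my assumptions; the official protocol is exactly what this issue asks for:

import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def clip_score(prompt, views):
    # views: list of PIL images rendered from the generated 3D asset
    inputs = processor(text=[prompt], images=views,
                       return_tensors="pt", padding=True)
    with torch.no_grad():
        out = model(**inputs)
    img = out.image_embeds / out.image_embeds.norm(dim=-1, keepdim=True)
    txt = out.text_embeds / out.text_embeds.norm(dim=-1, keepdim=True)
    return (img @ txt.T).mean().item()  # cosine similarity averaged over views

def clip_r_precision_top1(true_prompt, distractor_prompts, views):
    # counts a hit when the true prompt scores above every distractor
    scores = [clip_score(p, views) for p in [true_prompt] + list(distractor_prompts)]
    return float(scores[0] == max(scores))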
