apchenstu / tensorf
[ECCV 2022] Tensorial Radiance Fields, a novel approach to model and reconstruct radiance fields
License: MIT License
Thank you for the code. Could you please add some comments explaining what the arguments to the model in TensorBase mean?
Great work and an inspiring read!
Just a small pointer - the link points to the MVSNeRF paper instead of yours...
Has anyone tested it on DTU? What values should be used for near_far and bbox for DTU? I have tried several values and got abnormal results every time.
Hi Anpei! Thank you very much for the outstanding work.
I have some problems training on the DTU dataset. For the dtu_83 scene, I mask out the background and then use 58 images for training and 6 for testing. The test results are poor, as shown below.
I notice that in the opt.py file, 'dtu' is one of the choices for the dataset_name argument. Have you trained on the DTU dataset before? Could you give some instructions?
Hi, thanks for the great work! I'm reading the code and found something that looks off. In these lines, you compute the TV loss. At line 203, loss_tv is first computed on the density components and added to total_loss at line 204. However, at line 207, you add the appearance-component TV loss onto the previous density TV loss:
loss_tv = loss_tv + tensorf.TV_loss_app(tvreg)*TV_weight_app
and then add the sum to total_loss again at line 208:
total_loss = total_loss + loss_tv
If I understand correctly, this penalizes the density TV loss twice. While one can adjust the loss weight to mitigate this, you may want to fix it.
Another question: you decay the TV loss weight via TV_weight_density *= lr_factor. Did you find this better than, e.g., a constant TV loss weight?
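For what it's worth, the double count can be shown with plain numbers (the TV values below are hypothetical; the real code works with tensors from tensorf.TV_loss_density / TV_loss_app):

```python
# Hypothetical TV values to illustrate the accumulation pattern.
tv_density, tv_app = 0.3, 0.5
w_density, w_app = 1.0, 1.0

# Pattern as described in the report:
total_loss = 0.0
loss_tv = tv_density * w_density
total_loss += loss_tv               # density TV added once (line 204)
loss_tv = loss_tv + tv_app * w_app  # density TV still inside loss_tv (line 207)
total_loss += loss_tv               # density TV added a second time (line 208)
buggy = total_loss                  # 0.3 + (0.3 + 0.5) = 1.1

# Fix: accumulate each TV term into total_loss exactly once.
total_loss = 0.0
total_loss += tv_density * w_density
total_loss += tv_app * w_app
fixed = total_loss                  # 0.3 + 0.5 = 0.8
```

The gap between the two totals is exactly one extra copy of the density TV term.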
The work is great, I have benefited a lot from it
In the paper, Eqns. 1-5 and 7-10 contain the parameter R (R_\sigma, R_c), which indicates the number of components (for both the CP and VM decompositions). Which variable in the configs actually corresponds to this parameter R?
Hi,
I cannot find the implementation of the function eval():
eval(args.model_name)
Can you point me to the related code?
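For reference, eval() there appears to be Python's builtin, which evaluates the string as a name in the current scope; a minimal stand-in sketch (TensorVMSplit here is a dummy class, not the real model):

```python
# eval() is Python's builtin: it evaluates the string as an expression, so
# "TensorVMSplit" resolves to whatever class of that name is in scope.
class TensorVMSplit:  # dummy stand-in for the real model class
    def __init__(self, note):
        self.note = note

model_name = "TensorVMSplit"      # plays the role of args.model_name
tensorf = eval(model_name)(note="demo")  # same as TensorVMSplit(note="demo")
```

So there is no separate eval() implementation to find; the model classes themselves are imported at the top of the script.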
Hi, thank you for your code. It's all black when I open the video. Could you please give me some help?
Hello,
I had a question regarding how you calculate the CP decomposition. Is the way you compute it in your code similar to the SVD methods provided by numpy or torch?
Another question: is the geometry/density grid computed without any optimization? In the code it doesn't seem to pass through an MLP, and I was wondering how the density is learned by this model.
thanks for the amazing work!
Hi,
Great work!
I am wondering, for my own dataset, is it possible to support different intrinsics for each camera?
In instant-ngp, this can be done by modifying transforms.json: NVlabs/instant-ngp#797
In your code I found that, by design, it assumes a single camera; for example, the read_meta function:
TensoRF/dataLoader/your_own_data.py
Line 39 in 17deeed
Any suggestions? Thanks a lot!
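For illustration, a per-frame intrinsics layout along the lines of what instant-ngp accepts might look like this (the field names follow the transforms.json convention; the values are made up):

```python
# Hypothetical transforms.json contents with per-frame intrinsics; a loader
# supporting this would read fl_x/fl_y/cx/cy from each frame instead of once.
transforms = {
    "frames": [
        {"file_path": "images/0001.png",
         "fl_x": 1111.0, "fl_y": 1111.0, "cx": 400.0, "cy": 300.0,
         "transform_matrix": [[1, 0, 0, 0], [0, 1, 0, 0],
                              [0, 0, 1, 0], [0, 0, 0, 1]]},
        {"file_path": "images/0002.png",
         "fl_x": 900.0, "fl_y": 900.0, "cx": 320.0, "cy": 240.0,
         "transform_matrix": [[1, 0, 0, 0], [0, 1, 0, 0],
                              [0, 0, 1, 0], [0, 0, 0, 1]]},
    ],
}

# One intrinsics tuple per frame rather than one global set:
intrinsics = [(f["fl_x"], f["fl_y"], f["cx"], f["cy"])
              for f in transforms["frames"]]
```

Supporting this in read_meta would mean building a per-image ray generator instead of sharing one set of directions across all frames.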
Hi thanks a lot for releasing the code for the nice work
distance_scale=25 is used at train and test time for rendering, but not for the alpha mask update or mesh extraction. I can't seem to find a discussion of this hyperparameter in the main text, and I wonder whether you could provide a little more explanation.
It appears that using distance_scale for mesh extraction results in noisier geometry with more floaters, while not applying distance_scale during training leads to divergence. Thanks again.
Hello @apchenstu,
I was wondering if it might be possible to render 3D models using TensoRF in real-time? I don't know if this is the direction your team was planning to go in with your paper. But I am curious if it would be possible to build something similar to what Google built for SNeRG: website link
Thanks for sharing this work with the community!
Using the conda and pip setup commands from the readme leaves me with this error:
(TensoRF) snellius paulm@gcn37 11:52 ~/c/TensoRF$ python train.py --config configs/steps_with_stuff.txt
Traceback (most recent call last):
File "train.py", line 9, in <module>
from renderer import *
File "/gpfs/home4/paulm/c/TensoRF/renderer.py", line 5, in <module>
from utils import *
File "/gpfs/home4/paulm/c/TensoRF/utils.py", line 159, in <module>
import plyfile
ModuleNotFoundError: No module named 'plyfile'
Running pip install plyfile seems to fix the issue.
Hello! For the NeRF and NSVF datasets I can train normally, but when I train on Tanks&Temples the process keeps getting killed. Checking the logs, I found it is because of "out of memory". Checking memory usage with "top" confirmed that system RAM is indeed overflowing, while GPU memory remains largely free.
hi @apchenstu, how can we visualize the values of the tensors and vectors as shown in the train_process.mp4?
Like Semantic NeRF does.
I use export_mesh. I want to get a mesh with color, so what should I do? I notice the author uses marching_cubes to get the vertices; do you have any idea how to assign colors to the vertices? @apchenstu
Thanks
Hi @apchenstu, I am trying to train TensoRF with a ScanNet scene. I calculate the scene bounding box using the scene mesh provided in the dataset. For example, for scene0000_00
in ScanNet, the bounding box can be computed to be [[-1.0176, -1.0018, -1.0003],[11.3742, 9.7380, 4.0293]]
However, TensoRF's performance on the test set is quite bad. The visualizations look something like this:
You can see that it can render the chair legs (see bottom left), but the overall rendering is quite poor. I tried different far values (e.g. 5.0, 10.0, 100.0), but none of them seem to work with TensoRF.
But when I trained a NeRF model with Instant-NGP, a far value of 10.0 worked. So I am not sure what I am missing here. Can you please advise?
Interestingly, training PSNR reaches a high value of 25, but test PSNR is very low around 9 or 10.
Thanks,
Yash
Great work!
I just wanted to try it out on the NeRF synthetic Lego dataset, but got a RuntimeError.
Running python train.py --expname lego --datadir ~/data/nerf/nerf_synthetic/lego
yields
Traceback (most recent call last):
File "train.py", line 303, in <module>
reconstruction(args)
File "train.py", line 169, in reconstruction
rgb_map, alphas_map, depth_map, weights, uncertainty = renderer(rays_train, tensorf, chunk=args.batch_size,
File "/home/ubuntu/workspace/TensoRF/renderer.py", line 16, in OctreeRender_trilinear_fast
rgb_map, depth_map = tensorf(rays_chunk, is_train=is_train, white_bg=white_bg, ndc_ray=ndc_ray, N_samples=N_samples)
File "/home/ubuntu/anaconda3/envs/TensoRF/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/home/ubuntu/workspace/TensoRF/models/tensorBase.py", line 449, in forward
valid_rgbs = self.renderModule(xyz_sampled[app_mask], viewdirs[app_mask], app_features)
File "/home/ubuntu/anaconda3/envs/TensoRF/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/home/ubuntu/workspace/TensoRF/models/tensorBase.py", line 109, in forward
rgb = self.mlp(mlp_in)
File "/home/ubuntu/anaconda3/envs/TensoRF/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/home/ubuntu/anaconda3/envs/TensoRF/lib/python3.8/site-packages/torch/nn/modules/container.py", line 141, in forward
input = module(input)
File "/home/ubuntu/anaconda3/envs/TensoRF/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/home/ubuntu/anaconda3/envs/TensoRF/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 103, in forward
return F.linear(input, self.weight, self.bias)
RuntimeError: mat1 and mat2 shapes cannot be multiplied (4x102 and 105x128)
Did I miss something?
I think there is a mistake in the use of downsample in the your_own_data.py data loader.
Shouldn't the principal point cx, cy also be divided by self.downsample?
TensoRF/dataLoader/your_own_data.py
Line 48 in 17deeed
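For what it's worth, the point can be shown with a small intrinsics-rescaling sketch (the function name and values are hypothetical, not the loader's actual code):

```python
def rescale_intrinsics(fx, fy, cx, cy, downsample):
    """Rescale pinhole intrinsics for an image downsampled by `downsample`.

    Focal lengths AND the principal point are all in pixel units, so all
    four must be divided when the image resolution is reduced.
    """
    s = 1.0 / downsample
    return fx * s, fy * s, cx * s, cy * s

# With downsample=2, an 800x600 image's principal point moves from
# (400, 300) to (200, 150) along with the focal lengths.
fx, fy, cx, cy = rescale_intrinsics(1000.0, 1000.0, 400.0, 300.0, 2.0)
```

Dividing only the focal lengths leaves the principal point pointing outside the downsampled image center, which shifts every ray.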
Hi, thanks for making this great code and paper! I really enjoy it.
In the TensoRF paper, the appearance values A_c(x) are concatenated and then multiplied by the appearance matrix B, and the result is sent to the decoding function S for RGB color regression.
But in the code, from
Line 223 in 17deeed
to
Line 239 in 17deeed
I can't find the appearance matrix B mentioned in the paper anywhere in this process.
Is self.basis_mat the matrix B? If not, where is matrix B, and what is self.basis_mat?
Hi, I used the NGP way of preparing my own data, but the results are blurry after training. Could you please provide some hints on how to set the scene_bbox and near_far params for a specific dataset? Thanks
How much GPU memory is required for training? I am using an RTX 2080 with 11 GB. I tried to train on the lego dataset using the provided config file, and I get a memory error.
Hi, excellent work!!!
The question is: can I use configs/lego.txt with another pretrained model, like drums.th or another pretrained model from nerf_synthetic?
Thank you.
I followed the command you provided and the code runs successfully, but the speed is terribly slow...
Iteration 0040: train_psnr = 15.01 test_psnr = 0.00 mse = 0.025878: 0%| | 42/30000 [08:30<544:56:04, 65.48s/it]
And then I tried using
conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch
as provided by h-OUS-e in the issues, but still got very low speed...
Iteration 00150: train_psnr = 21.94 test_psnr = 0.00 mse = 0.006579: 1%| | 160/30000 [10:27<25:26:26, 3.07s/it]
Hi Anpei,
thanks for the awesome work. I want to ask about a small implementation detail:
what is the reason/benefit for initializing the bias to zero here:
Line 72 in 3948381
Note that I think this is different to #2 as I'm just trying to render the examples without training them. I might be misunderstanding though.
I've downloaded the dataset and pretrained checkpoints for Synthetic Nerf.
python train.py --config configs/lego.txt --ckpt checkpoints/lego.th --render_only 1 --render_test 1
gives
size mismatch for basis_mat.weight: copying a param with shape torch.Size([27, 288]) from checkpoint, the shape in current model is torch.Size([27, 864]).
Hello,
Firstly thank you for sharing a very clear implementation of your work! It is great!
For some reason my training is very slow, and I was wondering if I must include something in the training command to achieve speeds as fast as the ones you show. Right now it has been more than 15 minutes and I am only on iteration 460, with a PSNR of 24.92, and the loading bar says 2%.
I am on Windows 10 with a 3080 Ti GPU. I followed the instructions for downloading and installing the packages in the conda environment as specified.
Any help is appreciated, thank you!
Hey team,
So I'm building a renderer for TensoRF and working on optimizing the code in this repo so that it can run in real time.
I wrote a (I think) vectorized implementation of TensorVMSplit.compute_appfeature() that leverages NumPy functions. I tested it on an AWS EC2 instance (a g3s.xlarge), and it seems to run without error.
I'm just looking for feedback: is this a good direction to pursue? Do folks know of easier/better ways to run this function without hitting an out-of-memory error?
def compute_appfeature(self, xyz_sampled):
    """
    Returns the appearance feature vectors for a set of XYZ locations.

    Parameters:
        xyz_sampled: multi-dimensional Tensor. Last dim should have a shape of 3.
    Returns:
        Multi-dimensional tensor. Last dim will have the same shape as data_dim_color.
    """
    def compute_factors(idx_plane, grid_mode='plane'):
        """
        Helper function that computes the factors used for the
        vector-matrix decomposition.

        Parameters:
            idx_plane (int): points to either the XY, XZ, or YZ plane
            grid_mode (str): 'plane' for a matrix factor, 'line' for a vector factor
        Returns:
            torch.Tensor: the factor needed for the VM decomposition
        """
        if grid_mode == 'plane':
            grid = coordinate_plane                      # defined below
            input_grid = self.app_plane[idx_plane].cpu()
        else:  # grid_mode == 'line'
            grid = coordinate_line                       # defined below
            input_grid = self.app_line[idx_plane].cpu()  # line factors come from app_line
        factor = F.grid_sample(
            input_grid,
            grid[[idx_plane]],
            align_corners=True,
        ).view(-1, *xyz_sampled.shape[:1])
        return factor

    ### MAIN CODE
    xyz_sampled = xyz_sampled.to(device="cpu")
    ...  # unchanged code

    # compute the vector-matrix outer products, trying vectorization
    app_plane_indices = np.arange(len(self.app_plane))
    compute_VM_factors = np.vectorize(compute_factors, otypes=[torch.Tensor])
    plane_coef_point = compute_VM_factors(app_plane_indices, 'plane')  # 1D np.ndarray of 2D Tensors
    plane_coef_point = torch.cat(list(plane_coef_point)).to(device=self.device)  # 2D Tensor
    line_coef_point = compute_VM_factors(app_plane_indices, 'line')    # same type of object
    line_coef_point = torch.cat(list(line_coef_point)).to(device=self.device)
    return self.basis_mat((plane_coef_point * line_coef_point).T)
Hello there! First of all, thank you for open-sourcing your work!
I saw that the pretrained checkpoints are hosted on OneDrive – would you be interested in sharing your models on the Hugging Face Hub?
The Hub makes it easy to freely download and upload models, and it can make models more accessible and visible to the rest of the ML community. It's a good way to share useful metadata and metrics, and we also support features like TensorBoard visualizations and PapersWithCode integrations. Since models are hosted as Git repos, they're also automatically versioned with a commit history and diffs. We could even help you set up an organization (e.g. see the Facebook AI or Stanford NLP organizations).
We have a step-by-step guide that explains the process for uploading the model to the Hub, in case you're interested. We also have a library for programmatic access to uploading and downloading models, which includes features like caching for downloaded models.
Please let us know if you have any questions, and we'd be happy to guide you through the process!
Nima and the Hugging Face team
Hello authors, thank you for your great work.
I am trying to train a model on the lego dataset with TensorCP, but it is not working. It trains, but the PSNR does not increase above 9, and then it errors out when updating the alpha mask:
...
initial TV_weight density: 0.0 appearance: 0.0
Iteration 02000: train_psnr = 9.34 test_psnr = 0.00 mse = 0.121434: 7%|█████▎ | 2000/30000 [00:31<07:25, 62.87it/s]
Traceback (most recent call last):
File "train.py", line 301, in <module>
reconstruction(args)
File "train.py", line 234, in reconstruction
new_aabb = tensorf.updateAlphaMask(tuple(reso_mask))
File "/users/lukemk/miniconda3/envs/new/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
return func(*args, **kwargs)
File "/tritonhomes/lukemk/projects/experiments/active/TensoRF/models/tensorBase.py", line 330, in updateAlphaMask
xyz_min = valid_xyz.amin(0)
IndexError: amin(): Expected reduction dim 0 to have non-zero size.
For context, I can run with TensorVMSplit without any issues (and I get the expected PSNR). I probably just missed some parameter that needs to be passed to make TensorCP work.
Thanks again for your work!
Hello, this is a novel, nice work. I'm very impressed by the results and your idea.
But I have a couple of small questions:
Why did you split the radiance field grid into separate density and appearance grids? Could a single factored grid be used to regress both density and appearance with an MLP?
What was the insight behind using the appearance vectors (b_r in the paper)? As far as I know, the original NeRF did not use an additional appearance feature vector at all.
Thanks for sharing great code!
Thanks for your great work! I see that you have provided SHRender() in tensorBase.py. Is it possible to cache xyz_features and SH parameters like Plenoxels does? Thanks in advance!
I want to apply this model to other datasets like ShapeNet, but I don't know the exact size of the scene bounding box or where it should be placed, since it does not seem to be given in the dataset. How should I set the "scene_bbox" parameter for other datasets?
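One common heuristic (an assumption on my part, not the authors' recipe) is to derive scene_bbox from the camera positions, padded by a margin:

```python
import numpy as np

# Hypothetical camera centers extracted from the poses of a dataset.
cam_positions = np.array([[ 2.0,  0.0, 0.5],
                          [-1.5,  1.0, 0.3],
                          [ 0.0, -2.0, 0.8]])

# Pad the camera hull so the box also covers geometry seen by the cameras.
margin = 1.0
bbox_min = cam_positions.min(axis=0) - margin
bbox_max = cam_positions.max(axis=0) + margin
scene_bbox = np.stack([bbox_min, bbox_max])  # shape (2, 3): [min, max]
```

The margin would need tuning per dataset; for inward-facing captures like ShapeNet renders, a box centered on the object and well inside the camera ring is usually the goal.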
Hello Chen,
It seems that in renderer.py, the function evaluation_path() does not handle ray sampling the same way the blender.py and llff.py data loaders do separately (blender.py does not use ndc_rays_blender, while llff.py does).
I think a quick fix to make args.render_path=1 work for both loaders would be to change:
if ndc_ray:
to
if ndc_ray and 'blender' not in str(type(test_dataset)):
Thank you for your great work!
BTW, when I ran TensoRF on the llff fern data, I got an error and solved it, so I want to report it here.
The error was like this:
Traceback (most recent call last):
File "train.py", line 301, in
reconstruction(args)
File "train.py", line 225, in reconstruction
prtx=f'{iteration:06d}_', N_samples=nSamples, white_bg = white_bg, ndc_ray=ndc_ray, compute_extra_metrics=False)
File "/home/asc/PycharmProjects/nerf-pytorch/venv/lib/python3.6/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
return func(*args, **kwargs)
File "/home/asc/PycharmProjects/TensoRF/renderer.py", line 38, in evaluation
idxs = list(range(0, test_dataset.all_rays.shape[0], img_eval_interval))
ValueError: range() arg 3 must not be zero
So I modified renderer.py like this:
img_eval_interval = 1 if N_vis < 0 else test_dataset.all_rays.shape[0] // N_vis
if img_eval_interval == 0:
    img_eval_interval = 1  # add these 2 lines
And it worked.
Hi, thank you for your great work!
I noticed there is a function named convert_sdf_samples_to_ply in utils.py when I tried to extract a mesh. Could you please tell me how you obtain an SDF from alpha? I did not find this in the paper.
Thank you for your time.
Hi, thanks for your great work!
I'm confused about why the occupancy mask is computed on alpha instead of sigma (density)?
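For context, alpha and sigma are related by the standard volume-rendering opacity formula, which folds in the step size; a small sketch with illustrative numbers:

```python
import math

# Standard NeRF-style conversion from density (sigma) to opacity (alpha).
sigmas = [0.0, 0.5, 5.0]   # per-sample densities along a ray
dist = 0.01                # step length between samples

alphas = [1.0 - math.exp(-s * dist) for s in sigmas]
# Because alpha accounts for the step size, a fixed alpha threshold bounds a
# sample's actual contribution to the rendered pixel, whereas a raw sigma
# threshold would mean different things at different sampling resolutions.
```

This may be why thresholding alpha is the more natural choice for an occupancy mask, though the authors can confirm.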
Hi, @apchenstu
Thank you for sharing this really great idea!
It's very helpful for me to develop other ideas related to Radiance Fields.
By the way, I just noticed that you might get a wrong PSNR value because of the in-place operator (+=).
You assign the image loss variable loss to total_loss, and then increase total_loss with the += operator. This also modifies the original image loss variable loss, resulting in a wrong PSNR; in fact, smaller than the true value.
You can check this at my colab.
Thank you,
Sangmin Kim.
Lines 190 to 198 in 17deeed
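A minimal numpy reproduction of the aliasing (numpy stands in for torch here, whose tensors behave the same way under +=):

```python
import numpy as np

# Buggy pattern: total_loss is the SAME array object as loss.
loss = np.array(0.25)        # image MSE
total_loss = loss            # no copy is made
total_loss += 0.1            # in-place add also mutates `loss`
aliased_mse = float(loss)    # ~0.35, no longer the pure image loss

# Fix: make total_loss a new object before accumulating.
loss = np.array(0.25)
total_loss = loss + 0.0      # fresh array (loss.clone() in torch)
total_loss += 0.1
safe_mse = float(loss)       # still 0.25, PSNR computed correctly
```

Since PSNR is derived from the image MSE, the aliased value inflates the MSE and deflates the reported PSNR, exactly as described above.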
TensoRF looks small and trains very fast. Is it possible to merge the final RGBD results into a single point cloud? If so, how? It would be very helpful if a final point cloud could be generated.
Hi, I have created my own dataset but the results look like this; do you have any idea why?
The way I prepared my data was to capture ~200 images 360 degrees around a small object, then run colmap2nerf and split them into train and test sets. I probably should have segmented out the object itself like the Tanks and Temples dataset, but I didn't due to time constraints; I did, however, use a clean white background. Training takes around 1 hour and reports train psnr = 24, test psnr = 12, mse = 0.003.
thanks!
Hello - I noticed there is a function getDenseAlpha() in tensorBase.py that outputs the alpha values TensoRF would predict on a dense 3D voxel grid.
I am wondering if it would be a good idea to extend this function to also output the RGB values for each cell in the grid (maybe call it getDenseRGBA())? I am thinking we could use such a grid to build an RT renderer (#7), in addition to having some kind of acceleration structure.
Do you perhaps have an idea of how this could work, @apchenstu?
Hello, I wanted to ask about a small implementation detail: what is stored in the test_dataset.render_path variable, and how is it initialized?
Where this becomes an issue: when I pass --render_only 1 --render_test 1 --render_path 1 to train.py (happy to give the full command if needed), it raises an AttributeError on line 84:
c2ws = test_dataset.render_path
I think this is because test_dataset.render_path is None for me.
I can at least see that test_dataset.render_path is referenced in both the reconstruction() and render_test() functions in train.py, but I could not locate where exactly it is initialized in any of the dataLoader classes.
Thank you for any insight on this.
Hi - hardly an issue, more some observations: you need to add kornia, lpips, and tensorboard to your list of dependencies to run train.py. Also, the version of PyTorch you install needs to play nicely with the locally installed version of the CUDA Toolkit; if the installation fails, you're pointed to a website with appropriate instructions, so there is no need to reproduce those here. Thanks for sharing this fantastic work!
Hey, I am trying to understand this piece of code. Would you mind explaining what exactly it is trying to compute? What are rate_a/rate_b and t_min? I might be rusty on 3D geometry.
https://github.com/apchenstu/TensoRF/blob/main/models/tensorBase.py#L281-L284
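If it helps, those lines look like the standard slab-method ray/AABB intersection; a numpy sketch under that assumption:

```python
import numpy as np

# Slab method: for each axis, find the ray parameter t where the ray crosses
# the box's min and max faces; the entry distance is the latest per-axis entry.
aabb_min = np.array([-1.0, -1.0, -1.0])
aabb_max = np.array([ 1.0,  1.0,  1.0])
rays_o = np.array([0.0, 0.0, -3.0])   # ray origin outside the box
rays_d = np.array([0.0, 0.0,  1.0])   # looking down +z

vec = np.where(rays_d == 0, 1e-6, rays_d)   # avoid division by zero
rate_a = (aabb_max - rays_o) / vec          # t at each max-face crossing
rate_b = (aabb_min - rays_o) / vec          # t at each min-face crossing
t_min = np.minimum(rate_a, rate_b).max()    # entry distance into the box
```

So rate_a/rate_b are the per-axis face-crossing parameters, and t_min is where sampling along the ray should start (the code then clamps it to the near/far range).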
Dear authors,
I trained the model on the forward-facing real-noface dataset, but the farthest contents are not learned.
Can you provide the configs that you used to train on that dataset?
Thanks!
Hi Anpei,
I ran a modified version of TensoRF on my own dataset with only 12~20 images as input. The convergence and quality are really nice, but I see quite a lot of floaters in empty space (most seem to stick to the boundary of the bounding box). In addition, the boundary of the object is quite noisy in some novel views. Have you encountered these problems before, and do you have any suggestions for fixing them? Thanks in advance!