alextrevithick / grf
🔥 General Radiance Field (ICCV, 2021)
My own data were captured by multiple different cameras with different intrinsics. Can GRF handle this case?
Thank you!
I'm using the positional encoder from the NeRF paper to encode my images after stacking the viewpoints onto the colors, as described in the paper, but I'm unable to get the shapes to line up for input into the CNN. In NeRF's implementation, the inputs are flattened before being sent to the encoder.
To give a concrete example, here is my code:
import numpy as np
import torch

# reshape inputs to [20, 378, 504, 6], concatenating the view onto the color channels
inputs = torch.tensor(np.concatenate([images, np.broadcast_to(np.expand_dims(C, (1, 2)), images.shape)], axis=-1))
# create the embedder with length 5 as specified in the paper
embed, input_ch = get_embedder(5, 0)
# flatten (not sure if this step is required); shape becomes [3810240, 6]
inputs_flat = torch.reshape(inputs, [-1, inputs.shape[-1]])
# apply the embedding to the flattened inputs for an output shape of [3810240, 66]
embedding = embed(inputs_flat)
I'm not really sure where to go from here to feed this into the CNN.
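For reference, one way to wire this up is to flatten the pixels, apply a NeRF-style positional encoding, reshape back to image layout, and permute to channels-first for the CNN. A minimal sketch (my own stand-in for `get_embedder`, with toy shapes standing in for [20, 378, 504, 6]):

```python
import torch

def positional_encoding(x, num_freqs=5, include_input=True):
    # NeRF-style encoding: [x, sin(2^k x), cos(2^k x)] for k in 0..num_freqs-1
    freqs = 2.0 ** torch.arange(num_freqs, dtype=x.dtype)
    out = [x] if include_input else []
    for f in freqs:
        out.append(torch.sin(f * x))
        out.append(torch.cos(f * x))
    return torch.cat(out, dim=-1)

# Toy batch standing in for [20, 378, 504, 6]: flatten, encode, reshape back.
inputs = torch.rand(2, 4, 4, 6)
flat = inputs.reshape(-1, inputs.shape[-1])                # [32, 6]
enc = positional_encoding(flat, num_freqs=5)               # [32, 6 * (1 + 2*5)] = [32, 66]
enc_img = enc.reshape(*inputs.shape[:-1], enc.shape[-1])   # [2, 4, 4, 66]
# PyTorch CNNs expect channels-first, so permute before the conv stack:
cnn_input = enc_img.permute(0, 3, 1, 2)                    # [2, 66, 4, 4]
```

The reshape back is the key step: the encoding acts per-pixel, so flattening and unflattening around it is lossless.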
Hi, thanks for the great work!
I have some questions:
I'm rather confused by this section: you cite P as a function based on multi-view geometry and then describe two approximations. Do these approximations represent P? I'm also unsure how to implement them beyond checking whether a point falls inside or outside the image; specifically, how do you "duplicate its features to the 3D point"?
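One common reading of this kind of feature aggregation (my own sketch under pinhole-camera assumptions, not the authors' implementation): project the 3D point into the view with the camera matrices, and if it lands inside the image, bilinearly sample the CNN feature map at that pixel, i.e. "duplicate" those 2D features onto the 3D point; points projecting outside get zeros.

```python
import torch
import torch.nn.functional as F

def gather_point_features(feat_map, K, w2c, points):
    """feat_map: [C, H, W] CNN features for one view; K: [3, 3] intrinsics;
    w2c: [3, 4] world-to-camera extrinsics; points: [N, 3] world coordinates.
    Assumes points lie in front of the camera."""
    N = points.shape[0]
    homog = torch.cat([points, torch.ones(N, 1)], dim=-1)   # [N, 4]
    cam = (w2c @ homog.T).T                                 # [N, 3] camera coords
    pix = (K @ cam.T).T                                     # [N, 3]
    uv = pix[:, :2] / pix[:, 2:3].clamp(min=1e-8)           # [N, 2] pixel coords
    H, W = feat_map.shape[1:]
    # Normalize to [-1, 1] for grid_sample, then sample bilinearly.
    grid = torch.stack([uv[:, 0] / (W - 1), uv[:, 1] / (H - 1)], -1) * 2 - 1
    sampled = F.grid_sample(feat_map[None], grid[None, :, None, :],
                            align_corners=True)             # [1, C, N, 1]
    feats = sampled[0, :, :, 0].T                           # [N, C]
    # Zero out points that project outside the image bounds.
    inside = (grid.abs() <= 1).all(dim=-1, keepdim=True)
    return feats * inside
```

This is only one plausible interpretation; nearest-neighbor sampling instead of bilinear would also match the text.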
I noticed that when training on the ShapeNet dataset, your dataset-loading module uses paths like "train" and "train_val", but this is inconsistent with the raw dataset that Vincent provides. May I ask how you organize your project's dataset folders? Thank you in advance.
May I ask how to get the intrinsic matrix of a photo if I want to train GRF on my own data? And without the intrinsic matrix, will GRF's performance significantly degrade?
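While waiting for an answer: a common way to approximate intrinsics for an ordinary photo (a standard pinhole assumption, not the authors' recommendation) is to derive the focal length in pixels from the EXIF focal length and sensor width, and place the principal point at the image center. Tools like COLMAP can also estimate intrinsics directly from the images.

```python
import numpy as np

def intrinsics_from_exif(focal_mm, sensor_width_mm, img_w, img_h):
    """Approximate pinhole intrinsics from the EXIF focal length, assuming
    square pixels and the principal point at the image center."""
    fx = focal_mm / sensor_width_mm * img_w  # focal length in pixels
    fy = fx
    cx, cy = img_w / 2.0, img_h / 2.0
    return np.array([[fx, 0.0, cx],
                     [0.0, fy, cy],
                     [0.0, 0.0, 1.0]])

# e.g. a 35 mm lens on a full-frame (36 mm wide) sensor, 1920x1080 image:
K = intrinsics_from_exif(35.0, 36.0, 1920, 1080)
```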
My GPU has 24 GB of memory.
I decreased the parameters as you suggested:
--chunk [number of rays processed in parallel, decrease if running out of memory] --netchunk [number of pts sent through network in parallel, decrease if running out of memory]
But I still run out of memory even when both parameters are set to 1.
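For context, the chunking these flags control is typically a simple loop like the sketch below (my own illustration, not the GRF code). Note that it mainly bounds per-batch ray/point memory at inference; during training, activations stored for backprop across all chunks can still dominate, so if memory blows up even at chunk=1 the culprit is likely elsewhere (e.g. the CNN forward pass over full-resolution images).

```python
import torch

def run_in_chunks(fn, inputs, chunk):
    """Apply fn to inputs in chunks along dim 0 and concatenate the results,
    bounding peak memory at the cost of more forward passes."""
    return torch.cat([fn(inputs[i:i + chunk])
                      for i in range(0, inputs.shape[0], chunk)], dim=0)

# e.g. 10_000 rays through a toy network, 1024 at a time:
net = torch.nn.Linear(6, 4)
rays = torch.rand(10_000, 6)
with torch.no_grad():
    out = run_in_chunks(net, rays, chunk=1024)
```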
Can it generate a 3D surface similar to PIFu?
In Figure 6 in Section 3.5, the input to the MLP is the 3D point feature and the viewpoint (x, y, z) (corresponding to the 3-D position in classical NeRF). I wonder whether the 2-D viewing direction is also needed as input to the MLP?
Hey, thanks for showing the great work.
I have a question on Figure 4,
"3) To further demonstrate the advantage of GRF over SRNs, we directly evaluate the trained SRNs model on unseen objects (of the same category) without retraining. For comparison, we also directly evaluate the trained GRF model on the same novel objects. Figure 4 shows the qualitative results. It can be seen that if not retrained, SRNs completely fails to".
Since the SRNs model is not trained on the unseen objects, the latent code z for an unseen object is not optimized, so is it randomly initialized? If so, how can it generate novel views of the unseen cars that resemble the GT views? My understanding is that a randomly initialized latent code z would produce unpredictable cars (though similar to the training set), which seems to conflict with the quoted passage. This has confused me for hours.
Hi, Alex
Can you tell me how much time is needed for training on ShapeNet v2 and the other datasets?
Thanks for sharing this very interesting work! Do you have an estimate when the code will be released? I'm thinking about whether I should wait or start implementing myself.
Hi, Alex
I notice that different CNN models are used for different datasets. I wonder whether there were any special considerations in designing these CNNs. And if I want to design a CNN for my own dataset, what should I pay attention to?
Thanks!
Dear author,
Can you provide pre-trained models on the NeRF dataset?
Can you give us more details on training the models (Group 1) on the NeRF dataset?
Thanks
Hello! Thanks for sharing your great work!
I wonder how to get the rendering result of the unseen category/scene.
According to Section 4.3, "We train a single model on randomly selected 4 scenes, i.e., Chair, Mic, Ship, and Hotdog, ...." I wonder how to train several classes in each image batch (e.g., all the NeRF Synthetic scenes together, to get a generalized GRF model). The provided configs each seem to contain only a single class.
My guess: if a batch contains 4 classes and each class has 8 views, then the batch has 32 images in total. Alternatively, a batch could contain 32 images of a single class, and training would see all classes sequentially, one by one.
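The first batching scheme guessed above (4 classes, 8 views each) could be sketched like this (purely hypothetical; not taken from the GRF configs):

```python
import random

def sample_mixed_batch(views_by_class, classes_per_batch=4, views_per_class=8, rng=random):
    """Hypothetical batch sampler matching the guess above: pick several
    classes, then a few (image, pose) views from each."""
    classes = rng.sample(list(views_by_class), classes_per_batch)
    batch = []
    for c in classes:
        batch.extend(rng.sample(views_by_class[c], views_per_class))
    return batch  # 4 * 8 = 32 (image, pose) entries

# toy data: 4 classes with 10 views each, each view an (image_id, pose_id) pair
data = {c: [(f"{c}_img{i}", f"{c}_pose{i}") for i in range(10)]
        for c in ["chair", "mic", "ship", "hotdog"]}
batch = sample_mixed_batch(data)
```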
Can you describe your train configuration for generalization in more detail?
Also, if 2 or 6 views are fed, are only those 2 or 6 images the input, or are the corresponding poses also required?
And, if possible, could you provide all the data required for Sections 4.1–4.3? That would help me understand it most clearly.
Please give me some hint. ;)
Cheers!