
neuray's Introduction

NeuRay

Rendered video without training on the scene.

Todo List

  • Generalization models and rendering code.
  • Training code for the generalization models.
  • Finetuning code and finetuned models.

Usage

Setup

git clone git@github.com:liuyuan-pal/NeuRay.git
cd NeuRay
pip install -r requirements.txt
Dependencies
  • torch==1.7.1
  • opencv_python==4.4.0
  • tensorflow==2.4.1
  • numpy==1.19.2
  • scipy==1.5.2

Download datasets and pretrained models

  1. Download processed datasets: DTU-Test / LLFF / NeRF Synthetic.
  2. Download pretrained model NeuRay-Depth and NeuRay-CostVolume.
  3. Organize datasets and models as follows
NeuRay
|-- data
    |--model
        |-- neuray_gen_cost_volume
        |-- neuray_gen_depth
    |-- dtu_test
    |-- llff_colmap
    |-- nerf_synthetic

Render

# render on lego of the NeRF synthetic dataset
# (use nerf_synthetic/lego/black_400 for the low-resolution setting)
python render.py --cfg configs/gen/neuray_gen_depth.yaml \
                 --database nerf_synthetic/lego/black_800 \
                 --pose_type eval

# render on snowman of the DTU dataset
# (use dtu_test/snowman/black_400 for the low-resolution setting)
python render.py --cfg configs/gen/neuray_gen_depth.yaml \
                 --database dtu_test/snowman/black_800 \
                 --pose_type eval

# render on fern of the LLFF dataset
# (use llff_colmap/fern/low for the low-resolution setting)
python render.py --cfg configs/gen/neuray_gen_depth.yaml \
                 --database llff_colmap/fern/high \
                 --pose_type eval

The rendered images are located in data/render/<database_name>/<renderer_name>-pretrain-eval/. If pose_type is eval, we also generate ground-truth images in data/render/<database_name>/gt.

Explanation of the parameters of render.py:

  • cfg is the path to the renderer config file, which can also be configs/gen/neuray_gen_cost_volume.yaml.
  • database is a database name consisting of <dataset_name>/<scene_name>/<scene_setting> (see the small parsing sketch below).
    • nerf_synthetic/lego/black_800 means the scene "lego" from the "nerf_synthetic" dataset with a "black" background at the resolution 800x800.
    • dtu_test/snowman/black_800 means the scene "snowman" from the "dtu_test" dataset with a "black" background at the resolution 800x600.
    • llff_colmap/fern/high means the scene "fern" from the "llff_colmap" dataset at "high" resolution (1008x756).
    • We may also use llff_colmap/fern/low, which renders at "low" resolution (504x378).
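As a reading aid, here is a minimal sketch of how such a database name decomposes into its three parts. The helper name split_database_name is hypothetical and not part of the repository:

# Hypothetical helper, not part of the repository: split a database name of
# the form <dataset_name>/<scene_name>/<scene_setting> into its components.
def split_database_name(database_name):
    dataset_name, scene_name, scene_setting = database_name.split('/')
    return dataset_name, scene_name, scene_setting

# prints ('nerf_synthetic', 'lego', 'black_800')
print(split_database_name('nerf_synthetic/lego/black_800'))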

Evaluation

# psnr/ssim/lpips will be printed on screen
python eval.py --dir_pr data/render/<database_name>/<renderer_name>-pretrain-eval \
               --dir_gt data/render/<database_name>/gt

# example of evaluation on "fern".
# note we should already render images in the "dir_pr".
python eval.py --dir_pr data/render/llff_colmap/fern/high/neuray_gen_depth-pretrain-eval \
               --dir_gt data/render/llff_colmap/fern/high/gt
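If you want to sanity-check the metrics outside of eval.py, a rough sketch using scikit-image is shown below. This is an assumption about the metric definitions, not the repository's evaluation code (eval.py may use different implementations, and LPIPS additionally needs the lpips package):

from pathlib import Path
import numpy as np
from skimage.io import imread
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate(dir_pr, dir_gt):
    # Average PSNR/SSIM over all image pairs with matching file names.
    psnrs, ssims = [], []
    for gt_fn in sorted(Path(dir_gt).glob('*.png')):
        gt = imread(gt_fn).astype(np.float32) / 255.0
        pr = imread(str(Path(dir_pr) / gt_fn.name)).astype(np.float32) / 255.0
        psnrs.append(peak_signal_noise_ratio(gt, pr, data_range=1.0))
        # channel_axis requires scikit-image >= 0.19 (older versions use multichannel=True)
        ssims.append(structural_similarity(gt, pr, channel_axis=-1, data_range=1.0))
    return float(np.mean(psnrs)), float(np.mean(ssims))

# Example usage (both directories must already contain the images):
# print(evaluate('data/render/llff_colmap/fern/high/neuray_gen_depth-pretrain-eval',
#                'data/render/llff_colmap/fern/high/gt'))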

Render on custom scenes

To render on custom scenes, please refer to this

Generalization model training

Download training sets

  1. Download Google Scanned Objects, RealEstate10K, the Spaces dataset and the LLFF-released scenes from IBRNet.
  2. Download COLMAP depth for the forward-facing scenes here.
  3. Download the DTU training images here.
  4. Download COLMAP depth for the DTU training images here.

Rename the directories and organize the datasets as follows:

NeuRay
|-- data
    |-- google_scanned_objects
    |-- real_estate_dataset # RealEstate10k-subset  
    |-- real_iconic_noface
    |-- spaces_dataset
    |-- colmap_forward_cache
    |-- dtu_train
    |-- colmap_dtu_cache

Train generalization model

Train the model with NeuRay initialized from depth estimated by COLMAP.

python run_training.py --cfg configs/train/gen/neuray_gen_depth_train.yaml

Train the model with NeuRay initialized from constructed cost volumes.

python run_training.py --cfg configs/train/gen/neuray_gen_cost_volume_train.yaml

Models will be saved in data/model. Every 10k steps, we validate the model and save the resulting images in data/vis_val/<model_name>-<val_set_name>.

Render with trained models

python render.py --cfg configs/gen/neuray_gen_depth_train.yaml \
                 --database llff_colmap/fern/high \
                 --pose_type eval

Scene-specific finetuning

Finetuning

# finetune on lego from the NeRF synthetic dataset
python run_training.py --cfg configs/train/ft/neuray_ft_depth_lego.yaml

# finetune on fern from the LLFF dataset
python run_training.py --cfg configs/train/ft/neuray_ft_depth_fern.yaml

# finetune on birds from the DTU dataset
python run_training.py --cfg configs/train/ft/neuray_ft_depth_birds.yaml

# finetune the model initialized from cost volume
python run_training.py --cfg configs/train/ft/neuray_ft_cv_lego.yaml

The finetuned models will be saved at data/model.

Finetuned models

We provide finetuned models for the NeRF synthetic dataset here.

Download the models and organize the files as follows:

NeuRay
|-- data
    |-- model
        |-- neuray_ft_lego_pretrain
        |-- neuray_ft_chair_pretrain
        ...

Render with finetuned models

# render on lego of the NeRF synthetic dataset
python render.py --cfg configs/ft/neuray_ft_lego_pretrain.yaml \  
                 --database nerf_synthetic/lego/black_800 \
                 --pose_type eval \
                 --render_type ft

Code explanation

We provide an explanation of the variable naming conventions here to make the code more readable.

Acknowledgements

In this repository, we use code and datasets from the following repositories. We thank all the authors for sharing their great code and datasets.

Citation

@inproceedings{liu2022neuray,
  title={Neural Rays for Occlusion-aware Image-based Rendering},
  author={Liu, Yuan and Peng, Sida and Liu, Lingjie and Wang, Qianqian and Wang, Peng and Theobalt, Christian and Zhou, Xiaowei and Wang, Wenping},
  booktitle={CVPR},
  year={2022}
}

neuray's People

Contributors

cwchenwang, liuyuan-pal


neuray's Issues

A question about "depth_range"

Hello, I noticed that in your database.py, DTUTrainDatabase's "depth_range" is a fixed value, while GoogleScannedObjectDatabase's "depth_range" is computed from the pose. What is the difference between these two approaches? And if I want to run your code on the ShapeNet dataset, which way do you recommend for getting depth_range?

DTUTrain: (screenshot)
GoogleScannedObject: (screenshot)
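For readers with the same question: one common pattern for object-centric datasets is to derive near/far from the camera-to-object distance, while fixed values are used when the capture setup is known in advance (as for DTU). Below is a rough illustrative sketch of the pose-based variant under that assumption, not a quote of the repository's code:

import numpy as np

def depth_range_from_pose(pose, object_radius=1.0, eps=1e-2):
    # pose: 3x4 world-to-camera matrix [R|t]; the camera center is -R^T @ t.
    # Assumes the object is roughly centered at the world origin with a known
    # bounding radius; this illustrates the idea, not NeuRay's exact computation.
    R, t = pose[:3, :3], pose[:3, 3]
    dist = np.linalg.norm(-R.T @ t)
    near = max(dist - object_radius, eps)
    far = dist + object_radius
    return near, far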

Question about rendering a custom dataset

In the documentation's "python run_colmap.py --example_name desktop --colmap ", where should --colmap point? The directory structure of the models and data exactly matches the documentation.

Kind reminder that there may be some minor mistakes in the 'configs/' folder

Hi, thanks for your excellent work!
Sorry to bring to your attention that in "configs/train/ft/neuray_ft_depth_birds.yaml",
the "database_name" should be "dtu_test/birds/black_800", not "llff_colmap/birds/high".
Also, in "configs/train/ft/neuray_ft_depth_fern.yaml", the "database_name" should be "llff_colmap/fern/high", not "dtu_test/fern/black_800".
Otherwise it throws an error when finetuning on the DTU and LLFF datasets.

About the depth used in init_net.py

Thank you for your contribution!

I notice that "DepthInitNet" and "CostVolumeInitNet" both use the depth image. Does it mean the model is not end-to-end for the dataset without depth information, because you should get the depth using colmap firstly.

If my dataset is relatively large and there is no depth information or it is time-consuming to calculate depth, how should I use your code to run it?

Thank you very much.

Why are inconsistent parameters set in the generalization model config and the fine-tuning model config?

(1)
In configs/train/gen/train_gen_cost_volume_train.yaml, use_vis is false.

fine_dist_decoder_cfg:
    use_vis: false

while in configs/train/ft/neuray_ft_cv_lego.yaml, the use_vis of fine_dist_decoder_cfg takes the default value of dist_decoder: true.

This causes the pretrained generalization model to not load correctly into the fine-tuning model, because the vis-encoder settings in the dist-decoder differ. So what should use_vis of fine_dist_decoder_cfg be?

(2) I noticed that use_self_hit_prob is only set to true in the fine-tuning model. Why not set it to true consistently in the generalization model as well?

What's the difference between using and not using use_src_imgs?

Hello. This is great work, but I have some problems understanding the code.

In dataset/train_dataset.py, there is some code I do not understand.

ref_imgs_info, ref_cv_idx, ref_real_idx = build_src_imgs_info_select(database,ref_ids,ref_ids_all,self.cfg['cost_volume_nn_num'])
....
if self.cfg['use_src_imgs']:
    src_imgs_info = ref_imgs_info.copy()
    ref_imgs_info = imgs_info_slice(ref_imgs_info, ref_real_idx)
    ref_imgs_info['nn_ids'] = ref_cv_idx
else:
    # 'nn_ids' used in constructing cost volume (specify source image ids)
    ref_imgs_info['nn_ids'] = ref_idx.astype(np.int64)

What do src_imgs and ref_imgs mean, respectively? Why do you directly assign a copy of ref_imgs_info to src_imgs_info?

And what do ref_cv_idx and ref_real_idx mean, respectively?

A question about all loss values being NaN

Have you ever encountered cases where all loss values become NaN?
I ran your code on the ShapeNet dataset; training was fine at first, but after a few thousand steps all the losses became NaN, with a warning that the input tensor may contain NaN or Inf.
Do you have any experience with this?
I implemented the ShapeNet dataset class myself, modeled after the other dataset classes in database.py. In it, the masks are all set to 1 and the depth maps are all set to 0, because I use your cost-volume model.
The parameter I am not sure about is the background: I set the background to white, while your training and testing seem to use black. Does this matter?
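For readers debugging a similar NaN, a general PyTorch tip (not specific to NeuRay): anomaly detection reports the backward op that produced the first non-finite gradient, and checking the loss lets you stop at the offending step.

import torch

# General PyTorch debugging aid, unrelated to any NeuRay-specific setting.
torch.autograd.set_detect_anomaly(True)

# Inside the training loop (step and loss are whatever your loop defines):
# if not torch.isfinite(loss):
#     raise RuntimeError(f'non-finite loss at step {step}: {loss.item()}')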

How to get the accuracy presented in Tab. 1?

Thank you so much for releasing the code.

I noticed that only one PSNR/SSIM number is reported in Table 1. However, because there are N scenes in DTU/LLFF, the evaluation code produces N PSNR/SSIM numbers.

Do you obtain the final PSNR in Tab. 1 as "(PSNR_1 + PSNR_2 + ... + PSNR_N)/N",

or with some other approach?

Thank you for your attention.

finetuning on custom dataset?

Hi,

I've run the generalization model on some custom data; it does quite a nice job on my relatively sparse views compared with the few other tools I've been trialling. Is it possible to run fine-tuning on custom data to further improve the performance? I'm not quite sure I understand how the configs should be adapted.

depth

Hi. In the DTU dataset, I found a big difference between the depth maps estimated with COLMAP (which you provide) and the ground-truth depth maps. Have you checked this?

scan1, view 0, 300x400:
ground-truth depth map: (screenshot)
depth map estimated with COLMAP: (screenshot)

Could you check it? Thanks!!!

A question about color c

Great work. After reading your paper, I have a question: how are the local features fi,j aggregated with vi,j to compute the alpha values and the color c? While I can understand how the alpha values are calculated in Section 3.6, I'm curious about the process for computing the color c.
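For readers with the same question, below is a schematic sketch of visibility-weighted blending in the spirit of IBRNet-style aggregation. It illustrates the general idea only and is not the authors' exact network:

import torch

def blend_color(src_colors, blend_logits, visibility):
    # src_colors:   (N, 3) colors sampled from N source views at one 3D point
    # blend_logits: (N,)   per-view blending scores predicted from local features
    # visibility:   (N,)   per-view visibility in [0, 1]
    # Down-weight occluded views before normalizing the blending weights.
    weights = torch.softmax(blend_logits + torch.log(visibility.clamp_min(1e-6)), dim=0)
    return (weights[:, None] * src_colors).sum(dim=0)  # blended color, shape (3,)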

What's the usage of the get_diff_feats function?

Hi Authors,

Thanks for sharing the code of the great work.
Could you please explain a little bit about the usage of the get_diff_feats function?

  1. Why do you invert the near/far planes?
    near_inv, far_inv = -1 / near[..., None], -1 / far[..., None]
  2. Why do you renormalize the input depth?
    depth_in = depth_in * (far_inv - near_inv) + near_inv
    depth = -1 / depth_in
  3. Why do you unproject the depth to a point cloud and then project it back (function "project_points_ref_views")?

Thanks
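For readers puzzling over the same snippet: the renormalization maps depth_in in [0, 1] linearly in (negative) inverse depth, so 0 corresponds to the near plane, 1 to the far plane, and samples are spaced uniformly in disparity rather than in metric depth. That reading is the standard interpretation of this parameterization, not an authors' statement; a standalone numeric sketch:

import numpy as np

near, far = 1.0, 100.0
near_inv, far_inv = -1.0 / near, -1.0 / far

for depth_in in (0.0, 0.5, 1.0):
    # Linear interpolation in negative inverse depth, then back to depth.
    depth = -1.0 / (depth_in * (far_inv - near_inv) + near_inv)
    print(depth_in, depth)  # 0.0 -> 1.0 (near), 0.5 -> ~1.98, 1.0 -> 100.0 (far)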

Environment issue

Thanks for sharing your work. I have tried many approaches, and all of them raise errors.

(py38) root@23504c294479:/cmdata/docker/yfq/NeuRay# python render.py 
Traceback (most recent call last):
  File "render.py", line 12, in <module>
    from network.renderer import name2network
  File "/cmdata/docker/yfq/NeuRay/network/renderer.py", line 13, in <module>
    from network.init_net import name2init_net, DepthInitNet, CostVolumeInitNet
  File "/cmdata/docker/yfq/NeuRay/network/init_net.py", line 5, in <module>
    from inplace_abn import ABN
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/inplace_abn/__init__.py", line 1, in <module>
    from .abn import ABN, InPlaceABN, InPlaceABNSync
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/inplace_abn/abn.py", line 8, in <module>
    from .functions import inplace_abn, inplace_abn_sync
  File "/opt/conda/envs/py38/lib/python3.8/site-packages/inplace_abn/functions.py", line 8, in <module>
    from . import _backend
ImportError: /opt/conda/envs/py38/lib/python3.8/site-packages/inplace_abn/_backend.cpython-38-x86_64-linux-gnu.so: undefined symbol: THPVariableClass
(py38) root@23504c294479:/cmdata/docker/yfq/NeuRay# 

Could you help take a look, or provide the exact Python version, CUDA version, and versions of the third-party libraries?
A working Docker image would be even better.

Difference between 'pixel_colors_nr' and 'pixel_colors_nr_fine'

Thank you for sharing the source code.
However, I have a question about the difference between 'pixel_colors_nr' and 'pixel_colors_nr_fine', which both appear to represent rendered color values.
It seems that both are obtained by sampling 64 points and rendering 'c'.
Are the values reported in Table 1 obtained using only one of these, or is a different sampling method used to obtain them?

Code for training baseline models on your dataset?

Thanks for the excellent work. I noticed that you trained other baselines (e.g. PixelNeRF) on the same dataset as yours, but I found that adapting the extra data to PixelNeRF was not successful. Could you also provide the code for training the baseline models?

Question about custom datasets

I noticed that your desktop example data contains only images, and I can already render from these images. However, in other NeRF projects, datasets such as DTU come with a file of camera parameters and world coordinates. Do you know how this file is generated?
Does it come with the dataset, or is it generated with tools such as COLMAP?

Error when rendering custom data

I downloaded the models and test data following the documentation and verified that python render.py on the DTU data generates images correctly, but rendering custom data raises an error (still using the desktop data you provide, with the directory structure organized as in the documentation).

python run_colmap.py --example_name desktop --colmap data/example/desktop

Traceback (most recent call last):
  File "run_colmap.py", line 27, in <module>
    process_example_dataset(flags.example_name,flags.same_camera,flags.colmap_path)
  File "/cmdata/docker/yfq/NeuRay/colmap_scripts/process.py", line 38, in process_example_dataset
    db.add_image(img_fn.name, cam_id)
  File "/cmdata/docker/yfq/NeuRay/colmap/database.py", line 181, in add_image
    prior_q[3], prior_t[0], prior_t[1], prior_t[2]))
sqlite3.IntegrityError: UNIQUE constraint failed: images.name

Could you help take a look? Many thanks!

License

Very impressive results from your paper! And thank you for releasing the code! I was wondering if you could add a LICENSE file to your repo? Thank you!

ImportError: cannot import name 'select_working_views_by_overlap' from 'utils.view_select'

Thank you for your sharing!
I downloaded the nerf_synthetic dataset and your pre-trained model. When I run

python render.py --cfg configs/gen/neuray_gen_depth.yaml --database nerf_synthetic/lego/black_800 --pose_type eval

There is an error:

Traceback (most recent call last):
File "render.py", line 12, in
from network.renderer import name2network
File "/home/hyx/NeuRay/network/renderer.py", line 21, in
from utils.view_select import compute_nearest_camera_indices, select_working_views, select_working_views_by_overlap
ImportError: cannot import name 'select_working_views_by_overlap' from 'utils.view_select'

I found that the function 'select_working_views_by_overlap' is commented out, so I uncommented it and ran again. But then there is another error:

Traceback (most recent call last):
File "render.py", line 210, in
render_video_gen(flags.database_name, cfg_fn=flags.cfg, pose_type=flags.pose_type, pose_fn=flags.pose_fn,
File "render.py", line 113, in render_video_gen
ref_ids_list = select_working_views_db(database, ref_ids_all, que_poses, render_cfg['min_wn'])
File "/home/hyx/NeuRay/utils/view_select.py", line 86, in select_working_views_db
indices = select_working_views(ref_poses, que_poses, work_num, exclude_self)
File "/home/hyx/NeuRay/utils/view_select.py", line 21, in select_working_views
dists = np.linalg.norm(ref_cam_pts[None, :, :] - render_cam_pts[:, None, :], 2, 2) # qn,rfn
IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed

Could you please help me solve this problem? Thank you for your time!

An error about inplace-abn

Hello, after configuring the environment, I encountered the following error when running your test code.
Do you know what the reason might be?
(screenshot)

Difference between IBRnetwithNeuray model and Neuray model

Hi, thanks for sharing the code for this amazing work!

If I understand correctly, IBRNet does not consider the visibility of each view, while NeuRay considers it with the help of the depth map. In the implementation, I saw there is a class called IBRNetWithNeuRay:

class IBRNetWithNeuRay(nn.Module):

I wonder whether, in terms of implementation, this model would have the same performance as the NeuRay model itself (NeuralRayGenRenderer):

class NeuralRayGenRenderer(NeuralRayBaseRenderer):

Or does the NeuRay model have other designs that make it even better?

Thank you!

Best,
Wenzheng

Question for the rendering

I saw two rendering methods in the code.

  1. visibility, hitting probability -> IBRNet -> density, color -> alpha -> hitting probability -> images
  2. visibility -> alpha -> hitting probability -> images

However, in the following code:
alpha_values, visibility, hit_prob = self.dist_decoder.compute_prob( prj_dict['depth'].squeeze(-1), que_dists.unsqueeze(0), prj_mean, prj_var, prj_vis, prj_aw, True, ref_imgs_info['depth_range'])
we can obtain the hitting probability; can we directly render images via hit_prob?
Thanks
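For context, in standard volume rendering the per-sample weights play exactly this role, so hit probabilities can in principle be composited directly with per-sample colors; whether this matches the repository's actual rendering path is a question for the authors. A schematic sketch:

import torch

def composite_with_hit_prob(hit_prob, colors):
    # hit_prob: (num_rays, num_samples)    per-sample hitting probabilities (weights)
    # colors:   (num_rays, num_samples, 3) per-sample colors
    # Pixel color as the hit-probability-weighted sum of the sample colors.
    return (hit_prob[..., None] * colors).sum(dim=-2)  # (num_rays, 3)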

Custom scene rendering

Hi

I am trying to render a custom scene:

python run_colmap.py --example_name desktop \
                     --colmap <path-to-your-colmap>  # note we need the dense reconstruction

Here, what do you mean by "path-to-your-colmap"?

colmap image_undistorter

After running colmap image_undistorter, the image sizes are inconsistent:
the depth map sizes (*geometric.bin) are not uniform, and the code does not work.

finished

Hello, in this function, my understanding is that pose stores the camera's position and orientation in the world coordinate system, so the first cam_xyz should already be a coordinate in the camera coordinate system. Why does it go through another world-to-camera transformation (cam_xyz = rot @ cam_xyz + trans)? Or is my understanding wrong? Could you please explain this to me?

Question for the coordinate?

Are the coordinates used here OpenCV coordinates, different from the OpenGL coordinates used by NeRF?

When I load the poses of the NeRF Synthetic dataset, why do I need to multiply by diag[1, -1, -1]?

The code is in the NeRFSyntheticDatabase class in dataset/database.py:
def parse_info(self, split='train'):
    with open(f'{self.root_dir}/transforms_{split}.json','r') as f:
        # load the data
        img_info = json.load(f)
    # camera field of view (converted to focal length below)
    focal = float(img_info['camera_angle_x'])
    # store image ids and poses
    img_ids, poses = [], []
    for frame in img_info['frames']:
        img_ids.append('-'.join(frame['file_path'].split('/')[1:]))
        pose = np.asarray(frame['transform_matrix'], np.float32)
        # rotation matrix
        R = pose[:3,:3].T
        t = -R @ pose[:3,3:]
        R = np.diag(np.asarray([1,-1,-1])) @ R
        t = np.diag(np.asarray([1,-1,-1])) @ t
        poses.append(np.concatenate([R,t],1))
    h, w, _ = imread(f'{self.root_dir}/{self.img_id2img_path(img_ids[0])}.png').shape
    focal = .5 * w / np.tan(.5 * focal)
    # intrinsics
    K = np.asarray([[focal,0,w/2],[0,focal,h/2],[0,0,1]], np.float32)
    return img_ids, poses, K
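For readers with the same question: the multiplication by diag([1, -1, -1]) flips the y and z camera axes, converting the OpenGL/Blender convention of the NeRF synthetic transforms.json (x right, y up, z pointing backward) into the OpenCV convention (x right, y down, z pointing forward). A standalone restatement of the same math as the snippet above, for illustration only:

import numpy as np

def nerf_c2w_to_opencv_w2c(c2w):
    # c2w: 4x4 (or 3x4) camera-to-world pose in the OpenGL/Blender convention.
    R = c2w[:3, :3].T                  # world-to-camera rotation
    t = -R @ c2w[:3, 3:]               # world-to-camera translation
    flip = np.diag([1.0, -1.0, -1.0])  # flip y and z camera axes: OpenGL -> OpenCV
    return flip @ R, flip @ t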

DTU testing dataset

Hi,

Thank you for your amazing work. I understand that for DTU, you're evaluating only for 4 scans (birds, tools, bricks and snowman). However, I wanted to evaluate NeuRay for all the scans included in dtu_test_scans.txt. Hence, I'd be grateful if you could share the corresponding processed dataset, if you happen to have it.

Thanks in advance
