
rvidenet's People

Contributors

cao-cong

rvidenet's Issues

GT to train ISP.

Hi Author,

Great work. Could you please answer me two questions?

  1. What is the ground truth data you used to train the ISP from SID data?

In your code train_isp.py, I found gt_paths = glob.glob('./data/ISP_data/SID/Sony/long_isped_png/*.png'), but where do those long_isped_png files come from? I did not find PNG data in the SID dataset.

  2. In train_predenoising.py, there is a function named bayer_preserving_augmentation, but I cannot find its definition. Could you please check it?

Thanks!!
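For context on the second question: the definition is indeed missing from the repo, but "Bayer-preserving augmentation" typically means flipping or transposing a raw mosaic and then shifting the crop by one pixel so the CFA pattern is unchanged. A minimal sketch for the horizontal-flip case, assuming an RGGB pattern (the function name and details are illustrative, not the repo's actual implementation):

```python
import numpy as np

def bayer_preserving_flip_h(raw):
    """Horizontally flip an RGGB Bayer mosaic without breaking the CFA.

    A plain horizontal flip turns RGGB into GRBG; cropping one column
    after the flip realigns the pattern back to RGGB. The trailing
    column is also dropped to keep the width even.

    raw : 2-D array whose top-left pixel is R (RGGB pattern).
    """
    flipped = raw[:, ::-1]
    return flipped[:, 1:-1]

raw = np.arange(16, dtype=np.float32).reshape(4, 4)  # pretend RGGB mosaic
aug = bayer_preserving_flip_h(raw)                   # shape (4, 2), still RGGB
```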

Some questions about the dataset

Congratulations on your paper accepted by CVPR!
I have a few questions about the dataset (quite a few; I originally asked them in Chinese, and if you mind, I can delete the questions and close the issue):

  1. The way you collected the real video dataset feels similar to stop-motion animation?
  2. Each video has relatively few frames (and since collection is difficult, the total amount also seems small), and the difference between adjacent frames is fairly large, which on video corresponds to very fast motion?
  3. Real video capture produces motion blur because of the exposure time. Is this the biggest difference between the sequences generated by this method and real video sequences?
  4. The GT is obtained by multi-frame averaging, but noise at low light and high ISO has a non-zero mean (this is mentioned in the SIDD collection procedure in your references), so multi-frame averaging causes an intensity shift. How did you handle this (as well as possible alignment issues)?
  5. Some GT frames still contain obvious noise, so you applied BM3D. Is it sufficiently reasonable to use the output of another method as GT?
  6. The noise model used for the synthetic dataset may be a bit old and not accurate enough (for example, it does not consider horizontal banding noise)?
  7. Why use SID for the pre-denoising part? SID's GT also has fairly obvious noise; if raw images are needed, wouldn't FiveK be the better choice?
  8. The results include models trained on the mixed dataset and on the purely synthetic dataset. Are there results from training purely on the real dataset?

what about unprocessed video denoising results

I see that you use unprocessed videos plus a Poisson-Gaussian (PG) noise model to generate enough training video pairs, then fine-tune on the real video dataset.

Considering that there are only 6 train/dev real scenes and the unprocessing method can synthesize a much larger dataset, what are the results before fine-tuning?

Thx.
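The unprocessing pipeline referred to above pairs clean frames with synthetic Poisson-Gaussian noise; a minimal NumPy sketch of that noise model (parameter values are illustrative, not the paper's calibrated ones):

```python
import numpy as np

def add_poisson_gaussian_noise(clean, shot=0.01, read=0.001, rng=None):
    """Apply heteroscedastic Poisson-Gaussian noise to a clean raw frame.

    clean : float array with values in [0, 1]
    shot  : shot-noise scale (larger = noisier; 1/shot photons per unit)
    read  : standard deviation of the additive Gaussian read noise
    """
    rng = np.random.default_rng() if rng is None else rng
    # Shot noise: photon counts follow a Poisson distribution,
    # so the variance grows linearly with the signal.
    noisy = rng.poisson(clean / shot) * shot
    # Read noise: signal-independent additive Gaussian component.
    noisy = noisy + rng.normal(0.0, read, size=clean.shape)
    return noisy.astype(np.float32)

clean = np.full((4, 4), 0.5, dtype=np.float32)
noisy = add_poisson_gaussian_noise(clean, shot=0.01, read=0.001)
```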

CRVD dataset download

Hi, downloading CRVD from the Baidu netdisk is too slow, and I can't open the MEGA link.

So, could you please release the dataset on Google Drive?

Thanks a lot.

Training

I'm confused about why you first train the network with data from the MOT Challenge (which is sRGB, converted to raw) and then train it with your own raw dataset.
BTW, could you please share the training code in detail? Thanks!

Provide shot and read noise parameters

Hi,

As was said in another issue, the .TIFF files do not contain the original image metadata. The RViDeNet article mentions that the shot and read noise parameters can be estimated using bias frames and flat-field frames. It is also known in the literature that these parameters depend on the analog and digital gain / ISO settings.
Since we do not have access to these images and that specific camera, could you provide the shot and read noise parameters for each ISO?

multi gpu to train model

I want to use multiple GPUs to reduce training time. I see that the GPU setting defaults to an int, so the GPU count must be one. I'd like to know whether this code can use multiple GPUs.
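If the repo's model is a standard PyTorch nn.Module, the usual way to split batches across GPUs is torch.nn.DataParallel (a sketch under that assumption; the Sequential model here is a hypothetical stand-in, not RViDeNet itself):

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the denoising network; substitute the
# actual model when adapting this to the repo's training script.
model = nn.Sequential(nn.Conv2d(4, 16, 3, padding=1),
                      nn.ReLU(),
                      nn.Conv2d(16, 4, 3, padding=1))

# DataParallel splits each batch across all visible GPUs and gathers
# the outputs; with zero or one GPU it simply runs the wrapped module.
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)
model = model.to('cuda' if torch.cuda.is_available() else 'cpu')

batch = torch.randn(8, 4, 64, 64,
                    device=next(model.parameters()).device)
out = model(batch)  # shape: (8, 4, 64, 64)
```

Note that the per-GPU batch becomes batch_size / num_gpus, so the batch size may need to be scaled up accordingly.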

About training data

Hello, what amazing work! But some things confuse me. You pretrained a single-frame denoising network and an Image Signal Processing (ISP) network with 230 clean raw images from the SID dataset. Are the training sets of the two models the same? I notice that the SID dataset has released two parts, Sony (25 GB) and Fuji (52 GB). Would you tell me which one you used, or provide a link to it? I am very interested in this work. I hope you can release the training code soon! Thank you very much!

Metadata of original raw images

I find that the .tiff images do not seem to contain the metadata of the raw images. Can you also provide the metadata files for the training and testing data, or simply the color matrix and white balance parameters?

Test results

Hi,
I have trained the network on SRVD for 33 epochs, and the test results are: PSNR about 27 and SSIM about 0.7.
Also, the visual results are much redder than the ground truth.

I'd like to know if there are problems with my training.
Looking forward to your reply. Thanks!
@cao-cong

License for the repo

Hi there, thanks for sharing the code!

I am trying to use the code and cite it in my paper, but I need clarity on the license requirements.

If you have time, would you mind updating the License information of your code base?

Thanks a lot!

[REQ] add a (GH-compliant) license file

Bump #15

Hi there, 1st of all thanks for this awesome work !

Since we've "doxed" it in our HyMPS project (under the VIDEO \ AI-based \ Denoisers subsection), can you please add a "GH-compliant" license file?

Making the licensing terms explicit is extremely important to let other devs (and not only devs) understand how to reuse/adapt/modify your code in other open projects, and vice versa.

Although it may sound like a minor aspect, the omission of a license file also causes inconsistent generation of the corresponding badge:


(generative URL: https://flat.badgen.net/github/license/cao-cong/RViDeNet/?label=LICENSE)

Anyway, you can easily set a standardized one through GitHub's license wizard tool.

Last but not least, let us know how - in your opinion - we could improve our categorizations and links to resources in order to favor collaboration between developers (and therefore evolution) of listed projects.

Thanks in advance.

Dataset Download Difficulty

Hello 👋 Your dataset looks very interesting but I am having trouble downloading it. The first link ("mega") requires payment and the second link ("baidu") requires me to install extra software. Is there another link to download the dataset I am not finding? Thank you for your help.

shot noise calibrate question

As in issue #2, I feel it is easier to ask in Chinese.

  1. In the noise-calibration steps described in the supplementary material, I see that the slopes in Figure 3 are all negative when the pixel intensity is 0. Although in practice the pixel intensity is always above the black level, my question is how to judge whether the calibrated parameters are correct.

  2. When the pixel intensity equals the black level, the fitted variance line should read $2\sigma_r$. Is this value very different from the one calibrated from bias frames?

  3. One more question: the pixel intensity values in Figure 3 are quite large. I saw you mention that the sensor raw's black level is 240 and the white level is 2^12-1; are these intensity values within the white-level range?

Thx.
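For reference, the mean-variance calibration these questions are about fits a line to noise variance as a function of intensity measured from flat-field frames; the slope gives the shot-noise parameter and the intercept the read-noise variance. A self-contained sketch on synthetic data (all parameter values are illustrative, not the camera's):

```python
import numpy as np

rng = np.random.default_rng(0)

# Ground-truth Poisson-Gaussian parameters for the simulation:
# var(y) = shot * mean + read_var
shot_true, read_var_true = 0.01, 0.5

means = np.linspace(50.0, 3000.0, 20)
variances = []
for m in means:
    # Simulate 100k flat-field pixels at this mean intensity.
    y = rng.poisson(m / shot_true, size=100_000) * shot_true
    y = y + rng.normal(0.0, np.sqrt(read_var_true), size=100_000)
    variances.append(y.var())

# Line fit: slope estimates shot noise, intercept the read-noise variance.
shot_est, read_var_est = np.polyfit(means, variances, 1)
```

One sanity check the first question asks about: the recovered intercept should agree with the variance measured directly from bias frames, and neither should be negative over the valid intensity range.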

temporal loss question

I am not clear on how to construct four frames to calculate the temporal loss.
For any given time t, the t-1, t, t+1 frames are fixed; what about the other frame?

Besides, I find that you disable the temporal loss in the first training stage because it is time-consuming, and I do not find any discussion of the temporal loss's advantages in the paper or the supplementary material. How does it work?

Thx.

about network

Hello, I'm interested in this work, too. My questions are as follows: 1) What does raw domain processing mean in the ablation study? If we don't use raw domain processing, what does the network look like? Is it EDVR? 2) Have you tried other ISP models? I used a U-Net to demosaic, but the result is not good, even worse than 8 residual blocks. 3) How do you retrain TOFlow? Using 3 or 7 sRGB images as input? I hope to hear from you soon~

parameters analysis question in ablation study

In section 5.2 Ablation Study,

By incorporating the packing strategy in raw denoising, i.e. processing the RGBG sub-sequences separately and merging them in the final stage, the denoising performance is nearly the same as that of the unpacking version. However, the parameters are greatly reduced since we only extract 16 channel features for each sub-sequence and the unpacking version extract 64 channel features.

I am confused by "the parameters are greatly reduced": how exactly are they greatly reduced?

Thx.
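The reduction follows from conv parameters scaling quadratically with channel width; a quick back-of-the-envelope comparison (3x3 convs and these channel counts are illustrative, not the paper's exact layer list):

```python
def conv_params(c_in, c_out, k=3):
    """Parameter count of a single k x k conv layer (bias ignored)."""
    return k * k * c_in * c_out

# Unpacked: one branch operating on 64-channel features.
unpacked = conv_params(64, 64)      # 36864 parameters

# Packed: four RGBG branches, each on 16-channel features.
packed = 4 * conv_params(16, 16)    # 9216 parameters

print(unpacked / packed)  # 4.0 -> about 4x fewer parameters per layer
```

Per layer, four 16-channel branches cost a quarter of one 64-channel branch; if the four branches additionally share weights, the per-layer saving grows to 16x.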
