m-tassano / fastdvdnet
FastDVDnet: A Very Fast Deep Video Denoising algorithm
License: MIT License
Congratulations on your paper, really nice and well-explained work! I am implementing some modifications, namely, I am adding extra Poisson noise on the videos. You mention that the method can be extended to Poisson, but did you perform any such experiments of your own? Had you identified potential pitfalls?
Moreover, I am adjusting your method to work independently of future frames. Specifically, I am using frames t-3, t-2, t-1 and t in two triplet combinations [(t-2, t-1, t), (t-3, t-1, t)], with two blocks instead of three. Your ablation study with one block was a helpful reference here, thanks!
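For anyone experimenting along the same lines, here is a minimal sketch of one common way to simulate Poisson (shot) noise plus Gaussian (read) noise on frames scaled to [0, 1]. It is not taken from the FastDVDnet code or paper. The function name and the `peak` and `sigma` values are illustrative assumptions.

```python
import numpy as np

def add_poisson_gaussian_noise(clean, peak=30.0, sigma=10 / 255.0, seed=0):
    """Add signal-dependent Poisson noise and additive Gaussian noise.

    clean: array with values in [0, 1].
    peak:  assumed photon count for a pixel of value 1 (lower = noisier).
    sigma: assumed standard deviation of the Gaussian part, on the [0, 1] scale.
    """
    rng = np.random.default_rng(seed)
    clean = np.asarray(clean, dtype=np.float32)
    # Poisson counts have mean clean * peak; dividing by peak restores the scale.
    shot = rng.poisson(clean * peak).astype(np.float32) / peak
    read = rng.normal(0.0, sigma, size=clean.shape).astype(np.float32)
    return shot + read

frames = np.full((4, 3, 16, 16), 0.5, dtype=np.float32)  # toy [N, C, H, W] clip
noisy = add_poisson_gaussian_noise(frames)
print(noisy.shape, float(noisy.mean()))
```

Because the Poisson variance depends on the pixel value, a single scalar noise map no longer describes the noise exactly, which is one pitfall to watch for when reusing a network trained on pure Gaussian noise.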
Hi Mr. Tassano,
I'm getting an error when I use the --gray option with a grayscale input image (the same error also occurs with an RGB image):
Traceback (most recent call last):
File "test_fastdvdnet.py", line 166, in
test_fastdvdnet(**vars(argspar))
File "test_fastdvdnet.py", line 110, in test_fastdvdnet
model_temporal=model_temp)
File "D:\AI\fastdvdnet\fastdvdnet.py", line 47, in denoise_seq_fastdvdnet
numframes, C, H, W = seq.shape
ValueError: not enough values to unpack (expected 4, got 3)
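The traceback says the sequence reached denoise_seq_fastdvdnet with three dimensions while four (numframes, C, H, W) are unpacked. A minimal sketch of a guard that adds the missing channel axis is shown below. It assumes the grayscale frames arrive as [N, H, W], and `ensure_nchw` is a hypothetical helper, not part of the repository.

```python
import numpy as np

def ensure_nchw(seq):
    """Return seq with shape [N, C, H, W], adding a channel axis if it is missing."""
    seq = np.asarray(seq)
    if seq.ndim == 3:
        # Assumed layout for grayscale input: [N, H, W] -> [N, 1, H, W]
        seq = seq[:, np.newaxis, :, :]
    if seq.ndim != 4:
        raise ValueError(f"expected 3 or 4 dimensions, got {seq.ndim}")
    return seq

gray_seq = np.zeros((5, 48, 64), dtype=np.float32)
print(ensure_nchw(gray_seq).shape)  # (5, 1, 48, 64)
```

Whether the missing axis is the actual cause here depends on how the frames were loaded, so treat this as a diagnostic aid rather than a confirmed fix.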
Hi, thanks for your excellent work. Could you please share the training and test data? I think it would benefit the development of video deblurring, since FastDVDnet is the SOTA method in this domain.
In test_fastdvdnet.py , please change
state_temp_dict = torch.load(args['model_file'])
to
state_temp_dict = torch.load(args['model_file'], map_location=device)
If I understand correctly, you would want to use an older version of NVIDIA DALI, since nvidia.dali.ops.CropCastPermute(**kwargs), which fastdvdnet appears to depend on, has been deprecated in newer versions.
pip install --extra-index-url https://developer.download.nvidia.com/compute/redist/cuda/10.0 nvidia-dali==0.10.0
Thank you for the excellent work Tassano & fellow authors.
Unfortunately, the link to download the train/test datasets is producing a 404 on Dropbox. https://www.dropbox.com/sh/m9mpz1m1b55x420/AAAt1wes43brv37BmBxw07jna?dl=0
Could you kindly let us know where we can download the .mp4 files?
Thanks
Daran
I'm assuming the response to this will be something along the lines: "turn off GPU." Yet, I find it hard to believe that using the same hardware I've used with DaVinci Resolve (which can denoise an unlimited amount of video frames on the fly using motion estimation), my measly 80kb jpg files (sure 10717 frames worth) are failing with fastDVDnet. Well:
Traceback (most recent call last):
File "test_fastdvdnet.py", line 166, in
test_fastdvdnet(**vars(argspar))
File "test_fastdvdnet.py", line 98, in test_fastdvdnet
max_num_fr=args['max_num_fr_per_seq'])
File "C:\Users\pedro\Desktop\CLONE\fastdvdnet\utils.py", line 127, in open_sequence
expand_axis0=False)
File "C:\Users\pedro\Desktop\CLONE\fastdvdnet\utils.py", line 185, in open_image
img = normalize(img)
File "C:\Users\pedro\Desktop\CLONE\fastdvdnet\utils.py", line 307, in normalize
return np.float32(data/255.)
numpy.core._exceptions.MemoryError: Unable to allocate 23.7 MiB for an array with shape (3, 1080, 1920) and data type float32
This is utilizing 8GB VRAM. I guess I'm confused about how the program handles the memory allocation? It's all way over my head...
And to be clear, I'm avoiding solutions like Resolve. I'd prefer to use open-source, especially since noise reduction benefits from tinkering. I also like that you all have made this project otherwise straightforward to work with.
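A note on the numbers in that traceback: the MemoryError is raised by NumPy, so the allocation that failed is in host RAM rather than VRAM. The sketch below works through the arithmetic. It assumes every frame is held in memory as float32 at once, which the thread does not confirm, and `sequence_bytes` is a hypothetical helper.

```python
def sequence_bytes(num_frames, channels=3, height=1080, width=1920, bytes_per_value=4):
    """Bytes needed to hold num_frames float32 frames of the given size."""
    return num_frames * channels * height * width * bytes_per_value

MIB = 2 ** 20
GIB = 2 ** 30

per_frame = sequence_bytes(1)
whole_clip = sequence_bytes(10717)

print(f"{per_frame / MIB:.1f} MiB per frame")          # 23.7 MiB per frame
print(f"{whole_clip / GIB:.1f} GiB for 10717 frames")  # 248.4 GiB for 10717 frames
```

The 80 kB JPEG size on disk does not matter once the frames are decoded: each 1920x1080 RGB frame expands to about 23.7 MiB as float32, which matches the size in the error message.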
Hi,
I would like to run some tests. Could you give a link to download your test set (Set8)?
Looking forward to your help.
Best regards.
My English is poor, but I will try my best to describe my problem. I have a test sequence of 3000 frames, and I store the image sequence in the test folder. When I set the parameter “max_num_fr_per_seq” = 50, only 50 frames were processed. How can I process the whole image sequence?
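Since max_num_fr_per_seq caps how many frames are read, one option is to raise it to at least the number of frames, memory permitting. Another is to process the sequence in chunks. The sketch below only computes index ranges for such chunks. It assumes two extra context frames on each side, to cover the five-frame window the network uses, and `chunk_ranges` is a hypothetical helper, not part of the repository.

```python
def chunk_ranges(num_frames, chunk_size=50, context=2):
    """Split a sequence into chunks with extra context frames on both sides.

    Returns tuples (load_start, load_end, keep_start, keep_end) with exclusive
    end indices: load the frames in [load_start, load_end), denoise them, and
    keep only the outputs for [keep_start, keep_end).
    """
    ranges = []
    for keep_start in range(0, num_frames, chunk_size):
        keep_end = min(keep_start + chunk_size, num_frames)
        load_start = max(keep_start - context, 0)
        load_end = min(keep_end + context, num_frames)
        ranges.append((load_start, load_end, keep_start, keep_end))
    return ranges

for r in chunk_ranges(3000)[:3]:
    print(r)
# (0, 52, 0, 50)
# (48, 102, 50, 100)
# (98, 152, 100, 150)
```

Dropping the context frames from each chunk's output avoids seams at chunk boundaries, at the cost of denoising a few frames twice.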
Hi, I am trying to train your model with the DAVIS dataset. As you know, the original data is in JPEG format, but the model needs mp4 input, so I used ffmpeg to convert the JPEG dataset to mp4. However, the training result is not as good as the result in your paper. For example, with sigma 10, the result on the snowboard sequence is 29.0786dB, which is well below the value in your paper (36.5dB). Could you tell me the details of how you converted the dataset? Or could you share your mp4 dataset? I'm looking forward to your answer. Best regards.
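For reference, a minimal sketch of building such an ffmpeg command from Python follows. The encoder settings used for the published DAVIS mp4 files are not stated in this thread, so the frame rate, the CRF value, and the five-digit file name pattern below are all assumptions. Any lossy re-encode adds compression artifacts on top of the JPEG ones, which may contribute to a PSNR gap.

```python
def build_ffmpeg_command(frame_pattern, output_path, fps=25, crf=10):
    """Build an ffmpeg argument list that encodes numbered frames as H.264."""
    return [
        "ffmpeg",
        "-framerate", str(fps),   # input frame rate (assumed value)
        "-i", frame_pattern,      # e.g. snowboard/%05d.jpg
        "-c:v", "libx264",        # H.264 encoder
        "-crf", str(crf),         # lower CRF = higher quality (assumed value)
        "-pix_fmt", "yuv420p",    # widely supported pixel format
        output_path,
    ]

cmd = build_ffmpeg_command("snowboard/%05d.jpg", "snowboard.mp4")
print(" ".join(cmd))
```

The list can be passed to subprocess.run(cmd, check=True). Using the same encoder settings for every file keeps the codec consistent across the training set.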
Hi, thanks for your work and the pre-trained model. Two questions not really related to the repo itself.
Regarding architecture, have you experimented with more cascading levels/steps (say, 3-step denoising, so that information from a longer sequence might be exploited) or a larger number of frames for each denoising block (which would also use a longer sequence)? Do you observe or expect any improvement from leveraging more frames?
For image denoisers, a mismatch between the noisy input image and the noise map (say, an actual input noise level of 25 but an input noise map of 50) could lead to catastrophic failure of denoising networks. Do you observe a similar phenomenon? What is the target noise range of the pretrained model provided?
Thanks:P
Are there more detailed training and testing tutorials?
How do you obtain the ST-RRED scores?
The scikit-video implementation returns 3 values: (strred_array, strred, strredssn). On DAVIS, with sigma=10, all three values for denoised videos have a mean less than 0.1.
The PSNR for the same set of videos is 38.94.
The sum over strred_array on each video gives an average of 5.0790 over all videos.
Hi, I tried training your model on a Nvidia GTX 1050 GPU, but it takes around 5-6 days to complete the training (which seemed too long for the size of the dataset and model). It was trained on the DAVIS-train-dataset provided as a link in the README. Hence I wanted to know the estimated training time and the hardware on which it was trained.
Also, I used DALI 0.10.0 as mentioned in the README. Could a newer version of DALI improve the speed in any way?
Thanks for your help!
Thanks for your great work. Could fastdvdnet process camera frames at 30 fps, if that is possible at all?
Can this model denoise the video taken by the camera in real time?
Or is it only possible to denoise a full video?
Could you please provide the supplementary materials for the paper?
Hi Matias, I'm going to do my research on your FastDVDnet. I am trying to use other datasets together with DAVIS-training-mp4 for training. However, when I generated my own mp4 files and used them together with DAVIS-training-mp4, this error occurred: 'Assert on "codec_id_ == codec_id" failed: File MP4/train_little/bear.mp4 is not the same codec as previous files Stacktrace'. The reason must be that my mp4 files use a different codec from DAVIS-training-mp4. So I want to know which codec was used to generate DAVIS-training-mp4. Alternatively, could you share the code that produced DAVIS-training-mp4 from the jpg files? I sincerely hope to receive your reply! Thanks very much.
Hello.
While testing my own images I am getting this issue. Could you please give me a solution?
Parameters:
model_file: ./model.pth
test_path: /content/fastdvdnet/img/upload_image
suffix:
max_num_fr_per_seq: 25
noise_sigma: 0.0392156862745098
dont_save_results: False
save_noisy: True
no_gpu: True
save_path: /content/fastdvdnet/results
gray: False
cuda: False
Loading models ...
Open sequence in folder: /content/fastdvdnet/img/upload_image
Traceback (most recent call last):
File "/content/fastdvdnet/test_fastdvdnet.py", line 166, in
test_fastdvdnet(**vars(argspar))
File "/content/fastdvdnet/test_fastdvdnet.py", line 110, in test_fastdvdnet
model_temporal=model_temp)
File "/content/fastdvdnet/fastdvdnet.py", line 61, in denoise_seq_fastdvdnet
inframes.append(seq[relidx])
IndexError: index 2 is out of bounds for dimension 0 with size 1
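The IndexError shows the temporal window asking for frame index 2 in a sequence that contains a single frame, so the folder most likely held only one image. As an illustration of how a window can be built for very short sequences, here is a minimal sketch that clamps out-of-range indices to the nearest valid frame (replicate padding). This is not the padding scheme the repository uses, and `window_indices` is a hypothetical helper.

```python
def window_indices(center, num_frames, window=5):
    """Indices of a temporal window around center, clamped to valid frames."""
    half = window // 2
    return [
        min(max(center + offset, 0), num_frames - 1)
        for offset in range(-half, half + 1)
    ]

print(window_indices(0, 1))   # [0, 0, 0, 0, 0]
print(window_indices(0, 10))  # [0, 0, 0, 1, 2]
print(window_indices(9, 10))  # [7, 8, 9, 9, 9]
```

With a single image the network would then see five copies of the same noisy frame, so it cannot exploit any temporal information. How well the pretrained model behaves in that case is not established in this thread.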
Could you provide the evaluation script for ST-RRED? I want to compare my method with fastdvdnet using ST-RRED. Thanks.
Hi, I followed the link you put in dataloader.py, but the page is gone, and in NVIDIA/DALI I could not find a folder called blob. The strangest part is that the official install command, pip install --extra-index-url https://developer.download.nvidia.com/compute/redist --upgrade nvidia-dali-cuda110, failed with the following errors:
ERROR: Command errored out with exit status 1:
command: 'c:\users\win10\desktop\unet-master\venv\scripts\python.exe' -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\Users\Win10\AppData\Local\Temp\pip-install-vgwsay0a\nvidia-dali-cuda110_228016c1766e4d48a3d8bad2f5327d12\setup.py'"'"'; file='"'"'C:\Users\Win10\AppData\Local\Temp\pip-install-vgwsay0a\nvidia-dali-cuda110_228016c1766e4d48a3d8bad2f5327d12\setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(file) if os.path.exists(file) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' egg_info --egg-base 'C:\Users\Win10\AppData\Local\Temp\pip-pip-egg-info-i_6pf4i0'
cwd: C:\Users\Win10\AppData\Local\Temp\pip-install-vgwsay0a\nvidia-dali-cuda110_228016c1766e4d48a3d8bad2f5327d12
Complete output (18 lines):
Traceback (most recent call last):
File "", line 1, in
File "C:\Users\Win10\AppData\Local\Temp\pip-install-vgwsay0a\nvidia-dali-cuda110_228016c1766e4d48a3d8bad2f5327d12\setup.py", line 150, in
raise RuntimeError(open("ERROR.txt", "r").read())
RuntimeError:
###########################################################################################
The package you are trying to install is only a placeholder project on PyPI.org repository.
This package is hosted on NVIDIA Python Package Index.
This package can be installed as:
```
$ pip install nvidia-pyindex
$ pip install nvidia-dali-cuda110
```
Please refer to NVIDIA DALI installation guide for instructions:
https://docs.nvidia.com/deeplearning/dali/user-guide/docs/installation.html
###########################################################################################
----------------------------------------
WARNING: Discarding https://files.pythonhosted.org/packages/ce/62/2996030c15c9f20f8b771dee32a57f3d847126f121cacd459ec87cd269c6/nvidia-dali-cuda110-0.0.1.dev5.tar.gz#sha256=c1243a4c9f1b929a99d25c2c58dddf84a7a4725ca8321356839f1cabd8249d4a (from https://pypi.org/simple/nvidia-dali-cuda110/). Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
Downloading nvidia-dali-cuda110-0.0.1.dev4.tar.gz (3.9 kB)
ERROR: Command errored out with exit status 1:
command: 'c:\users\win10\desktop\unet-master\venv\scripts\python.exe' -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\Users\Win10\AppData\Local\Temp\pip-install-vgwsay0a\nvidia-dali-cuda110_ff129589575b4d83b927634a39cebb45\setup.py'"'"'; file='"'"'C:\Users\Win10\AppData\Local\Temp\pip-install-vgwsay0a\nvidia-dali-cuda110_ff129589575b4d83b927634a39cebb45\setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(file) if os.path.exists(file) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' egg_info --egg-base 'C:\Users\Win10\AppData\Local\Temp\pip-pip-egg-info-0pmjmqes'
cwd: C:\Users\Win10\AppData\Local\Temp\pip-install-vgwsay0a\nvidia-dali-cuda110_ff129589575b4d83b927634a39cebb45
Complete output (18 lines):
Traceback (most recent call last):
File "", line 1, in
File "C:\Users\Win10\AppData\Local\Temp\pip-install-vgwsay0a\nvidia-dali-cuda110_ff129589575b4d83b927634a39cebb45\setup.py", line 150, in
raise RuntimeError(open("ERROR.txt", "r").read())
RuntimeError:
###########################################################################################
The package you are trying to install is only a placeholder project on PyPI.org repository.
This package is hosted on NVIDIA Python Package Index.
This package can be installed as:
```
$ pip install nvidia-pyindex
$ pip install nvidia-dali-cuda110
```
Please refer to NVIDIA DALI installation guide for instructions:
https://docs.nvidia.com/deeplearning/dali/user-guide/docs/installation.html
###########################################################################################
----------------------------------------
WARNING: Discarding https://files.pythonhosted.org/packages/02/f8/e96d0abe8a355d08a7c476a1cada6bc2e71f78a47f329a72b293cf96b1d0/nvidia-dali-cuda110-0.0.1.dev4.tar.gz#sha256=e636072bf82ab5c514c49b952b8b0f6c600fc0daac7e41cb49c208b30b43f402 (from https://pypi.org/simple/nvidia-dali-cuda110/). Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
ERROR: Could not find a version that satisfies the requirement nvidia-dali-cuda110 (from versions: 0.0.1.dev4, 0.0.1.dev5)
ERROR: No matching distribution found for nvidia-dali-cuda110
Good job! Are there any pretrained models available for testing? Thanks~
Why doesn't the training process consider the case where the noise mean is not zero?
Hi Matias, I have a question about using NVIDIA DALI and would like to ask for your help. Although I have installed a corresponding version of nvidia-dali, when I train the model, AttributeError: module 'nvidia.dali.ops' has no attribute 'CropCastPermute' always occurs. I have no idea how to solve it, even after trying to install all kinds of versions of nvidia-dali. It appears to be a torch model without a permute function.
Dear author, thank you for your nice work. Can you tell me how to denoise my own real images?
When I run the training code, I run into some problems. Could you help me fix them?
/root/anaconda3/envs/fastDVDnet/lib/python3.6/site-packages/nvidia/dali/ops.py:627: DeprecationWarning: WARNING: video_reader is now deprecated. Use readers.video instead.
In DALI 1.0 all readers were moved into a dedicated :mod:~nvidia.dali.fn.readers submodule and renamed to follow a common pattern. This is a placeholder operator with identical functionality to allow for backward compatibility.
op_instances.append(_OperatorInstance(input_set, self, **kwargs))
/root/anaconda3/envs/fastDVDnet/lib/python3.6/site-packages/nvidia/dali/ops.py:627: DeprecationWarning: WARNING: uniform is now deprecated. Use random.uniform instead.
op_instances.append(_OperatorInstance(input_set, self, **kwargs))
/root/anaconda3/envs/fastDVDnet/lib/python3.6/site-packages/nvidia/dali/ops.py:627: DeprecationWarning: WARNING: uniform is now deprecated. Use random.uniform instead.
op_instances.append(_OperatorInstance(input_set, self, **kwargs))
dlopen "libnvcuvid.so" failed!
Traceback (most recent call last):
File "/home/data0/cailijing/train_fastdvdnet.py", line 214, in
main(**vars(argspar))
File "/home/data0/cailijing/train_fastdvdnet.py", line 40, in main
temp_stride=3)
File "/home/data0/cailijing/dataloaders.py", line 103, in init
self.pipeline.build()
File "/root/anaconda3/envs/fastDVDnet/lib/python3.6/site-packages/nvidia/dali/pipeline.py", line 660, in build
self._pipe.Build(self._names_and_devices)
RuntimeError: Critical error when building pipeline:
Error when constructing operator: VideoReader encountered:
[/opt/dali/dali/operators/reader/loader/video_loader.h:183] Assert on "ret" failed: Failed to load libnvcuvid.so, needed by the VideoReader operator. If you are running in a Docker container, please refer to https://github.com/NVIDIA/nvidia-docker/wiki/Usage
Thank you for your awesome code!
I am hoping you might open-source the log files you have from training. Maybe the training and validation loss as a function of epoch (and/or batch) with an estimate of the runtime?
I'm trying to reproduce your results, and two questions came up in the process.
I used ffmpeg4 to extract images from the video.
ffmpeg -i input.mp4 out%d.png
But your weight file("model.pth") shows better results than your paper.
Is there anything wrong in my implementation?
The result is this.
-Our set8 with sigma 50
(Yours - 30.36 / our best 29.96 )
So... Can you help me with how to reproduce your result?
Sorry to bother you again!
I found two DAVIS test sets, named "Test-Dev 2017" and "Test-Challenge 2017". The two contain totally different sequences. Which one did you use? The training set also seems to differ a bit between "semi-supervised 2017 Traineval" and "unsupervised 2017 Traineval"; which one did you use? I'd like to compare the performance of my work with the results shown in your paper on the same dataset.
How did you handle resolutions such as 910 × 480 or 1130 × 480?
Thank you so much!
I am confused about the "384000 training samples" in your paper. There are 90 sequences in the DAVIS dataset; which of them did you use?
Hi, nice work. I just ran your code and hit a case where it prints 'Intel MKL ERROR: Parameter 4 was incorrect on entry to SLASCL.' and the loss becomes inf. This happens sometimes, but not always. I guess it is somehow related to the weight orthogonalization. Have you encountered this issue?
Hi. Thanks for this great repo. Could you please send me a code snippet to denoise just one image?
Complete error:
File "train_fastdvdnet.py", line 277, in
main(**vars(argspar))
File "train_fastdvdnet.py", line 57, in main
temp_stride=3)
File "/home/mkhan/generic_loss/fastdvdnet/dataloaders.py", line 118, in init
self.pipeline.build()
File "/opt/conda/lib/python3.6/site-packages/nvidia/dali/pipeline.py", line 660, in build
self._pipe.Build(self._names_and_devices)
RuntimeError: nvml error: 13 Local version of NVML doesn't implement this function
I have tried it with cuda10.0 and cuda11.0 both with various versions of nvidia-dali.
Could you send the training, validation, and test sets to my mailbox: [email protected]
Thank you so much. I want to use the same training videos as you.
I have a set of images which are frames of a video. How can I read them and work on them? Thanks. I have scripts and code in Matlab, but I am not an expert in Python, so I want to learn it.
I have tried my best to convert images to videos. But the resulting videos, together with the ST-RRED computation code you gave in the issue, yield a lower ST-RRED. Can you share your code for converting result images to videos and then computing ST-RRED?
Hi,
I am trying to reproduce results from your paper and I have noticed that there are some differences between hyperparameters described in the paper and these in the official code:
number of patches (384000/256000),
batch size (96/64),
maximum noise value during training (50/55),
scaling for augmentation (it's mentioned in the paper, but I think it isn't implemented in the current version of the code, at least I cannot find it).
Do you know about any other differences? Could you share how you implemented scaling for the augmentation of the training data?
Also I have made some experiments with different version of your repository and it seems to me that version with old DALI dataloader (v.0.1) gives significantly better results (on validation it was 0.5 dB).
Is there a colab version for easy use of this net?
Hi @m-tassano ,
Recently I tried to reproduce your work and found that the denoise effect performed well visually. But the PSNR value is not satisfactory. Here are some questions about PSNR. I will be very grateful to you if you can give me some advice.
Thank you for reading here patiently. Wish you have a good day :)
Hi, thank you for your efforts, but I have a question.
In your code:
seqn = seq + noise
your inputs are clean data plus noise generated by noise = torch.empty_like(seq).normal_(mean=0, std=args['noise_sigma']).to(device).
In common scenarios, however, I only have images that are already noisy, and I do not know the noise distribution.
How can I remove the noise and recover clean images from a sequence of such images?
In your paper, you compared results with other methods. I wonder how you computed video ST-RRED with the scikit-video library. In skvideo.measure.strred(referenceVideoData, distortedVideoData), referenceVideoData should have 1 channel, but the videos have 3 channels. Did you convert RGB to single-channel luminance using the following formula, or some other method?
luminance = 0.2126 * R+ 0.7152 * G + 0.0722 * B
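For illustration, the formula above (these are the Rec. 709 luma weights) can be applied to a whole clip as sketched below. The sketch assumes the video is an array laid out as [T, H, W, 3] and keeps a trailing channel axis of size 1. Whether the authors used this conversion is not confirmed in this thread, and `rgb_to_luminance` is a hypothetical helper.

```python
import numpy as np

# Weights from the formula quoted above (R, G, B order).
LUMA_WEIGHTS = np.array([0.2126, 0.7152, 0.0722])

def rgb_to_luminance(video_rgb):
    """Convert an RGB clip of shape [T, H, W, 3] to luminance of shape [T, H, W, 1]."""
    video_rgb = np.asarray(video_rgb, dtype=np.float64)
    if video_rgb.shape[-1] != 3:
        raise ValueError("expected RGB data with 3 channels in the last axis")
    luma = video_rgb @ LUMA_WEIGHTS      # weighted sum over the channel axis
    return luma[..., np.newaxis]         # restore a single-channel axis

clip = np.ones((2, 4, 4, 3))             # toy all-white clip
print(rgb_to_luminance(clip).shape)      # (2, 4, 4, 1)
```

The reference and the distorted clip should go through the same conversion before being passed to the metric, so that both are compared in the same domain.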
### Testing FastDVDnet model ###
Parameters:
model_file: ./model.pth
test_path: 0.jpg
suffix:
max_num_fr_per_seq: 25
noise_sigma: 0.11764705882352941
dont_save_results: False
save_noisy: False
no_gpu: False
save_path: a.jpg
gray: False
cuda: True
Loading models ...
Open sequence in folder: 0.jpg
Traceback (most recent call last):
File "test_fastdvdnet.py", line 166, in
test_fastdvdnet(**vars(argspar))
File "test_fastdvdnet.py", line 98, in test_fastdvdnet
max_num_fr=args['max_num_fr_per_seq'])
File "/home/joe/Desktop/AODNET/fastdvdnet-master/utils.py", line 130, in open_sequence
return seq, expanded_h, expanded_w
UnboundLocalError: local variable 'seq' referenced before assignment
Does the strred function in scikit-video only support grayscale input? Should I first convert the RGB images to grayscale? Thanks a lot.
Hi! I have a question about the version of MKL. While running the project, I get the error “Intel MKL ERROR: Parameter 4 was incorrect on entry to SLASCL.” The versions in my environment are mkl-fft 1.2.0, mkl-random 1.1.1, mkl-service 2.3.0.
I want to know why this error occurs. I have found one suggested solution: changing the version of MKL. I don't know whether this solution is right, and I don't know how to change the MKL version. Can you give me some advice on which version to use and how to change it? I would be very glad to receive your reply. Thanks!
If the max_num_fr_per_seq is large (> 500), the process of testing is slowing down on the open_sequence method.
By removing the np.stack function from the loop, it speeds up the opening image process:
Fix (utils.py line 129):
# seq_list is assumed to be initialised as an empty list earlier in open_sequence
print("\tOpen sequence in folder: ", seq_dir)
for fpath in files[0:max_num_fr]:
    img, expanded_h, expanded_w = open_image(fpath,
                                             gray_mode=gray_mode,
                                             expand_if_needed=expand_if_needed,
                                             expand_axis0=False)
    seq_list.append(img)
# np.stack moved out of the loop: the frames are stacked once, with the same result
seq = np.stack(seq_list, axis=0)
return seq, expanded_h, expanded_w
Hi,
I just ran your test file with the uploaded weights and got 39.24dB for sigma 10 (38.71dB in your paper) and 32.03dB for sigma 50 (31.86dB in your paper). I've checked the sequences, which were downloaded from 'https://davischallenge.org/davis2017/code.html', and I think they are the same as the test sequences mentioned in the paper (from 'aerobatics' to 'tractor'). Do you have any idea why I got higher performance (about 0.53dB higher, which is quite abnormal)?
Look forward to your reply.
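When comparing PSNR numbers like these, it helps to check how the values are aggregated, because the mean of per-frame PSNRs generally differs from the PSNR of the mean squared error over a whole sequence. The sketch below shows both for data scaled to [0, 1]. It is not the repository's evaluation code, the function names are hypothetical, and it is not established that aggregation explains the gap reported above.

```python
import numpy as np

def psnr(reference, estimate, data_range=1.0):
    """PSNR in dB between two arrays with values in [0, data_range]."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(data_range ** 2 / mse))

def mean_frame_psnr(ref_seq, est_seq):
    """Average of the PSNR values computed separately for each frame."""
    return float(np.mean([psnr(r, e) for r, e in zip(ref_seq, est_seq)]))

ref = np.zeros((2, 3, 4, 4))
est = ref.copy()
est[0] += 0.1    # frame 0: MSE 1e-2 -> 20 dB
est[1] += 0.01   # frame 1: MSE 1e-4 -> 40 dB

print(mean_frame_psnr(ref, est))  # average of 20 dB and 40 dB
print(psnr(ref, est))             # PSNR of the pooled MSE, dominated by frame 0
```

Other details that can shift results include clipping or quantising the output to 8 bits before measuring, and which frames of each sequence are included.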
It seems that during validation only one patch is loaded (i.e. the test batch size is 1), which prevents the code from being run on multiple GPUs. Could you provide a multi-GPU version of the code so that training and testing can be scaled up?