
san's People

Contributors

daitao


san's Issues

Several questions about the network architecture

1. SOCA looks like a much stronger version of CA. Could you run an experiment that replaces every CA module in RCAN with SOCA, to show that SOCA really is stronger than CA?
2. Both the large group (SSRG) and the small group (LSRAG) are stacks of blocks, but their structures differ slightly. Could you explain why they are designed differently?
(1) The large group ends with a 3x3 convolution, while the small group does not.
(2) The small group ends with SOCA, while the large group begins and ends with RL-NL.
(3) The large group has a residual connection back to its input, while the small group does not.
3. Why is the residual gamma in SSRG a single shared parameter instead of a separate gamma for each small group?
4. Why do the non-local modules before and after SSRG share weights instead of being two separate modules?

AssertionError: Invalid device id

Hi Everyone,
I am getting the error "AssertionError: Invalid device id" when I set --n_GPUs to 2 (args.n_GPUs = 2). Can anyone help me?
The error is raised at this line (line 29) in model/__init__.py:
self.model = nn.DataParallel(self.model, range(args.n_GPUs))
I tried nvidia-smi and it shows 2 GPUs; torch.cuda.device_count() also returns 2.
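
For what it's worth, a small sanity-check sketch (my own, not from the repository): DataParallel raises "Invalid device id" when an entry of device_ids is not a device PyTorch can actually see in that process, which often comes down to CUDA_VISIBLE_DEVICES hiding one of the GPUs. Something like the following, wrapped around the call in model/__init__.py, makes the mismatch visible and clamps the request:

    import os
    import torch
    import torch.nn as nn

    def wrap_data_parallel(model, n_gpus_requested):
        # DataParallel asserts that every entry of device_ids is a valid, visible
        # device, so only request as many GPUs as PyTorch can actually see.
        visible = torch.cuda.device_count()
        print('CUDA_VISIBLE_DEVICES =', os.environ.get('CUDA_VISIBLE_DEVICES'))
        print('visible GPUs =', visible, '| requested =', n_gpus_requested)
        n_gpus = min(n_gpus_requested, visible)
        if n_gpus > 1:
            model = nn.DataParallel(model, device_ids=list(range(n_gpus)))
        return model

If the two printed values disagree, the environment the training script actually runs in (rather than the shell where nvidia-smi was called) is the place to look.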

Pretrained Model on a different location

Hi,
Congratulations on getting some great results with SAN. I was wondering if it's possible for you to put the pretrained models somewhere else than Baidu network. I wanted to try out the model on some of my personal benchmarks but seems like Baidu bloatware won't let me do the same in Germany. Can you perhaps put the model on Google Drive or Model Zoo or Git Large File Storage?
It would be much appreciated given that the model is pretty big and expensive to train from scratch.

Pytorch version

Could someone who has successfully run the code please share the PyTorch version they used?

undefined variable 'der_sacleTrace'

SAN\TrainCode\model\MPNCOV\python\MPNCOV.py, line 84: der_sacleTrace
I cannot find where this variable is defined, and I don't understand what it means. Can the author explain? Thank you.

How to train?

RuntimeError: CUDA out of memory. Tried to allocate 5.05 GiB (GPU 0; 15.75 GiB total capacity; 10.42 GiB already allocated; 3.68 GiB free; 50.61 MiB cached)

This happens when training on a V100 GPU under the same settings as the training demo (--n_resgroups 20 --n_resblocks 10).

Please let me know how to deal with this, if possible.
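
A small monitoring sketch (my addition, not part of the training script) that may help judge how far over budget the default configuration is before shrinking the patch or batch size; it only assumes one training step can be called as a function:

    import torch

    def report_peak_memory(step_fn):
        # Reset the peak-memory counter, run one forward/backward/update step,
        # then report the high-water mark in GiB.
        torch.cuda.reset_max_memory_allocated()
        step_fn()
        peak = torch.cuda.max_memory_allocated() / 1024 ** 3
        print('peak GPU memory: {:.2f} GiB'.format(peak))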

How to reproduce the effect in the paper

Hi, Dai Tao.
I ran the code from TrainSAN_script.sh and the readme (both for scale=4) to train the model, but it only reaches a PSNR of about 31.67 on Set5 (it doesn't rise after about 700 epochs). What went wrong, or what did I miss?
Thanks.

RuntimeError: CUDNN_STATUS_MAPPING_ERROR

Traceback (most recent call last):
  File "main.py", line 19, in <module>
    t.train()
  File "/opt/data/private/SAN-master/TrainCode/trainer.py", line 51, in train
    sr = self.model(lr, idx_scale)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/data/private/SAN-master/TrainCode/model/__init__.py", line 58, in forward
    return self.model(x)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/data/private/SAN-master/TrainCode/model/san.py", line 515, in forward
    x = self.sub_mean(x)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py", line 301, in forward
    self.padding, self.dilation, self.groups)
RuntimeError: CUDNN_STATUS_MAPPING_ERROR

How can I solve it?
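
A small diagnostic sketch (an assumption about the usual cause, not a confirmed fix): CUDNN_STATUS_MAPPING_ERROR typically means the cuDNN handle and the tensors are not bound to the same device, so the first thing to check is that the model's parameters and the input batch live on the same GPU, and whether the error persists with cuDNN disabled:

    import torch
    import torch.backends.cudnn as cudnn

    def check_same_device(model, x):
        # Every parameter and the input should report the same CUDA device.
        param_devices = {p.device for p in model.parameters()}
        print('parameter devices:', param_devices, '| input device:', x.device)

    # Temporarily ruling cuDNN in or out can also narrow the problem down:
    # cudnn.enabled = False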

Attempting to Upscale 720p images, CUDA Out of Memory

Hello,

Thank you so much for the excellent contribution! I am attempting to upscale a standard 720p image using SAN. I am using the 3x model, and have tried the following GPUs:

  1. 2060
  2. 2x 2080Tis
  3. Tesla V100

The model runs out of memory (OOM) on all of them. I'm at a loss as to how I should proceed in order to run inference.

Thank you.
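
One workaround is to upscale the image tile by tile, so that only a small patch is resident on the GPU at a time. The sketch below is a simplified assumption of that idea (the --chop flag that appears in a test command elsewhere in these issues is presumably the built-in way to do something similar, with overlap handling); `model` is assumed to already live on the GPU and to map a (1, C, h, w) tensor to (1, C, 3h, 3w):

    import torch

    def upscale_tiled(model, lr, scale=3, tile=128):
        # lr: (1, C, H, W) low-resolution tensor kept on the CPU.
        _, c, h, w = lr.shape
        sr = torch.zeros(1, c, h * scale, w * scale)
        with torch.no_grad():
            for top in range(0, h, tile):
                for left in range(0, w, tile):
                    patch = lr[:, :, top:top + tile, left:left + tile].cuda()
                    out = model(patch).cpu()
                    # Write the upscaled tile into the matching output region.
                    sr[:, :, top * scale:(top + patch.shape[2]) * scale,
                             left * scale:(left + patch.shape[3]) * scale] = out
        return sr

In practice the tiles should overlap by a margin and be blended, otherwise visible seams can appear at tile borders where the network's receptive field is cut off.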

Apply on grayscale image

Hi, I've read your paper and code and found it very interesting.
One question: given that SAN uses channel-wise feature extraction, is it still effective on grayscale images?
I tried to test it myself, but simply changing the code in option.py didn't work. If this model is also effective on grayscale images, can you tell me how to change the code/settings for training? If not, it's fine :)
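
As a workaround for inference (my own sketch, not an official option): the released RGB models can be run on a grayscale image by replicating the single channel three times and converting back afterwards. For actual single-channel training, the EDSR-style option.py this codebase follows usually exposes an --n_colors flag, but that is an assumption worth verifying in option.py.

    import numpy as np
    from PIL import Image

    def gray_to_rgb(path):
        # Replicate the single luminance channel so the RGB model accepts it.
        gray = np.array(Image.open(path).convert('L'))    # (H, W)
        return np.stack([gray] * 3, axis=-1)              # (H, W, 3)

    def sr_rgb_to_gray(sr_rgb):
        # The three output channels should be nearly identical; average them.
        return sr_rgb.mean(axis=-1).round().clip(0, 255).astype(np.uint8)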

Just a quick question

Really good job!
Just a quick question: did you compare your results with WDSR and ESRGAN, for both PSNR and perceptual loss?

Thanks!

Memory requirement.

It is impossible to run this model on a single 1080 Ti GPU under the default settings in the code.

CUDA out of memory (default setting)

"RuntimeError: CUDA out of memory. Tried to allocate 324.00 MiB (GPU 0; 10.73 GiB total capacity; 9.08 GiB already allocated; 290.31 MiB free; 612.32 MiB cached)"

This happens when training on a 2080 Ti GPU under the same settings as the training demo (--n_resgroups 20 --n_resblocks 10).

Please let me know how to deal with this, if possible.

difference in the TrainCode and TestCode

Hello, thank you for your wonderful work.
As a beginner, I find that TrainCode and TestCode are mostly similar; at first glance they look the same. Could you please tell me the main differences between them?

About the BD degradation model

Hello! Could you provide the trained weights of the BDx3 model? I need them for a qualitative comparison.

Cuda Out of memory error

I modified MPNCOV.py based on the solution in #29, but I still get an out-of-memory error.
RuntimeError: CUDA out of memory. Tried to allocate 9.80 GiB (GPU 0; 14.76 GiB total capacity; 9.94 GiB already allocated; 3.89 GiB free; 160.79 MiB cached).
Is there any way to solve this issue?

Need Benchmark Dataset for training.

Hello,
I followed the steps in the readme: I downloaded the DIV2K dataset and set '--dir_data' to the HR and LR image path.
But when I try to train the model, the dataset dir_data/benchmark/Set5/ appears to be missing.
Please tell me what the benchmark folder and the Set5 directory should look like.
Thanks
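
For reference, a small check script under the assumption that SAN follows the EDSR-style benchmark layout (HR images next to an LR_bicubic/X{scale} folder); please verify the exact names against data/benchmark.py in this repository:

    import os

    def check_benchmark(dir_data, dataset='Set5', scale=4):
        hr = os.path.join(dir_data, 'benchmark', dataset, 'HR')
        lr = os.path.join(dir_data, 'benchmark', dataset, 'LR_bicubic', 'X{}'.format(scale))
        for d in (hr, lr):
            status = 'found' if os.path.isdir(d) else 'MISSING'
            print('{:>7}  {}'.format(status, d))

    # check_benchmark('dataset')  # expects dataset/benchmark/Set5/HR/*.png etc.

The Set5/Set14/B100/Urban100 benchmark archives are commonly distributed alongside the EDSR and RCAN codebases; unpacking one of those under dir_data/benchmark should satisfy the loader.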

Paper error! RCAN's parameter count is 15.44M, which is less than SAN's 15.7M

Hi, @daitao
Sorry to disturb you, but I think there is an error in your SAN paper.
RCAN has fewer parameters than SAN: RCAN contains 15.44M while SAN contains 15.7M.
(This value is reported by RCAN's author.)
Because of this incorrect conclusion, subsequent work has repeated the same mistake, which has thrown the comparisons off track.
[screenshot attached]

The Set5 x2 results from the code differ noticeably from the paper

Hello, I ran into some problems when reproducing the paper. First, the learning rate schedule: the paper says the learning rate is halved every 200 epochs, while the code decays it to 0.6x every 50 epochs. Running the code as-is gives about 37.7 dB; after changing the learning rate I get about 37.9 dB.
Second, in the paper there is no conv layer after SOCA inside LSRAG, but the code adds one. When I removed it, the first-epoch result was only about 8 dB, so I stopped the run. What is the purpose of adding this layer?
Finally, I also added the last conv layer in SSRG; after about 1200 epochs the result is 37.9 dB, which is still some distance from the paper. Did I set a parameter incorrectly, or does the model need to be changed somewhere? I would really appreciate your reply.

Mini-batch size - 8 or 16?

Hi,

In the paper, it is stated that 8 LR colour patches of size 48x48 are used for training. However, in the default settings, the mini-batch size is 16. What settings need to be used to match the results in the paper?

When I reduced the batch size to 10 due to memory limitations in the GPU, the PSNR on Set5 x2 was about 37.8 dB and stopped improving after 590 epochs. This deviates from the results in the paper.

I am aware that a few questions have been asked about the batch size, but I couldn't find an answer to it. Does anyone have any information about this? Thanks.

reduction=8 or 16

Hi daitao, thanks for your wonderful work on SAN. It gives us many ideas.

I have a question about the reduction value in the SAN code.
In option.py it is set to 16; however, in san.py, where Nonlocal_CA is used (around line 500), it is set to 8.

Should it be 16 everywhere in the code, as the paper says?

Best regards.

ZeroDivisionError: division by zero

When I run the code for testing, I get the following error.

./SAN/TestCode/code/trainer.py, line 113 in test

self.ckp.log[-1, idx_scale] = eval_acc / len(self.loader_test)
ZeroDivisionError: division by zero.

I have no idea why len(self.loader_test) is zero.

How can I fix this error?
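
An empty test loader means the test dataset matched no images, which usually traces back to --data_test, --testpath/--dir_data, or the benchmark folder layout rather than to trainer.py itself. A hedged patch sketch around the quoted line (my idea, not a fix from the repository) turns the silent division into a readable error:

    n_batches = len(self.loader_test)
    if n_batches == 0:
        raise RuntimeError(
            'Test loader is empty: check --data_test / --testpath and that the '
            'benchmark/<dataset>/HR images actually exist on disk.')
    self.ckp.log[-1, idx_scale] = eval_acc / n_batches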

Missing benchmark

Hello, I get an error while training the model: FileNotFoundError: [WinError 3] The system cannot find the path specified: 'dataset\benchmark\Set5\HR'
The training instructions you provide only contain these two steps:
Download the DIV2K dataset (900 HR images) from the link DIV2K.
Set '--dir_data' as the HR and LR image path.
I downloaded the DIV2K dataset as required and set dir_data, but the benchmark data is still missing. Where can the benchmark datasets be downloaded?

License ?

Hi,

Do you plan to add an open-source license to your project, such as MIT or Apache-2.0?
It would make it usable in other open-source projects (which is generally the point of putting code on GitHub). In that case, I would like to test it and reference it in my own project focusing on open-source image restoration: https://github.com/titsitits/open-image-restoration

Best regards,
Mickaël Tits

training time, GPUs

Hi, thanks for your wonderful work and for open-sourcing it.
Could you please tell me how long you trained the model, and what kind of GPU and how many GPUs you used?

Best regards

Validation set?

Hello,

I've been searching the codebase for a validation routine, and it seems the current framework is not using a separate validation set. Are you using the test set as the validation set?

Thanks,
Kwang

There are some differences between the code and the paper

Hi, thanks for your work.

  1. I cannot find any mention of 'LSRAG' in the code of 'san.py'.

  2. In the code, I find that 'SOCA' is not at the tail of NLRG and there is still one Conv layer following 'SOCA'. Additionally, SAN consists of several NLRGs (n_resgroups).
    In the paper, SAN has just one NLRG, which consists of several LSRAGs. So I think the 'NLRG' in the code is actually the 'LSRAG' in the paper. Is that right?
    And if so, why is SOCA followed by a Conv layer? In the paper, SOCA is at the tail of LSRAG.

I would like to understand the reason for these differences. Looking forward to your reply.

Any other way to download pre-trained model?

I cannot download the pre-trained weights because they are hosted on Baidu.
Baidu requires a Chinese telephone number, so users outside China cannot try SAN.

Do you plan to host the pre-trained model somewhere else?

Why is PSNR declining on my own data?

I am training SAN on my own data. The loss is dropping, but the evaluation PSNR is dropping too.
What could the problem be?

[gaoqing_592 x2] PSNR: 25.203 (Best: 29.220 @epoch 1)
Total time: 765.24s

axis shape: (10,)
axis: [ 1 2 3 4 5 6 7 8 9 10]
self.log[:, i].numpy() shape: (10,)
self.log[:, i].numpy(): [11.964835 10.344067 9.891278 9.340872 9.287792 8.970273 8.765504
8.653118 8.545278 8.612093]

[plot attached]

Strange settings

[screenshots attached]
gamma = 0 and gamma * residual: does this mean that LSRG does not add the residual back? Why do you do this?
Look forward to your reply
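
My reading (an assumption to be confirmed against san.py): gamma is a learnable nn.Parameter initialised to zero, the same trick used in non-local and self-attention networks, so the residual branch contributes nothing at the start of training and the network learns how strongly to mix it back in. A minimal sketch of that pattern:

    import torch
    import torch.nn as nn

    class ScaledResidual(nn.Module):
        def __init__(self, body):
            super().__init__()
            self.body = body
            # Initialised to 0 but updated by the optimiser, so gamma * residual
            # is only zero at initialisation, not throughout training.
            self.gamma = nn.Parameter(torch.zeros(1))

        def forward(self, x):
            return x + self.gamma * self.body(x)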

ImportError: cannot import name '_update_worker_pids'

Specs:
OS: Ubuntu 18.04
PyTorch: 1.3.1
Python: 3.6.9

Command:
python3 main.py --model san --data_test MyImage --save save_name --scale 4 --n_resgroups 20 --n_resblocks 10 --n_feats 64 --reset --chop --save_results --test_only --testpath 'your path' --testset Set5 --pre_train ../model/SAN_BIX4.pt

Output:

  File "main.py", line 4, in <module>
    import data
  File "<some_directory_containing_SAN>/SAN/TestCode/code/data/__init__.py", line 3, in <module>
    from dataloader import MSDataLoader
  File "<some_directory_containing_SAN>/SAN/TestCode/code/dataloader.py", line 10, in <module>
    from torch._C import _set_worker_signal_handlers, _update_worker_pids, \
ImportError: cannot import name '_update_worker_pids'

Problem
Running the specified command produces the output above. I'm confused as to how everyone else has been able to run the network.
Any help would be sincerely appreciated.
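
In newer PyTorch releases the private helper _update_worker_pids appears to have been renamed to _set_worker_pids, so the custom dataloader.py (written against an older version) fails to import it under PyTorch 1.3.1. A hedged compatibility sketch for the top of dataloader.py; since these are private internals, the cleaner long-term fix is to switch to the standard torch.utils.data.DataLoader:

    from torch._C import _set_worker_signal_handlers
    try:
        # Old name used by the PyTorch version this repository was written for.
        from torch._C import _update_worker_pids
    except ImportError:
        # Newer PyTorch renamed the helper; alias it back to the old name.
        from torch._C import _set_worker_pids as _update_worker_pids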

Pretrained Models on Google Drive

Hi, guys.

The inaccessibility of the pre-trained models for SAN is quite a problem. Thankfully, a friend of mine in China was kind enough to download the zip file and send it to me. I have uploaded both the zip file and the extracted .pt files to Google Drive.

Feel free to download the models.

Regards

code error

SAN\TrainCode\model\MPNCOV\python\MPNCOV.py, line 84: der_sacleTrace is undefined.

Memory consumption

I am observing that the operation on line 24 of MPNCOV.py under TestCode consumes a huge amount of memory when creating the tensors and running the addition between them. For an image of size 250x100 it consumes almost 6.98 GB of memory. Is there any way to reduce the memory consumption?
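
If the expensive step is the construction of the M x M matrix I_hat (M = H*W, so 25000 x 25000 for a 250x100 feature map), one option is to avoid materialising it at all: x @ I_hat @ x^T is mathematically the covariance of the mean-centred features. A hedged sketch of that equivalence follows; note the repository wraps this step in a custom autograd Function, so both the forward and the backward would need adapting.

    import torch

    def covariance_pool(x):
        # x: (batch, C, M) features flattened over the spatial dimensions.
        # Equivalent to x @ I_hat @ x^T with I_hat = (1/M) * (I - (1/M) * ones),
        # but without ever allocating the M x M matrix.
        b, c, m = x.shape
        x_centred = x - x.mean(dim=2, keepdim=True)
        return x_centred.bmm(x_centred.transpose(1, 2)) / m   # (batch, C, C)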

SR result low PSNR

First of all, thanks for your work; it has been very helpful for me.
I trained the network with batch size 8, but the test result is poor: the PSNR on the BSD100 test set is 24 dB. The resulting SR image is shown below.
[SR result for BSD100 image 101087 attached]

What could I have done wrong?
Thanks in advance for your answer.

Why CUDA out of memory

Why is CUDA out of memory? Tried to allocate 8.38 GiB (GPU 0; 10.92 GiB total capacity; 8.69 GiB already allocated; 1.22 GiB free; 33.00 MiB cached).

How to evaluate SAN+?

Thanks for your great work. From the paper I see that SAN+ improves performance by a large margin, but I don't know how to evaluate it. I hope you can share some tips. Thanks a lot.
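
In EDSR/RCAN-style papers the '+' variant normally denotes geometric self-ensemble: the model is applied to the eight flip/rotation variants of the LR input and the back-transformed outputs are averaged. The EDSR-derived codebases usually expose this as a --self_ensemble flag, but that is an assumption to verify in option.py; the sketch below shows the idea directly:

    import torch

    def forward_x8(model, lr):
        # Geometric self-ensemble: average the model over 8 flip/rotation variants.
        outputs = []
        with torch.no_grad():
            for flip in (False, True):
                for k in range(4):                        # 0/90/180/270 degree rotations
                    x = torch.flip(lr, dims=[3]) if flip else lr
                    x = torch.rot90(x, k, dims=[2, 3])
                    y = model(x)
                    y = torch.rot90(y, -k, dims=[2, 3])   # undo the rotation
                    if flip:
                        y = torch.flip(y, dims=[3])       # undo the flip
                    outputs.append(y)
        return torch.stack(outputs).mean(dim=0)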

Cannot access dataset

Hi,

I'm unable to access Baidu from my country (Brazil). For some reason, when I try to download the pretrained model, it gives me a Linux client for Baidu's network rather than a direct link to the actual file.

Could you guys upload this file elsewhere, such as Google Drive?
Or is there any other way I can get it?

Thanks in advance.
