ssl92 / hyperiqa
Source code for the CVPR'20 paper "Blindly Assess Image Quality in the Wild Guided by A Self-Adaptive Hyper Network"
License: MIT License
This looks like really attractive work for IQA, can't wait for the code!
Regarding the file "HyerIQASolver.py":
In line 20 you store the parameters of the HyperNetwork to be trained in the variable self.hypernet_params:
self.hypernet_params = filter(lambda p: id(p) not in backbone_params, self.model_hyper.parameters())
From lines 77 to 84 you update the optimizer with this variable. Since self.model_hyper.parameters() returns an iterator, self.hypernet_params is also an iterator, and once the optimizer is initialized it is exhausted. Therefore, after the first epoch the optimizer no longer optimizes the HyperNetwork, because self.hypernet_params is empty.
Storing self.hypernet_params as a list should fix the problem:
self.hypernet_params = list(filter(lambda p: id(p) not in backbone_params, self.model_hyper.parameters()))
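A minimal sketch (plain Python, no PyTorch needed) of why the bug occurs: filter() returns a one-shot iterator, so any code that iterates it twice sees an empty sequence the second time.

```python
# filter() yields a one-shot iterator: once consumed (e.g. by the optimizer
# constructor), a second iteration sees nothing.
weights = [0.1, 0.2, 0.3, 0.4]
backbone = {id(weights[0]), id(weights[1])}   # pretend these belong to the backbone

hypernet_params = filter(lambda p: id(p) not in backbone, weights)
first_pass = list(hypernet_params)    # consumed here
second_pass = list(hypernet_params)   # empty: the iterator is exhausted

print(first_pass)    # [0.3, 0.4]
print(second_pass)   # []

# Materializing the filter as a list makes it reusable:
hypernet_params = list(filter(lambda p: id(p) not in backbone, weights))
assert list(hypernet_params) == [0.3, 0.4]
```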
The cited paper doesn't seem to mention this.
The data augmentation part of this code differs from the implementation details in the paper. The paper says: "we randomly sample and horizontally flipping 25 patches with size 224x224 pixels ... ", and "During testing stage, 25 patches with 224x224 pixels from test image are randomly sampled and their corresponding ... ", but I cannot find the part that randomly crops 25 patches. Can you point out where it is? All I find is the usual torchvision.transforms.Compose in data_loader.py, not a 25-patch random crop.
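Not an official recipe from this repo, but a framework-agnostic NumPy sketch of what the paper describes (25 random 224×224 crops with random horizontal flip, predictions average-pooled at test time); `sample_patches` and its defaults are my own naming:

```python
import numpy as np

def sample_patches(img, patch_size=224, n_patches=25, rng=None):
    """Randomly sample n_patches crops of patch_size x patch_size from an
    HxWxC image, each horizontally flipped with probability 0.5."""
    if rng is None:
        rng = np.random.default_rng()
    h, w = img.shape[:2]
    patches = []
    for _ in range(n_patches):
        top = rng.integers(0, h - patch_size + 1)
        left = rng.integers(0, w - patch_size + 1)
        patch = img[top:top + patch_size, left:left + patch_size]
        if rng.random() < 0.5:
            patch = patch[:, ::-1]   # horizontal flip
        patches.append(patch)
    return patches

# At test time, per-patch predictions would be average-pooled:
# score = np.mean([model(p) for p in sample_patches(test_img)])
```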
Hello, thank you for sharing your great work. I have a question about the performance of WaDIQaM: as shown in Table 1, it achieved 0.955 SRCC and 0.973 PLCC on the CSIQ database. How did you get that score? Was it trained on the LIVE database?
Can you share the CSIQ database? The website http://vision.okstate.edu/csiq/ is broken.
Where is the code?
Lines 31 to 35 in 685d4af
Line 191 in 685d4af
Plus: according to the PyTorch documentation for Resize, the 'size' parameter refers to (height, width).
Hello! We kindly invite you to participate in our video quality metrics benchmark. You can submit hyperIQA to the benchmark, following the submission steps, described here. The dataset distortions refer to compression artifacts on professional and user-generated content. The full dataset is used to measure methods overall performance, so we do not share it to avoid overfitting. Nevertheless, we provided the open part of it (around 1,000 videos) within our paper "Video compression dataset and benchmark of learning-based video-quality metrics", accepted to NeurIPS 2022.
Dear author,
Could you please share the trained model for easier comparison and study?
Thank you very much.
Hi, one question: if the images to be evaluated have inconsistent sizes, will torchvision.transforms.Resize((512, 384)) in the demo distort some of them and affect the evaluation results?
Hello, su. Can you share the GFIQA dataset from "Going the Extra Mile in Face Image Quality Assessment: A Novel Database and Model"? The link http://database.mmsp-kn.de/gfiqa-20k-database.html is broken.
Thank you very much for your work! I have a question: in the hyperiqasolver.py file, why is param.requires_grad set to False? Looking forward to your answer.
model_target = models.TargetNet(paras).cuda()
for param in model_target.parameters():
param.requires_grad = False
I used some of my own data to train the model and want to calculate the score of images one by one.
During training, I added some code to calculate the L1 norm, such as the code below:
```python
for img, label in data:
    img = img.cuda().clone().detach()
    label = label.cuda(non_blocking=True).clone().detach()
    paras = self.model_hyper(img)
    model_target = models.TargetNet(paras).cuda()
    model_target.train(False)
    pred = model_target(paras['target_in_vec'])
    pred_scores.append(float(pred.item()))
    gt_scores = gt_scores + label.cpu().tolist()
pred_scores_np = np.array(pred_scores, dtype=np.float32)
gt_scores_np = np.array(gt_scores, dtype=np.float32)
l1_norm_test = np.absolute(pred_scores_np - gt_scores_np)
l1_norm_test = np.sum(l1_norm_test) / len(l1_norm_test)
```
Then I wrote the inference code, such as the code below:
```python
mean_RGB = [123.675, 116.28, 103.53]
std_RGB = [58.395, 57.12, 57.375]
model_hyper = models.HyperNet(16, 112, 224, 112, 56, 28, 14, 7).cuda()
model_hyper.load_state_dict(torch.load(args.pretrained_model_name_hyper))
model_hyper.train(False)
I = Image.open(imgName)
I = I.convert("RGB")
I_ = I.resize((224, 224))
I_np = np.asarray(I_, dtype=np.float32).copy()
I_np[:, :, 0] = (I_np[:, :, 0] - mean_RGB[0]) / std_RGB[0]
I_np[:, :, 1] = (I_np[:, :, 1] - mean_RGB[1]) / std_RGB[1]
I_np[:, :, 2] = (I_np[:, :, 2] - mean_RGB[2]) / std_RGB[2]
I_np = I_np.transpose(2, 0, 1)
with torch.no_grad():
    input_var = torch.from_numpy(I_np).unsqueeze(0).float().cuda()
    paras = model_hyper(input_var)
    model_target = models.TargetNet(paras).cuda()
    model_target.load_state_dict(torch.load(args.pretrained_model_name_target))
    model_target.train(False)
    pred = model_target(paras['target_in_vec']).cpu()
```
Here, pretrained_model_name_hyper and pretrained_model_name_target are the hyper and target models saved during training.
During training I got a minimum average L1 norm of 2.88, but in testing I got 11.13.
Is there anything wrong with the code?
I suspect there may be a problem in how the pretrained target model is loaded.
I retrained the model with koniq10k and found that its performance can't match the performance you reported.
The paper mentions: "Before calculating PLCC, logistic regression is first applied to remove nonlinear rating caused by human visual observation, as suggested in the report from Video Quality Expert Group (VQEG) [11]."
Can you please share the logistic regression function that was used?
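The repo does not appear to include this step, and the paper does not state which variant was used, so the following is only an assumption: a common choice in IQA evaluation is a 5-parameter logistic mapping fitted with `scipy.optimize.curve_fit` before computing PLCC. `fitted_plcc` is my own naming:

```python
import numpy as np
from scipy import stats
from scipy.optimize import curve_fit

def logistic_5param(x, b1, b2, b3, b4, b5):
    # One common VQEG-style 5-parameter logistic (an assumed variant,
    # not confirmed by the paper).
    return b1 * (0.5 - 1.0 / (1.0 + np.exp(b2 * (x - b3)))) + b4 * x + b5

def fitted_plcc(pred, mos):
    """Fit the logistic to (pred, mos), then compute PLCC on the mapped scores."""
    p0 = [np.max(mos), 1.0, np.mean(pred), 1.0, np.mean(mos)]  # rough init
    popt, _ = curve_fit(logistic_5param, pred, mos, p0=p0, maxfev=10000)
    mapped = logistic_5param(pred, *popt)
    return stats.pearsonr(mapped, mos)[0]
```

SRCC is rank-based and therefore unaffected by this monotone remapping; only PLCC (and RMSE, if reported) changes.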
Hey, thanks for your great work.
I viewed the code and found that there is no label normalization, e.g. normalizing scores to the range [0, 1]. It's fine not to normalize when training and testing on the same dataset, or on datasets with similar ranges.
But in the paper, Table 3 lists 3 datasets (LIVEC, BID and KonIQ) which have different score ranges. Is it reasonable to use raw scores, or did you normalize them?
Looking forward to your reply.
Hello, thank you for your valuable code. In HyperIQA training and testing, they "randomly sample and horizontally flipping 25 patches with size 224×224 pixels from each training image for augmentation." and "During testing stage, 25 patches with 224×224 pixels from test image are randomly sampled and their corresponding prediction scores are average pooled to get the final quality score."
I wonder how I can train and test my model (or HyperIQA) the way the paper describes?
As in the title: will this affect the results?
Could you please provide a valid Google Drive link for the pre-trained model? I can't download it from the Baidu link. Thanks.
How can I use more than one GPU?
Thank you!
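Not from this repo, but the standard PyTorch pattern is `nn.DataParallel`; note that HyperIQA's TargetNet builds per-image weights, so splitting a batch across GPUs may need extra care. A minimal sketch with a stand-in module (the `nn.Linear` here is just a placeholder for `models.HyperNet(...)`):

```python
import torch
import torch.nn as nn

# Stand-in for models.HyperNet(16, 112, 224, 112, 56, 28, 14, 7)
model_hyper = nn.Linear(16, 1)

# Wrap in DataParallel only when several GPUs are visible; batches are then
# split along dim 0 and scattered across devices.
if torch.cuda.device_count() > 1:
    model_hyper = nn.DataParallel(model_hyper)
if torch.cuda.is_available():
    model_hyper = model_hyper.cuda()

device = next(model_hyper.parameters()).device
out = model_hyper(torch.randn(4, 16).to(device))
```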
such as c1 = 0, c2 = 0, c3 = 25, c4 = 73, c5 = 7, c_total = 105, MOS = 3.83, SD = 0.53
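Assuming c1..c5 are the counts of 1-5 star ratings (my reading of the numbers, not confirmed by the source), the quoted MOS and SD follow directly, with SD computed as the sample standard deviation:

```python
import math

# Hypothetical reading: c1..c5 = counts of 1-5 star ratings
counts = {1: 0, 2: 0, 3: 25, 4: 73, 5: 7}
n = sum(counts.values())                                   # 105
mos = sum(star * c for star, c in counts.items()) / n      # 402/105 = 3.8286
var = sum(c * (star - mos) ** 2 for star, c in counts.items()) / (n - 1)
sd = math.sqrt(var)                                        # sample SD

print(round(mos, 2), round(sd, 2))  # 3.83 0.53
```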
Hello! Thanks for the code, but I can not find the "csiq_label.txt", can you release it?
Hello, I found that when I use your pre-trained model to evaluate noise-free images and noisy images (salt-and-pepper noise, intensity 0.1), the noisy images score higher than the corresponding noise-free ones. How can I fix this?
We found in your models.py the line res_out = self.res(img), but res and img are not defined. Can you explain this?
Hi, thanks for your IQA work!
I noticed that you resize the input images of the koniq-10k dataset.
Since the dataset comes in two resolutions, if you use 512 x 384 as the input size, why not use the 512 x 384 images already provided by koniq-10k?
Hello,
I see the paper states: "Before calculating PLCC, logistic regression is first applied to remove nonlinear rating caused by human visual observation, as suggested in the report from Video Quality Expert Group (VQEG)."
But after reading the code you provided, I did not find this operation. 🤔
Which step of the code does it correspond to?
Could you please release the training dataset? Thank you!
Hi, thanks for your work!
I have a question: if I want to resume training, how do I restore the optimizer's latest state? I see some optimizer updating in the code, but it seems the optimizer state cannot be restored when resuming training, no?
I found that the train/test split of the dataset plays an important part in the resulting SRCC and PLCC. Can you make your split public?
hello do you know where i can download the KONIQ-10K dataset? thank you very much
I trained on the CLIVE dataset with train_test_IQA.py as you suggested, but the results did not match those in your paper. The parameters were set to the code's defaults, e.g. epochs = 16. Did the paper report the best result over multiple training runs? Could you please clarify?
I want to train the model on the LIVE database, but there is an error about 'dmos_realigned.mat'. The code is:
```python
dmos = scipy.io.loadmat(os.path.join(root, 'dmos_realigned.mat'))
labels = dmos['dmos_new'].astype(np.float32)
```
Is it correct to replace 'dmos_realigned.mat' with 'dmos.mat' and 'dmos_new' with 'dmos' (using the mat file in the LIVE database), or did you process the data in some other way?
Thank you
Hello! Where can I download the BID database?
It is hard to reproduce the results mentioned in the paper; I request you to release the code.
Hello! Running the code with the paper's setup, I got the following error. It seems the predicted values become NaN. Why?
File "/home/wxq/project/NR_IQA/hyperIQA-master/hyperIQA-master/HyerIQASolver.py", line 112, in test
test_plcc, _ = stats.pearsonr(pred_scores, gt_scores)
File "/home/wxq/.local/lib/python3.8/site-packages/scipy/stats/stats.py", line 3530, in pearsonr
normxm = linalg.norm(xm)
File "/home/wxq/.local/lib/python3.8/site-packages/scipy/linalg/misc.py", line 142, in norm
a = np.asarray_chkfinite(a)
File "/home/wxq/conda/lib/python3.8/site-packages/numpy/lib/function_base.py", line 485, in asarray_chkfinite
raise ValueError(
ValueError: array must not contain infs or NaNs
I want to know how to test a single image and obtain its score. There is a test function in your HyerIQASolver.py, but I don't know how to do the preprocessing, because data_loader.py needs information such as the database, while I just want to test a single image.
Hi. I ran demo.py with the same image and the same parameters, but the results changed every time.
Why does this happen?
Why do we need to set train_patch_num? Won't the same images be trained repeatedly without change?
Lines 50 to 56 in 3c4fe13
Hi, thanks for releasing your code. I find that you set requires_grad=False for the target networks. This is a little confusing: since the parameters of the target network are generated by the HyperNet, would setting requires_grad to False block the target net's loss from backpropagating to the HyperNet's parameters, preventing them from updating?
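A toy sketch (hypothetical shapes, not the repo's code) suggesting why gradients can still reach the hyper network: the generated weights are autograd outputs, so applying them functionally keeps the graph connected regardless of any requires_grad flag on separately stored parameter copies:

```python
import torch
import torch.nn.functional as F

hyper = torch.nn.Linear(8, 4 * 8 + 4)   # stand-in "hyper network"
x = torch.randn(1, 8)

out = hyper(x)
w = out[:, :32].reshape(4, 8)           # generated weight for the "target net"
b = out[:, 32:].reshape(4)              # generated bias
pred = F.linear(x, w, b)                # functional target-net forward

pred.sum().backward()
assert hyper.weight.grad is not None    # gradient reached the hyper network
```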
Hi, for blur in real-world scenes, e.g. local blur, is it reasonable to use the MOS score as the quality label of every random patch?
Thanks for your code. I want to train the model, save it, and then test. However, I'm not familiar with PyTorch and don't understand how to save the model in your code, or where it is stored after saving. Please help me. Thank you very much.
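Not specific to this repo, but the standard PyTorch save/load pattern looks like this; the module and checkpoint path below are placeholders, not names from the code:

```python
import os
import tempfile
import torch
import torch.nn as nn

net = nn.Linear(4, 1)                           # placeholder for the trained model
ckpt = os.path.join(tempfile.mkdtemp(), 'model_hyper.pth')

torch.save(net.state_dict(), ckpt)              # save only the weights

net2 = nn.Linear(4, 1)                          # a model with the same architecture
net2.load_state_dict(torch.load(ckpt))          # load the saved weights into it
assert torch.equal(net.weight, net2.weight)
```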
Thanks for releasing your code! I have a question about the koniq-10k and BID datasets. In data_loader.py, why do you resize images to (512, 384) for koniq-10k and to (512, 512) for BID?
Epoch Train_Loss Train_SROCC Test_SROCC Test_PLCC
1 5.168 0.8779 0.9013 0.9201
2 3.330 0.9497 0.9023 0.9192
3 2.853 0.9629 0.9011 0.9181
4 2.557 0.9700 0.9004 0.9179
5 2.349 0.9746 0.8970 0.9123
6 2.185 0.9779 0.9007 0.9161
7 2.053 0.9805 0.8949 0.9121
8 1.934 0.9827 0.8949 0.9094
9 1.833 0.9844 0.8967 0.9135
10 1.764 0.9856 0.8946 0.9115
When I trained on the koniq-10k dataset, Test_SROCC and Test_PLCC decreased as the epochs increased while Train_SROCC increased. Is this overfitting?
In PyTorch, torchvision.transforms.Resize((512, 384)) resizes the output to height 512, width 384, while the original image is height 768, width 1024. The aspect ratio is inverted; won't this affect the results?