
hyperiqa's People

Contributors

ssl92


hyperiqa's Issues

Push for code~

This seems to be really attractive work for IQA; I can't wait for the code!

HyperNetWork is not being trained

With respect to the file "HyerIQASolver.py"

In line 20 you store the parameters of the HyperNetwork to be trained in the variable self.hypernet_params:

self.hypernet_params = filter(lambda p: id(p) not in backbone_params, self.model_hyper.parameters())

From lines 77 to 84 you update the optimizer with this variable self.hypernet_params.

Since self.model_hyper.parameters() returns an iterator, self.hypernet_params is itself an iterator, and it is exhausted as soon as the optimizer is initialized. Therefore, after the first epoch the optimizer no longer optimizes the HyperNetwork, because self.hypernet_params is empty.

Setting self.hypernet_params as a list should fix the problem: self.hypernet_params = list(filter(lambda p: id(p) not in backbone_params, self.model_hyper.parameters()))
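
For reference, a minimal self-contained sketch of the iterator-exhaustion behavior described above (illustrative code, not taken from the repo):

import torch

params = [torch.nn.Parameter(torch.randn(2)) for _ in range(3)]
hypernet_params = filter(lambda p: True, params)  # a one-shot iterator

print(len(list(hypernet_params)))  # 3 -- the first pass consumes the iterator
print(len(list(hypernet_params)))  # 0 -- the second pass finds it exhausted

# Materializing the filter as a list makes it reusable across epochs:
hypernet_params = list(filter(lambda p: True, params))
print(len(hypernet_params))  # 3, every time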

Problems of Implementation Details. Some differences?

The data augmentation in this code differs from the implementation details in the paper. The paper says: "we randomly sample and horizontally flipping 25 patches with size 224x224 pixels ...", and "During testing stage, 25 patches with 224x224 pixels from test image are randomly sampled and their corresponding ...", but I cannot find the part that randomly crops 25 patches. Can you point out where it is? All I find in data_loader.py is an ordinary torchvision.transforms.Compose, not a random crop of 25 patches.
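
For illustration, a hedged sketch (not the repo's code) of what the paper describes: sample 25 random 224x224 patches with horizontal flipping, score each, and average; "model" and "test.jpg" are hypothetical placeholders.

import torch
import torchvision.transforms as T
from PIL import Image

patch_transform = T.Compose([
    T.RandomCrop(224),
    T.RandomHorizontalFlip(),
    T.ToTensor(),
])

img = Image.open("test.jpg").convert("RGB")
patches = torch.stack([patch_transform(img) for _ in range(25)])  # (25, 3, 224, 224)
# scores = model(patches)       # hypothetical model call
# final_score = scores.mean()   # average-pool the 25 patch predictions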

Question about the performance of SOTA method.

Hello, thank you for sharing your great work. I have a question about the performance of WaDIQaM: as shown in Table 1, it achieved 0.955 SRCC and 0.973 PLCC on the CSIQ database. How did you get that score? Was it trained on the LIVE database?

about data process

hyperIQA/data_loader.py

Lines 31 to 35 in 685d4af

elif dataset == 'koniq-10k':
    if istrain:
        transforms = torchvision.transforms.Compose([
            torchvision.transforms.RandomHorizontalFlip(),
            torchvision.transforms.Resize((512, 384)),

sample.append((os.path.join(root, '1024x768', imgname[item]), mos_all[item]))

From the two code snippets above, images are loaded from the '1024x768' folder, i.e. with width 1024 and height 768. However, the Resize((512, 384)) operation rescales the dimensions to 512 and 384, noticeably changing the aspect ratio from the original 768:1024 to 512:384. I'm curious whether the same processing was applied in the experimental setup of the paper.

Plus: according to the PyTorch documentation for Resize, the 'size' parameter refers to (height, width).
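
A quick way to verify the (height, width) convention, assuming only that PIL and torchvision are installed:

from PIL import Image
import torchvision.transforms as T

img = Image.new("RGB", (1024, 768))  # PIL size is (width, height)
out = T.Resize((512, 384))(img)
print(out.size)  # -> (384, 512): width 384, height 512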

MSU Video Quality Metrics Benchmark Invitation

Hello! We kindly invite you to participate in our video quality metrics benchmark. You can submit hyperIQA to the benchmark by following the submission steps described here. The dataset's distortions are compression artifacts on professional and user-generated content. The full dataset is used to measure methods' overall performance, so we do not share it, to avoid overfitting. Nevertheless, we provide an open part of it (around 1,000 videos) with our paper "Video compression dataset and benchmark of learning-based video-quality metrics", accepted to NeurIPS 2022.

trained model

Dear author,

Could you please share the trained model for easier comparison and study?

Thank you very much.

About the resize in the demo

Hi, one question: if a batch of images to be evaluated have inconsistent sizes, will the torchvision.transforms.Resize((512, 384)) in the demo distort some of the images and affect the evaluation results?

In the HyerIQASolver.py file, why param.requires_grad = False?

Thank you very much for your work! I have some questions; please help me. In the HyerIQASolver.py file, why is param.requires_grad set to False? Looking forward to your answer.
model_target = models.TargetNet(paras).cuda()
for param in model_target.parameters():
    param.requires_grad = False

About inference code

I used some data of my own to train the model and want to calculate the score of images one by one.

During training, I added some code to calculate the L1 norm, as below:

for img, label in data:
    img = img.cuda().clone().detach()
    label = label.cuda(non_blocking=True).clone().detach()  # 'async' is a reserved word in Python 3.7+; use non_blocking

    paras = self.model_hyper(img)
    model_target = models.TargetNet(paras).cuda()
    model_target.train(False)
    pred = model_target(paras['target_in_vec'])

    pred_scores.append(float(pred.item()))
    gt_scores = gt_scores + label.cpu().tolist()

pred_scores_np = np.array(pred_scores, dtype=np.float32)
gt_scores_np = np.array(gt_scores, dtype=np.float32)
l1_norm_test = np.absolute(pred_scores_np - gt_scores_np)
l1_norm_test = np.sum(l1_norm_test) / len(l1_norm_test)

Then I wrote the inference code, as below:

mean_RGB = [123.675, 116.28, 103.53]
std_RGB = [58.395, 57.12, 57.375]

model_hyper = models.HyperNet(16, 112, 224, 112, 56, 28, 14, 7).cuda()
model_hyper.load_state_dict(torch.load(args.pretrained_model_name_hyper))
model_hyper.train(False)

I = Image.open(imgName)
I = I.convert("RGB")
I_ = I.resize((224, 224))

I_np = np.asarray(I_, dtype=np.float32).copy()
I_np[:, :, 0] = (I_np[:, :, 0] - mean_RGB[0]) / std_RGB[0]
I_np[:, :, 1] = (I_np[:, :, 1] - mean_RGB[1]) / std_RGB[1]
I_np[:, :, 2] = (I_np[:, :, 2] - mean_RGB[2]) / std_RGB[2]
I_np = I_np.transpose(2, 0, 1)

with torch.no_grad():
    # Variable(..., volatile=True) is deprecated; torch.no_grad() already disables autograd
    input_var = torch.from_numpy(I_np).unsqueeze(0).float().cuda()

    paras = model_hyper(input_var)

    model_target = models.TargetNet(paras).cuda()
    model_target.load_state_dict(torch.load(args.pretrained_model_name_target))
    model_target.train(False)

    pred = model_target(paras['target_in_vec']).cpu()

Here, pretrained_model_name_hyper and pretrained_model_name_target are the hyper and target models saved during training.

During training I got a minimum average L1 norm of 2.88, but at test time I got 11.13.
Is there anything wrong with the code?

I suspect there is a problem in the loading of the pretrained target model.

How to train model with koniq10k

I retrained the model on KonIQ-10k and found that its performance can't match the results you provided.

regarding the logistic regression before plcc

The paper mentions:

"Before calculating PLCC, logistic regression is first applied to remove nonlinear rating caused by human visual observation, as suggested in the report from Video Quality Expert Group (VQEG) [11]."

Can you please share the logistic regression function that is used?
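
The repo does not include the authors' exact function, but a common choice in IQA papers following the VQEG report is a 5-parameter logistic fitted before Pearson correlation; a sketch under that assumption:

import numpy as np
from scipy import stats
from scipy.optimize import curve_fit

def logistic_5(x, b1, b2, b3, b4, b5):
    # logistic mapping with an added linear term, as in VQEG-style evaluations
    return b1 * (0.5 - 1.0 / (1.0 + np.exp(b2 * (x - b3)))) + b4 * x + b5

def plcc_after_logistic(pred, mos):
    pred, mos = np.asarray(pred, float), np.asarray(mos, float)
    p0 = [np.max(mos), 1.0, np.mean(pred), 0.0, np.mean(mos)]  # rough initialization
    popt, _ = curve_fit(logistic_5, pred, mos, p0=p0, maxfev=10000)
    return stats.pearsonr(logistic_5(pred, *popt), mos)[0]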

about label/score normalization

Hey, thanks for your great work.
I viewed the code and found that there is no label normalization, e.g. normalizing scores to the range [0, 1]. It's fine not to normalize when training and testing on the same dataset, or on datasets with similar score ranges.
In the paper, Table 3 lists 3 datasets (LIVEC, BID and KonIQ) which have different score ranges. Is it reasonable to use raw scores, or did you normalize them?
Looking forward to your reply.
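
If normalization is wanted, a simple min-max rescaling would look like this (a sketch; the repo itself appears to train on raw scores):

import numpy as np

def to_unit_range(scores):
    s = np.asarray(scores, dtype=np.float32)
    return (s - s.min()) / (s.max() - s.min())

print(to_unit_range([12.3, 55.0, 87.1]))  # -> [0.0, ~0.571, 1.0]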

HyperIQA resize

Hello, thank you for your valuable code. In HyperIQA training and testing, they "randomly sample and horizontally flipping 25 patches with size 224×224 pixels from each training image for augmentation." and "During testing stage, 25 patches with 224×224 pixels from test image are randomly sampled and their corresponding prediction scores are average pooled to get the final quality score."
I wonder how I can train and test my own model (or HyperIQA) the same way they do in HyperIQA?

Evaluation of noise image quality

Hello, I found that when I use your pre-trained model to evaluate noise-free images and noisy images (salt-and-pepper noise, intensity 0.1), the noisy images score higher than the corresponding noise-free images. How can I fix this?

Code question

We found this line in your models.py: res_out = self.res(img). The res and img are not defined. Can you explain this?

Why not using 512 x 384 images of koniq10k to avoid resizing?

Hi, thanks for your IQA work!
I noticed that you resize the input images of the KonIQ-10k dataset.
Since KonIQ-10k is provided at two resolutions, if you use 512 x 384 as the input size, why not use the 512 x 384 images provided by KonIQ-10k directly?

About the PLCC calculation

Hello,
Before calculating PLCC, I see this passage in your paper: "Before calculating PLCC, logistic regression is first applied to remove nonlinear rating caused by human visual observation, as suggested in the report from Video Quality Expert Group (VQEG)."
However, after reading the code you provided, I did not find this operation 🤔
Which step of the code implements it?

Resume training

Hi, thanks for your work!
I have a question: if you want to resume training, how do you restore the optimizer to its latest state? I see the optimizer being updated in the code, but it seems its state cannot be recovered when you resume training, no?
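
A generic PyTorch checkpoint/resume pattern (a sketch, not this repo's code; the tiny Linear model stands in for the actual networks):

import torch

model = torch.nn.Linear(4, 1)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
epoch = 5

# save model and optimizer state together
torch.save({'epoch': epoch,
            'model': model.state_dict(),
            'optimizer': optimizer.state_dict()}, 'checkpoint.pth')

# restore both before continuing training
ckpt = torch.load('checkpoint.pth')
model.load_state_dict(ckpt['model'])
optimizer.load_state_dict(ckpt['optimizer'])
start_epoch = ckpt['epoch'] + 1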

koniq-10k dataset

Hello, do you know where I can download the KonIQ-10k dataset? Thank you very much.

The reproduced result is not consistent with your paper. Is there any problem?

I used train_test_IQA.py to train on the CLIVE dataset as you suggested, but the result was not the same as what you showed in your paper. The parameters were set to the code's defaults, e.g. epochs=16. Did you report the best value over multiple training runs in your paper? Could you please clarify my doubts?

the 'dmos_realigned.mat' of LIVE database

I want to train the model on the LIVE database, but there is an error about 'dmos_realigned.mat'. The code is here:

dmos = scipy.io.loadmat(os.path.join(root, 'dmos_realigned.mat'))
labels = dmos['dmos_new'].astype(np.float32)

Is it correct to replace 'dmos_realigned.mat' with 'dmos.mat' and 'dmos_new' with 'dmos' (using the .mat file shipped with the LIVE database), or have you processed the data in some other way?

Thank you
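
For reference, loading LIVE's stock dmos.mat as the question proposes would look like this (a sketch; the path is a hypothetical placeholder and the key names follow the question above, so verify them against your local file):

import os
import numpy as np
import scipy.io

root = '/path/to/LIVE'  # hypothetical path
mat = scipy.io.loadmat(os.path.join(root, 'dmos.mat'))
labels = mat['dmos'].astype(np.float32)
print(labels.shape)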

Please release the code

It is hard to reproduce the results mentioned in the paper; we request that you release the code.

stats.pearsonr Error?

Hello! Running the code as set up in the paper, I got the following error. It seems the predicted values are NaN. Why?
File "/home/wxq/project/NR_IQA/hyperIQA-master/hyperIQA-master/HyerIQASolver.py", line 112, in test
test_plcc, _ = stats.pearsonr(pred_scores, gt_scores)
File "/home/wxq/.local/lib/python3.8/site-packages/scipy/stats/stats.py", line 3530, in pearsonr
normxm = linalg.norm(xm)
File "/home/wxq/.local/lib/python3.8/site-packages/scipy/linalg/misc.py", line 142, in norm
a = np.asarray_chkfinite(a)
File "/home/wxq/conda/lib/python3.8/site-packages/numpy/lib/function_base.py", line 485, in asarray_chkfinite
raise ValueError(
ValueError: array must not contain infs or NaNs
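
A quick debugging sketch (not repo code) for locating non-finite predictions before the pearsonr call; pred_scores stands in for the list built in test():

import numpy as np

pred_scores = [0.5, float('nan'), 0.7]  # hypothetical values
pred = np.asarray(pred_scores, dtype=np.float32)
bad = np.where(~np.isfinite(pred))[0]
print(f"{bad.size} non-finite predictions at indices {bad}")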

how to test an image and get the score?

I want to know how to test a single image and obtain its score. There is a test function in your HyerIQASolver.py, but I don't know how to do the preprocessing, because data_loader.py needs inputs such as the database name, while I just want to test one image.
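
A minimal single-image scoring sketch, assuming the HyperNet/TargetNet usage quoted elsewhere in this thread; the checkpoint path, image path, and patch count are hypothetical placeholders:

import torch
import torchvision.transforms as T
from PIL import Image
import models

transform = T.Compose([
    T.RandomCrop(224),
    T.ToTensor(),
    T.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
])

model_hyper = models.HyperNet(16, 112, 224, 112, 56, 28, 14, 7).cuda()
model_hyper.load_state_dict(torch.load('pretrained_hyper.pth'))  # hypothetical file
model_hyper.train(False)

img = Image.open('test.jpg').convert('RGB')
scores = []
for _ in range(25):  # average over random patches
    patch = transform(img).unsqueeze(0).cuda()
    with torch.no_grad():
        paras = model_hyper(patch)
        model_target = models.TargetNet(paras).cuda()
        model_target.train(False)
        pred = model_target(paras['target_in_vec'])
    scores.append(pred.item())
print(sum(scores) / len(scores))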

Value changes every time

Hi, I ran demo.py with the same image and the same parameters, but the results changed every run. Why does this happen?
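
If the variation comes from random patch sampling (an assumption; see the patch-sampling discussion above), fixing the seeds should make runs repeatable:

import random
import numpy as np
import torch

random.seed(0)
np.random.seed(0)
torch.manual_seed(0)
torch.cuda.manual_seed_all(0)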

train_patch_num

Why do we need to set train_patch_num? Doesn't that mean the same images are trained repeatedly without change?

Why setting 'params.requires_grad=False' for target_network?

# Generate weights for target network
paras = self.model_hyper(img)  # 'paras' contains the network weights conveyed to target network

# Building target network
model_target = models.TargetNet(paras).cuda()
for param in model_target.parameters():
    param.requires_grad = False

Hi, thanks for releasing your code. I see that you set requires_grad=False for the target network, which is a little confusing. Since the parameters of the target network are generated by the HyperNet, wouldn't setting requires_grad to False block the target network's loss from backpropagating to the HyperNet's parameters, so that they never update?
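
A minimal sketch of why gradients can still reach the hypernetwork even so (not the repo's exact mechanism, but the same principle: when generated weights are applied functionally, autograd tracks them regardless of the target module's own requires_grad flags):

import torch
import torch.nn.functional as F

hyper_out = torch.randn(1, 5, requires_grad=True)  # stands in for HyperNet's output
x = torch.randn(2, 5)

y = F.linear(x, hyper_out)  # "target network" applies the generated weights
y.sum().backward()
print(hyper_out.grad is not None)  # True: the loss backpropagates to the hypernetwork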

about saving model

Thanks for your code. I want to train the model, save it, and then test. However, I'm not familiar with PyTorch, and I don't understand how to save the model in your code, or where it is stored after saving. Please help me. Thank you very much.

A question about koniq-10k and bid dataset

Thanks for releasing your code! I have a question about the KonIQ-10k and BID datasets. In data_loader.py, why do you resize images to (512, 384) for KonIQ-10k and to (512, 512) for BID?

Why do Test_SROCC and Test_PLCC decrease?

Epoch  Train_Loss  Train_SROCC  Test_SROCC  Test_PLCC
1      5.168       0.8779       0.9013      0.9201
2      3.330       0.9497       0.9023      0.9192
3      2.853       0.9629       0.9011      0.9181
4      2.557       0.9700       0.9004      0.9179
5      2.349       0.9746       0.8970      0.9123
6      2.185       0.9779       0.9007      0.9161
7      2.053       0.9805       0.8949      0.9121
8      1.934       0.9827       0.8949      0.9094
9      1.833       0.9844       0.8967      0.9135
10     1.764       0.9856       0.8946      0.9115

When I trained on the KonIQ-10k dataset, Test_SROCC and Test_PLCC decreased as the epochs increased, while Train_SROCC increased. Is it overfitting?

About resize in Koniq-10k dataset transforms

torchvision.transforms.Resize((512, 384)) in PyTorch resizes the output to height 512 and width 384, while the original image has height 768 and width 1024. The aspect ratio is inverted; won't this affect the results?
