
hyperiqa's People

Contributors

ssl92


hyperiqa's Issues

Push for code~

This seems to be really attractive work for IQA; I can't wait for the code!

HyperNetWork is not being trained

With respect to the file "HyerIQASolver.py"

In line 20 you store the parameters of the HyperNetwork to be trained in the variable self.hypernet_params:

self.hypernet_params = filter(lambda p: id(p) not in backbone_params, self.model_hyper.parameters())

From lines 77 to 84 you update the optimizer with this variable self.hypernet_params.

Since self.model_hyper.parameters() returns an iterator, self.hypernet_params is itself an iterator, and it is exhausted as soon as the optimizer is initialized. Therefore, after the first epoch the optimizer no longer optimizes the HyperNetwork, because self.hypernet_params is empty.

Setting self.hypernet_params as a list should fix the problem: self.hypernet_params = list(filter(lambda p: id(p) not in backbone_params, self.model_hyper.parameters()))
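
For reference, a minimal self-contained sketch of the iterator-exhaustion behavior described above (illustrative code, not taken from the repo):

import torch

params = [torch.nn.Parameter(torch.randn(2)) for _ in range(3)]
hypernet_params = filter(lambda p: True, params)  # a one-shot iterator

print(len(list(hypernet_params)))  # 3 -- the first pass consumes the iterator
print(len(list(hypernet_params)))  # 0 -- the second pass finds it exhausted

# Materializing the filter as a list makes it reusable across epochs:
hypernet_params = list(filter(lambda p: True, params))
print(len(hypernet_params))  # 3, every time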

Problems of Implementation Details. Some differences?

The data augmentation in this code differs from the implementation details in the paper. The paper says: "we randomly sample and horizontally flipping 25 patches with size 224x224 pixels ...", and "During testing stage, 25 patches with 224x224 pixels from test image are randomly sampled and their corresponding ...", but I cannot find the part that randomly crops 25 patches. Can you point out where it is? All I find in data_loader.py is an ordinary torchvision.transforms.Compose, not a random crop of 25 patches.
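
For illustration, a hedged sketch (not the repo's code) of what the paper describes: sample 25 random 224x224 patches with horizontal flipping, score each, and average; "model" and "test.jpg" are hypothetical placeholders.

import torch
import torchvision.transforms as T
from PIL import Image

patch_transform = T.Compose([
    T.RandomCrop(224),
    T.RandomHorizontalFlip(),
    T.ToTensor(),
])

img = Image.open("test.jpg").convert("RGB")
patches = torch.stack([patch_transform(img) for _ in range(25)])  # (25, 3, 224, 224)
# scores = model(patches)       # hypothetical model call
# final_score = scores.mean()   # average-pool the 25 patch predictions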

Question about the performance of SOTA method.

Hello, thank you for sharing your great work. I have a question about the performance of WaDIQaM: as shown in Table 1, it achieved 0.955 SRCC and 0.973 PLCC on the CSIQ database. How did you get that score? Was it trained on the LIVE database?

about data process

hyperIQA/data_loader.py

Lines 31 to 35 in 685d4af

elif dataset == 'koniq-10k':
    if istrain:
        transforms = torchvision.transforms.Compose([
            torchvision.transforms.RandomHorizontalFlip(),
            torchvision.transforms.Resize((512, 384)),

sample.append((os.path.join(root, '1024x768', imgname[item]), mos_all[item]))

From the two code snippets above, images are loaded from the '1024x768' folder, i.e. with width 1024 and height 768. However, the Resize((512, 384)) operation rescales the dimensions to 512 and 384, noticeably changing the aspect ratio from the original 768:1024 to 512:384. I'm curious whether the same processing was applied in the experimental setup of the paper.

Plus: according to the PyTorch documentation for Resize, the 'size' parameter refers to (height, width).
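
A quick way to verify the (height, width) convention, assuming only that PIL and torchvision are installed:

from PIL import Image
import torchvision.transforms as T

img = Image.new("RGB", (1024, 768))  # PIL size is (width, height)
out = T.Resize((512, 384))(img)
print(out.size)  # -> (384, 512): width 384, height 512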

MSU Video Quality Metrics Benchmark Invitation

Hello! We kindly invite you to participate in our video quality metrics benchmark. You can submit hyperIQA to the benchmark by following the submission steps described here. The dataset's distortions are compression artifacts on professional and user-generated content. The full dataset is used to measure methods' overall performance, so we do not share it, to avoid overfitting. Nevertheless, we provide an open part of it (around 1,000 videos) with our paper "Video compression dataset and benchmark of learning-based video-quality metrics", accepted to NeurIPS 2022.

trained model

Dear author,

Could you please share the trained model for easier comparison and study?

Thank you very much.

About the resize in the demo

Hi, one question: if a batch of images to be evaluated have inconsistent sizes, will the torchvision.transforms.Resize((512, 384)) in the demo distort some of the images and affect the evaluation results?

In the HyerIQASolver.py file, why param.requires_grad = False?

Thank you very much for your work! I have some questions; please help me. In the HyerIQASolver.py file, why is param.requires_grad set to False? Looking forward to your answer.
model_target = models.TargetNet(paras).cuda()
for param in model_target.parameters():
    param.requires_grad = False

About inference code

I used some data of my own to train the model and want to calculate the score of images one by one.

During training, I added some code to calculate the L1 norm, as below:

for img, label in data:
    img = img.cuda().clone().detach()
    label = label.cuda(non_blocking=True).clone().detach()  # 'async' is a reserved word in Python 3.7+; use non_blocking

    paras = self.model_hyper(img)
    model_target = models.TargetNet(paras).cuda()
    model_target.train(False)
    pred = model_target(paras['target_in_vec'])

    pred_scores.append(float(pred.item()))
    gt_scores = gt_scores + label.cpu().tolist()

pred_scores_np = np.array(pred_scores, dtype=np.float32)
gt_scores_np = np.array(gt_scores, dtype=np.float32)
l1_norm_test = np.absolute(pred_scores_np - gt_scores_np)
l1_norm_test = np.sum(l1_norm_test) / len(l1_norm_test)

Then I wrote the inference code, as below:

mean_RGB = [123.675, 116.28, 103.53]
std_RGB = [58.395, 57.12, 57.375]

model_hyper = models.HyperNet(16, 112, 224, 112, 56, 28, 14, 7).cuda()
model_hyper.load_state_dict(torch.load(args.pretrained_model_name_hyper))
model_hyper.train(False)

I = Image.open(imgName)
I = I.convert("RGB")
I_ = I.resize((224, 224))

I_np = np.asarray(I_, dtype=np.float32).copy()
I_np[:, :, 0] = (I_np[:, :, 0] - mean_RGB[0]) / std_RGB[0]
I_np[:, :, 1] = (I_np[:, :, 1] - mean_RGB[1]) / std_RGB[1]
I_np[:, :, 2] = (I_np[:, :, 2] - mean_RGB[2]) / std_RGB[2]
I_np = I_np.transpose(2, 0, 1)

with torch.no_grad():
    # Variable(..., volatile=True) is deprecated; torch.no_grad() already disables autograd
    input_var = torch.from_numpy(I_np).unsqueeze(0).float().cuda()

    paras = model_hyper(input_var)

    model_target = models.TargetNet(paras).cuda()
    model_target.load_state_dict(torch.load(args.pretrained_model_name_target))
    model_target.train(False)

    pred = model_target(paras['target_in_vec']).cpu()

Here, pretrained_model_name_hyper and pretrained_model_name_target are the hyper and target models saved during training.

During training I got a minimum average L1 norm of 2.88, but at test time I got 11.13.
Is there anything wrong with the code?

I suspect there is a problem in the loading of the pretrained target model.

How to train model with koniq10k

I retrained the model on KonIQ-10k and found that its performance can't match the results you provided.

regarding the logistic regression before plcc

The paper mentions:

"Before calculating PLCC, logistic regression is first applied to remove nonlinear rating caused by human visual observation, as suggested in the report from Video Quality Expert Group (VQEG) [11]."

Can you please share the logistic regression function that is used?
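
The repo does not include the authors' exact function, but a common choice in IQA papers following the VQEG report is a 5-parameter logistic fitted before Pearson correlation; a sketch under that assumption:

import numpy as np
from scipy import stats
from scipy.optimize import curve_fit

def logistic_5(x, b1, b2, b3, b4, b5):
    # logistic mapping with an added linear term, as in VQEG-style evaluations
    return b1 * (0.5 - 1.0 / (1.0 + np.exp(b2 * (x - b3)))) + b4 * x + b5

def plcc_after_logistic(pred, mos):
    pred, mos = np.asarray(pred, float), np.asarray(mos, float)
    p0 = [np.max(mos), 1.0, np.mean(pred), 0.0, np.mean(mos)]  # rough initialization
    popt, _ = curve_fit(logistic_5, pred, mos, p0=p0, maxfev=10000)
    return stats.pearsonr(logistic_5(pred, *popt), mos)[0]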

about label/score normalization

Hey, thanks for your great work.
I viewed the code and found that there is no label normalization, e.g. normalizing scores to the range [0, 1]. It's fine not to normalize when training and testing on the same dataset, or on datasets with similar score ranges.
In the paper, Table 3 lists 3 datasets (LIVEC, BID and KonIQ) which have different score ranges. Is it reasonable to use raw scores, or did you normalize them?
Looking forward to your reply.
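
If normalization is wanted, a simple min-max rescaling would look like this (a sketch; the repo itself appears to train on raw scores):

import numpy as np

def to_unit_range(scores):
    s = np.asarray(scores, dtype=np.float32)
    return (s - s.min()) / (s.max() - s.min())

print(to_unit_range([12.3, 55.0, 87.1]))  # -> [0.0, ~0.571, 1.0]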

HyperIQA resize

Hello, thank you for your valuable code. In HyperIQA training and testing, they "randomly sample and horizontally flipping 25 patches with size 224×224 pixels from each training image for augmentation." and "During testing stage, 25 patches with 224×224 pixels from test image are randomly sampled and their corresponding prediction scores are average pooled to get the final quality score."
I wonder how I can train and test my own model (or HyperIQA) the same way they do in HyperIQA?

Evaluation of noise image quality

Hello, I found that when I use your pre-trained model to evaluate noise-free images and noisy images (salt-and-pepper noise, intensity 0.1), the noisy images score higher than the corresponding noise-free images. How can I fix this?

Code question

We found this line in your models.py: res_out = self.res(img). The res and img are not defined. Can you explain this?

Why not using 512 x 384 images of koniq10k to avoid resizing?

Hi, thanks for your IQA work!
I noticed that you resize the input images of the KonIQ-10k dataset.
Since KonIQ-10k is provided at two resolutions, if you use 512 x 384 as the input size, why not use the 512 x 384 images provided by KonIQ-10k directly?

About the PLCC calculation

Hello,
Before calculating PLCC, I see this passage in your paper: "Before calculating PLCC, logistic regression is first applied to remove nonlinear rating caused by human visual observation, as suggested in the report from Video Quality Expert Group (VQEG)."
However, after reading the code you provided, I did not find this operation 🤔
Which step of the code implements it?

Resume training

Hi, thanks for your work!
I have a question: if you want to resume training, how do you restore the optimizer to its latest state? I see the optimizer being updated in the code, but it seems its state cannot be recovered when you resume training, no?
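
A generic PyTorch checkpoint/resume pattern (a sketch, not this repo's code; the tiny Linear model stands in for the actual networks):

import torch

model = torch.nn.Linear(4, 1)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
epoch = 5

# save model and optimizer state together
torch.save({'epoch': epoch,
            'model': model.state_dict(),
            'optimizer': optimizer.state_dict()}, 'checkpoint.pth')

# restore both before continuing training
ckpt = torch.load('checkpoint.pth')
model.load_state_dict(ckpt['model'])
optimizer.load_state_dict(ckpt['optimizer'])
start_epoch = ckpt['epoch'] + 1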

koniq-10k dataset

Hello, do you know where I can download the KonIQ-10k dataset? Thank you very much.

The reproduced result is not consistent with your paper. Is there any problem?

I used train_test_IQA.py to train on the CLIVE dataset as you suggested, but the result was not the same as what you showed in your paper. The parameters were set to the code's defaults, e.g. epochs=16. Did you report the best value over multiple training runs in your paper? Could you please clarify my doubts?

the 'dmos_realigned.mat' of LIVE database

I want to train the model on the LIVE database, but there is an error about 'dmos_realigned.mat'. The code is here:

dmos = scipy.io.loadmat(os.path.join(root, 'dmos_realigned.mat'))
labels = dmos['dmos_new'].astype(np.float32)

Is it correct to replace 'dmos_realigned.mat' with 'dmos.mat' and 'dmos_new' with 'dmos' (using the .mat file shipped with the LIVE database), or have you processed the data in some other way?

Thank you
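
For reference, loading LIVE's stock dmos.mat as the question proposes would look like this (a sketch; the path is a hypothetical placeholder and the key names follow the question above, so verify them against your local file):

import os
import numpy as np
import scipy.io

root = '/path/to/LIVE'  # hypothetical path
mat = scipy.io.loadmat(os.path.join(root, 'dmos.mat'))
labels = mat['dmos'].astype(np.float32)
print(labels.shape)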

Please release the code

It is hard to reproduce the results mentioned in the paper; we request that you release the code.

stats.pearsonr Error?

Hello! Running the code as set up in the paper, I got the following error. It seems the predicted values are NaN. Why?
File "/home/wxq/project/NR_IQA/hyperIQA-master/hyperIQA-master/HyerIQASolver.py", line 112, in test
test_plcc, _ = stats.pearsonr(pred_scores, gt_scores)
File "/home/wxq/.local/lib/python3.8/site-packages/scipy/stats/stats.py", line 3530, in pearsonr
normxm = linalg.norm(xm)
File "/home/wxq/.local/lib/python3.8/site-packages/scipy/linalg/misc.py", line 142, in norm
a = np.asarray_chkfinite(a)
File "/home/wxq/conda/lib/python3.8/site-packages/numpy/lib/function_base.py", line 485, in asarray_chkfinite
raise ValueError(
ValueError: array must not contain infs or NaNs
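
A quick debugging sketch (not repo code) for locating non-finite predictions before the pearsonr call; pred_scores stands in for the list built in test():

import numpy as np

pred_scores = [0.5, float('nan'), 0.7]  # hypothetical values
pred = np.asarray(pred_scores, dtype=np.float32)
bad = np.where(~np.isfinite(pred))[0]
print(f"{bad.size} non-finite predictions at indices {bad}")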

how to test an image and get the score?

I want to know how to test a single image and obtain its score. There is a test function in your HyerIQASolver.py, but I don't know how to do the preprocessing, because data_loader.py needs inputs such as the database name, while I just want to test one image.
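
A minimal single-image scoring sketch, assuming the HyperNet/TargetNet usage quoted elsewhere in this thread; the checkpoint path, image path, and patch count are hypothetical placeholders:

import torch
import torchvision.transforms as T
from PIL import Image
import models

transform = T.Compose([
    T.RandomCrop(224),
    T.ToTensor(),
    T.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
])

model_hyper = models.HyperNet(16, 112, 224, 112, 56, 28, 14, 7).cuda()
model_hyper.load_state_dict(torch.load('pretrained_hyper.pth'))  # hypothetical file
model_hyper.train(False)

img = Image.open('test.jpg').convert('RGB')
scores = []
for _ in range(25):  # average over random patches
    patch = transform(img).unsqueeze(0).cuda()
    with torch.no_grad():
        paras = model_hyper(patch)
        model_target = models.TargetNet(paras).cuda()
        model_target.train(False)
        pred = model_target(paras['target_in_vec'])
    scores.append(pred.item())
print(sum(scores) / len(scores))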

Value changes every time

Hi, I ran demo.py with the same image and the same parameters, but the results changed every run. Why does this happen?
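
If the variation comes from random patch sampling (an assumption; see the patch-sampling discussion above), fixing the seeds should make runs repeatable:

import random
import numpy as np
import torch

random.seed(0)
np.random.seed(0)
torch.manual_seed(0)
torch.cuda.manual_seed_all(0)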

train_patch_num

Why do we need to set train_patch_num? Doesn't that mean the same images are trained repeatedly without change?

Why setting 'params.requires_grad=False' for target_network?

# Generate weights for target network
paras = self.model_hyper(img)  # 'paras' contains the network weights conveyed to target network

# Building target network
model_target = models.TargetNet(paras).cuda()
for param in model_target.parameters():
    param.requires_grad = False

Hi, thanks for releasing your code. I see that you set requires_grad=False for the target network, which is a little confusing. Since the parameters of the target network are generated by the HyperNet, wouldn't setting requires_grad to False block the target network's loss from backpropagating to the HyperNet's parameters, so that they never update?
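
A minimal sketch of why gradients can still reach the hypernetwork even so (not the repo's exact mechanism, but the same principle: when generated weights are applied functionally, autograd tracks them regardless of the target module's own requires_grad flags):

import torch
import torch.nn.functional as F

hyper_out = torch.randn(1, 5, requires_grad=True)  # stands in for HyperNet's output
x = torch.randn(2, 5)

y = F.linear(x, hyper_out)  # "target network" applies the generated weights
y.sum().backward()
print(hyper_out.grad is not None)  # True: the loss backpropagates to the hypernetwork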

about saving model

Thanks for your code. I want to train the model, save it, and then test. However, I'm not familiar with PyTorch, and I don't understand how to save the model in your code, or where it is stored after saving. Please help me. Thank you very much.

A question about koniq-10k and bid dataset

Thanks for releasing your code! I have a question about the KonIQ-10k and BID datasets. In data_loader.py, why do you resize images to (512, 384) for KonIQ-10k and to (512, 512) for BID?

Why do Test_SROCC and Test_PLCC decrease?

Epoch  Train_Loss  Train_SROCC  Test_SROCC  Test_PLCC
1      5.168       0.8779       0.9013      0.9201
2      3.330       0.9497       0.9023      0.9192
3      2.853       0.9629       0.9011      0.9181
4      2.557       0.9700       0.9004      0.9179
5      2.349       0.9746       0.8970      0.9123
6      2.185       0.9779       0.9007      0.9161
7      2.053       0.9805       0.8949      0.9121
8      1.934       0.9827       0.8949      0.9094
9      1.833       0.9844       0.8967      0.9135
10     1.764       0.9856       0.8946      0.9115

When I trained on the KonIQ-10k dataset, Test_SROCC and Test_PLCC decreased as the epochs increased, while Train_SROCC increased. Is it overfitting?

About resize in Koniq-10k dataset transforms

torchvision.transforms.Resize((512, 384)) in PyTorch resizes the output to height 512 and width 384, while the original image has height 768 and width 1024. The aspect ratio is inverted; won't this affect the results?
