Comments (12)

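SSL92 commented on August 20, 2024

If you really want to test images smaller than 224×224, resizing is the better option, but the result may not be accurate; after all, even a human would find it hard to judge the quality of images that small.

We unify the network input to a single scale for two reasons. First, the receptive field of the ResNet backbone is fixed, so otherwise it would be hard to extract semantic features from images larger than that receptive field. Second, it lets us assemble the training images into the same batch.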
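As a rough illustration of the resizing suggestion above, the sketch below computes the target dimensions needed to bring a small image up to the 224-pixel minimum while keeping its aspect ratio. The function name `min_side_resize_dims` is hypothetical and is not part of the hyperiqa code; the actual resampling would still be done by whatever image library is in use.

```python
def min_side_resize_dims(width, height, min_side=224):
    """Return (new_width, new_height) whose shorter side is at least min_side.

    Hypothetical helper, not taken from the hyperiqa repository.
    The aspect ratio is preserved and large-enough images are left unchanged.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    shorter = min(width, height)
    if shorter >= min_side:
        return width, height
    # Integer ceiling division avoids floating-point rounding surprises.
    new_width = -(-width * min_side // shorter)
    new_height = -(-height * min_side // shorter)
    return new_width, new_height


print(min_side_resize_dims(160, 120))  # (299, 224)
print(min_side_resize_dims(640, 480))  # (640, 480)
```

The upscaling interpolates pixels that were never captured, which is one reason the predicted score for a resized image may not be reliable.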

from hyperiqa.

tengerye commented on August 20, 2024

Hi, I have tried using various sizes of images for inference, but it raises an error. It seems it does not support flexible input image sizes for now. @SSL92 @lllllllllllll-llll GAP may handle the flexibility for the Hyper Network, but it can't deal with the Target Network.

Hi! Did you use demo.py to test the various image sizes?

Hi, yes, but after removing torchvision.transforms.RandomCrop(size=224) from the code.

Hello! You cannot remove torchvision.transforms.RandomCrop(size=224) from the code, because the final prediction score is obtained by averaging the scores of the 10 cropped patches. If you insist on this, please also remove the code that averages the scores of the cropped patches and run the prediction only once.

Hi @lllllllllllll-llll, thank you for your kind reply. I get your point, but I don't think it is a good approach.

  1. The network is essentially learning the patches, not the whole image;
  2. The prediction score is nondeterministic for an identical image;
  3. For a very large image, the 10 patches may still cover only a small part of the area.
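One possible way to address points 2 and 3, sketched here as an assumption rather than anything the repository provides, is to replace the random crops with a fixed grid of 224x224 patches that tiles the whole image. The helper name `grid_patch_boxes` is hypothetical; it only produces crop coordinates, and each patch would still be scored by the model and averaged.

```python
def grid_patch_boxes(width, height, patch=224):
    """Return (left, top, right, bottom) boxes that cover the whole image.

    Hypothetical helper, not taken from the hyperiqa repository.
    Patches sit on a regular grid; the last row and column are shifted back
    so they end exactly at the image border instead of running past it.
    """
    if width < patch or height < patch:
        raise ValueError("image must be at least patch x patch pixels")

    def starts(length):
        # Regular steps of `patch`, plus one final start aligned to the border.
        positions = list(range(0, length - patch + 1, patch))
        if positions[-1] != length - patch:
            positions.append(length - patch)
        return positions

    return [(x, y, x + patch, y + patch)
            for y in starts(height) for x in starts(width)]


boxes = grid_patch_boxes(500, 224)
print(boxes)  # [(0, 0, 224, 224), (224, 0, 448, 224), (276, 0, 500, 224)]
```

The same image always yields the same boxes, so the averaged score is repeatable, at the cost of more forward passes for large images. Point 1 is untouched by this: the model still sees patches, not the whole image.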

SSL92 commented on August 20, 2024

Hi, I have tried using various sizes of images for inference, but it raises an error. It seems it does not support flexible input image sizes for now. @SSL92 @lllllllllllll-llll GAP may handle the flexibility for the Hyper Network, but it can't deal with the Target Network.

Hi! Did you use demo.py to test the various image sizes?

Hi, yes, but after removing torchvision.transforms.RandomCrop(size=224) from the code.

Hello! You cannot remove torchvision.transforms.RandomCrop(size=224) from the code, because the final prediction score is obtained by averaging the scores of the 10 cropped patches. If you insist on this, please also remove the code that averages the scores of the cropped patches and run the prediction only once.

Hi @lllllllllllll-llll, thank you for your kind reply. I get your point, but I don't think it is a good approach.

  1. The network is essentially learning the patches, not the whole image;
  2. The prediction score is nondeterministic for an identical image;
  3. For a very large image, the 10 patches may still cover only a small part of the area.

Hi, it would be really meaningful to make the network's input size flexible. Before that, however, two issues probably have to be considered:

  1. Whether the model's fixed-size receptive field can capture image content at various input sizes;

  2. How to assemble images of different sizes into a batch during training.

So in our model we simply fix the input size to 224x224 for convenience, but we welcome any modifications that make it better : )
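For the second issue, one common workaround (an assumption here, not something the authors describe) is to bucket images by size so that every batch contains only one shape. The sketch below groups image indices that way; `bucket_by_size` is a hypothetical name, and the resulting index lists would still need to be fed to a data loader.

```python
from collections import defaultdict


def bucket_by_size(image_sizes, batch_size):
    """Group image indices so each batch holds images of one (width, height).

    Hypothetical helper, not taken from the hyperiqa repository.
    Returns a list of (size, indices) pairs in first-seen size order.
    """
    buckets = defaultdict(list)
    for index, size in enumerate(image_sizes):
        buckets[tuple(size)].append(index)

    batches = []
    for size, indices in buckets.items():
        # Split each bucket into chunks of at most batch_size indices.
        for start in range(0, len(indices), batch_size):
            batches.append((size, indices[start:start + batch_size]))
    return batches


sizes = [(224, 224), (512, 384), (224, 224), (224, 224), (512, 384)]
for size, indices in bucket_by_size(sizes, batch_size=2):
    print(size, indices)
```

Bucketing only solves the batching problem. Whether a fixed receptive field copes with the different scales, the first issue above, still has to be checked experimentally.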

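lllllllllllll-llll commented on August 20, 2024

The network itself can already adapt to images of different sizes; the reason for cropping to 224x224 is data augmentation. If it were changed to be fully convolutional with no fully connected layers, how would the score regression be done? Besides, the GAP operation is there precisely to adapt to different sizes. That's my opinion, haha.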
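To make the GAP argument concrete, here is a minimal plain-Python sketch of global average pooling. It shows that the pooled vector has one value per channel regardless of the spatial size of the feature map. The nested-list layout (channels, then rows, then columns) is an assumption chosen for illustration and is unrelated to the repository's tensors.

```python
def global_average_pool(feature_map):
    """Average each channel's H x W grid down to a single number.

    feature_map is a list of channels, each a list of rows of floats.
    Illustrative only; not taken from the hyperiqa repository.
    """
    pooled = []
    for channel in feature_map:
        values = [value for row in channel for value in row]
        pooled.append(sum(values) / len(values))
    return pooled


small = [[[1.0, 3.0], [5.0, 7.0]],
         [[2.0, 2.0], [2.0, 2.0]]]        # 2 channels, 2 x 2
large = [[[1.0] * 4 for _ in range(3)],
         [[0.5] * 4 for _ in range(3)]]   # 2 channels, 3 x 4

print(global_average_pool(small))  # [4.0, 2.0]
print(global_average_pool(large))  # [1.0, 0.5]
```

Both inputs give a length-2 output. This only shows that a pooled branch is size-independent; any layer that consumes the unpooled feature map with a fixed shape would still constrain the input size.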

wangbin2018 commented on August 20, 2024

If an image is smaller than 224x224, doesn't it have to be resized before the network can make a prediction? In that case the resizing itself has already affected the score.

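lllllllllllll-llll commented on August 20, 2024

If an image is smaller than 224x224, doesn't it have to be resized before the network can make a prediction? In that case the resizing itself has already affected the score.

224x224 is already very small. To validate a method, we ultimately have to run experiments on the various datasets, and none of the images in those datasets is smaller than 224x224, so there is no need to worry about this.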

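wangbin2018 commented on August 20, 2024

If you actually want to use this model in practice, some test images will be smaller than 224x224. I tried both resizing and padding with color blocks, and the scores come out different.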

lllllllllllll-llll commented on August 20, 2024

224x224 is very small in practice too, with hardly any real content. For a real application you still have to go by common image sizes.

tengerye commented on August 20, 2024

Hi, I have tried using various sizes of images for inference, but it raises an error. It seems it does not support flexible input image sizes for now. @SSL92 @lllllllllllll-llll GAP may handle the flexibility for the Hyper Network, but it can't deal with the Target Network.

lllllllllllll-llll commented on August 20, 2024

Hi, I have tried using various sizes of images for inference, but it raises an error. It seems it does not support flexible input image sizes for now. @SSL92 @lllllllllllll-llll GAP may handle the flexibility for the Hyper Network, but it can't deal with the Target Network.

Hi! Did you use demo.py to test the various image sizes?

tengerye commented on August 20, 2024

Hi, I have tried using various sizes of images for inference, but it raises an error. It seems it does not support flexible input image sizes for now. @SSL92 @lllllllllllll-llll GAP may handle the flexibility for the Hyper Network, but it can't deal with the Target Network.

Hi! Did you use demo.py to test the various image sizes?

Hi, yes, but after removing torchvision.transforms.RandomCrop(size=224) from the code.

lllllllllllll-llll commented on August 20, 2024

Hi, I have tried using various sizes of images for inference, but it raises an error. It seems it does not support flexible input image sizes for now. @SSL92 @lllllllllllll-llll GAP may handle the flexibility for the Hyper Network, but it can't deal with the Target Network.

Hi! Did you use demo.py to test the various image sizes?

Hi, yes, but after removing torchvision.transforms.RandomCrop(size=224) from the code.

Hello! You cannot remove torchvision.transforms.RandomCrop(size=224) from the code, because the final prediction score is obtained by averaging the scores of the 10 cropped patches. If you insist on this, please also remove the code that averages the scores of the cropped patches and run the prediction only once.
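The averaging described above can be sketched in plain Python as follows. This is not the code from demo.py: `random_crop_boxes` and `averaged_score` are hypothetical names, and `score_patch` stands in for whatever runs the model on one 224x224 crop. The optional `seed` is an addition for illustration, showing how the crop positions can be made repeatable.

```python
import random


def random_crop_boxes(width, height, n_patches=10, patch=224, seed=None):
    """Pick n_patches random (left, top, right, bottom) crop boxes.

    Hypothetical helper, not taken from demo.py. Passing a seed makes the
    crop positions, and therefore the averaged score, repeatable.
    """
    if width < patch or height < patch:
        raise ValueError("image must be at least patch x patch pixels")
    rng = random.Random(seed)
    boxes = []
    for _ in range(n_patches):
        left = rng.randint(0, width - patch)   # randint is inclusive
        top = rng.randint(0, height - patch)
        boxes.append((left, top, left + patch, top + patch))
    return boxes


def averaged_score(score_patch, boxes):
    """Average the per-patch scores returned by score_patch(box)."""
    scores = [score_patch(box) for box in boxes]
    return sum(scores) / len(scores)


boxes = random_crop_boxes(640, 480, seed=0)
# Placeholder scorer; a real one would crop the image and run the model.
print(averaged_score(lambda box: 50.0, boxes))  # 50.0
```

Without a fixed seed, two runs on the same image generally pick different crops, which is the source of the run-to-run variation in the final score.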
