Comments (7)

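ironheads commented on August 29, 2024

Hello, could you explain how the average pooling that produces f_b and f_f is implemented? Is it a masked average over the Encoder features? I also tried the model from this paper, and L_rgb eventually becomes nan; I suspect exploding gradients.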

from cdtnet-high-resolution-image-harmonization.

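taoxinhao13 commented on August 29, 2024

@zhanghongyong123456 Hello, have you resolved the issue yet? From what you described, the loss function and the blending layer formula are indeed fine, so perhaps the problem lies in your implementation of the loss function? You could also inspect the intermediate outputs (such as the result after the Pixel-2-Pixel Transformation and the result after the RGB-2-RGB Transformation) to see whether each of them shows a similar problem.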

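taoxinhao13 commented on August 29, 2024

@ironheads First, the features from the last encoder layer are mapped once with a 1×1 convolution to obtain f, and the mask is downsized to the same spatial size as f to obtain m. Average pooling f×m and f×(1-m) then yields f_b and f_f. You could check whether the loss suddenly becomes nan or keeps growing until it collapses. If it suddenly becomes nan, some other part of your code may contain an operation such as division by zero.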
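The steps described in this reply can be sketched roughly as follows. This is a NumPy illustration only, not the authors' code: the function names, the bias-free 1×1 convolution, and the use of block averaging to downsize the mask are all assumptions, and the thread does not spell out which pooled vector is f_b and which is f_f, so neutral names are used.

```python
import numpy as np

def downsize_mask(mask, out_hw):
    """Block-average a (B, 1, H, W) mask down to (B, 1, h, w).

    Assumes H and W are integer multiples of h and w.
    """
    b, c, H, W = mask.shape
    h, w = out_hw
    return mask.reshape(b, c, h, H // h, w, W // w).mean(axis=(3, 5))

def pooled_features(feat, mask, weight):
    """feat: (B, C_in, h, w), mask: (B, 1, H, W), weight: (C_out, C_in)."""
    # A 1x1 convolution is a per-pixel linear map over channels (bias omitted).
    f = np.einsum("oc,bchw->bohw", weight, feat)
    m = downsize_mask(mask, f.shape[2:])
    # Global average pooling of f*m and f*(1-m), each down to one vector.
    pooled_m = (f * m).mean(axis=(2, 3))
    pooled_inv = (f * (1.0 - m)).mean(axis=(2, 3))
    return pooled_m, pooled_inv

# Hypothetical shapes: 2-channel 4x4 features, 8x8 mask covering the left half.
feat = np.ones((1, 2, 4, 4))
mask = np.zeros((1, 1, 8, 8))
mask[..., :, :4] = 1.0
pooled_m, pooled_inv = pooled_features(feat, mask, np.eye(2))
```

In PyTorch the same steps would typically map to a 1×1 `Conv2d`, an average pooling or interpolation of the mask, and a mean over the spatial dimensions.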

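ironheads commented on August 29, 2024

Yes, I have solved this problem. The earlier error arose because the public GitHub implementation of the Adaptive 3D LUT cited by the paper appears to have a problem: the Python side passes the input as [batch_size, 3 (rgb), width, height], but the CUDA C++ implementation expects [3 (rgb), batch_size, width, height]. As a result, my calls triggered out-of-bounds memory accesses.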
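A layout mismatch like this can be illustrated with a small NumPy sketch. The helper name `to_channel_major` is hypothetical, and whether the fix belongs on the Python side or in the kernel depends on the implementation in question; the sketch only shows that the two layouts order the same values differently in memory.

```python
import numpy as np

def to_channel_major(batch_first):
    """Reorder a (B, 3, W, H) array into a contiguous (3, B, W, H) array."""
    # transpose only changes strides; ascontiguousarray rewrites the memory,
    # which matters when a kernel indexes the raw buffer directly.
    return np.ascontiguousarray(batch_first.transpose(1, 0, 2, 3))

# Hypothetical batch of 2 RGB images of size 4x5.
x = np.arange(2 * 3 * 4 * 5, dtype=np.float32).reshape(2, 3, 4, 5)
y = to_channel_major(x)

# Same values, but the flat memory order differs between the two layouts.
same_flat_order = np.array_equal(x.ravel(), y.ravel())
```

In PyTorch the analogous operation would be `permute(1, 0, 2, 3)` followed by `contiguous()`.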

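Also, by average pooling f×m and f×(1-m), do you mean global average pooling over the full spatial size of f? My implementation is roughly (f×m).sum()/m.sum() and (f×(1-m)).sum()/(1-m).sum(), and I use AvgPool to downsize the mask.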

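taoxinhao13 commented on August 29, 2024

@ironheads I see what you mean. Your approach effectively ignores the region that should not be counted and averages only over the relevant region, which is indeed more reasonable. In our implementation, we directly apply AvgPool to f×m and f×(1-m) to reduce each to a vector.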
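The difference between the two variants discussed here can be seen in a short NumPy sketch. The function names and the `eps` guard against an empty mask are assumptions added for illustration; neither appears in the thread.

```python
import numpy as np

def plain_average(f, m):
    """Global average of f*m over all spatial positions."""
    return (f * m).mean(axis=(2, 3))

def masked_average(f, m, eps=1e-6):
    """Average of f over the masked region only: sum(f*m) / sum(m)."""
    # eps is an assumed guard so an all-zero mask does not divide by zero.
    return (f * m).sum(axis=(2, 3)) / (m.sum(axis=(2, 3)) + eps)

# Constant feature map of value 2.0; the mask covers one quarter of the area.
f = np.full((1, 2, 4, 4), 2.0)
m = np.zeros((1, 1, 4, 4))
m[..., :2, :2] = 1.0

plain = plain_average(f, m)    # scaled down by the mask's area fraction
masked = masked_average(f, m)  # close to the feature value inside the mask
```

With a constant feature map, the plain average shrinks in proportion to the masked area, while the masked average stays near the feature value regardless of mask size.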

ironheads commented on August 29, 2024

@taoxinhao13 Was your backbone (ISSAM) trained from scratch, or did you use pretrained weights?

taoxinhao13 commented on August 29, 2024

@ironheads We used pretrained weights, specifically the model whose ISSAM metrics we report in the paper. Link: https://github.com/saic-vul/image_harmonization/releases/download/v1.0/issam256.pth
