
bcmi / cdtnet-high-resolution-image-harmonization

115 stars, 12 forks, 9 open issues, 30.84 MB

[CVPR 2022] We unify pixel-to-pixel transformation and color-to-color transformation in a coherent framework for high-resolution image harmonization. We also release 100 high-resolution real composite images for evaluation.

Python 84.67% C++ 6.78% Cuda 7.22% Shell 0.53% C 0.60% Objective-C 0.20%
image-harmonization high-resolution-image-harmonization image-composition

cdtnet-high-resolution-image-harmonization's People

Contributors

mia-cong, taoxinhao13, ustcnewly


cdtnet-high-resolution-image-harmonization's Issues

About the training results?

Following iSSAM, I checked the training visualizations and did not see any change. I set 120 epochs; the results below are from epoch 12:
[screenshots: reconstruction visualizations at iterations 249000, 195000, 158000, 197000]
My loss is L = L_pix + L_rgb + L_ref, all L1 losses. I did not add the 3D LUT tv_cons and mn_cons regularization terms (computed as below):
loss = mse + opt.lambda_smooth * (weights_norm + tv_cons) + opt.lambda_monotonicity * mn_cons
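
For reference, a rough sketch of how the 3D LUT smoothness (tv_cons) and monotonicity (mn_cons) terms are typically computed on the LUT lattice; this is written in the spirit of the image-adaptive 3D LUT approach, and the function name and the (3, D, D, D) layout are assumptions, not this repository's exact implementation:

    import torch
    import torch.nn.functional as F

    def lut_regularizers(lut):
        # lut: (3, D, D, D) lattice mapping input RGB coordinates to output colors.
        # Differences between neighboring lattice entries along each input color axis.
        dif_r = lut[:, :, :, 1:] - lut[:, :, :, :-1]
        dif_g = lut[:, :, 1:, :] - lut[:, :, :-1, :]
        dif_b = lut[:, 1:, :, :] - lut[:, :-1, :, :]
        # Smoothness (total variation): adjacent entries should change gradually.
        tv_cons = (dif_r ** 2).mean() + (dif_g ** 2).mean() + (dif_b ** 2).mean()
        # Monotonicity: penalize entries whose output decreases as the input increases.
        mn_cons = F.relu(-dif_r).mean() + F.relu(-dif_g).mean() + F.relu(-dif_b).mean()
        return tv_cons, mn_cons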
The final output of the refinement module follows the design of the low-resolution output:
output = attention_map * image + (1.0 - attention_map) * self.to_rgb(conv_2)

I am not sure whether the problem is in my design or somewhere else. Below are the training curves of the three L1 losses:
[screenshot: training loss curves, 2022-03-26]

Strange results when reproducing with HAdobe5k_2048.pth

Hello, great work!

I want to use this model to run inference on custom images, so I followed

python3 evaluate_model.py CDTNet ./HAdobe5k_2048.pth --gpu 0 --datasets HAdobe5k --hr 2048 --lr 512 --save_dir ./CDTNet_2048_result

to run the test.

I only tested one image; that is, HAdobe5k_test.txt contains only:

a3630_1_5.jpg
a3630_1_1.jpg
a3630_1_2.jpg
a3630_1_3.jpg
a3630_1_4.jpg

The metric results after testing are shown below:
[screenshot: metric results]
The visual results also look quite poor.

I feel like I must have set something up wrong. Is the model I loaded incorrect?

About network details

[screenshot: architecture figure from our paper]
[screenshot: architecture figure from the iSSAM paper]

I would like to ask: does the pixel-to-pixel transformation only include the Encoder and Decoder from the iSSAM project (the second figure), or does it also include the front part (HRNet + OCR) that is simply not drawn? In your paper (the first figure) I only see the Encoder and Decoder, so is it only the encoder-decoder part, without the HRNet + OCR part?

Request test results

Dear authors, I am currently running a comparison experiment and need the test results of your model, but I noticed that the cloud-disk link for the 256×256 results on the iHarmony4 test set is broken. Could you update it? Thank you very much; I look forward to your reply!

About the memory cost described in the Introduction.

Hello there,

I noticed that the introduction of your paper states that iDIH costs more than 20 GB of memory when harmonizing a 2048×2048 image. However, in our test it seems to cost only about 2.5 GB. We conducted the test as follows:

        with torch.no_grad():
            # Dummy 2048x2048 composite image and mask as network inputs.
            input_tmp = torch.randn(1, 3, 2048, 2048).cuda()
            mask_tmp = torch.randn(1, 1, 2048, 2048).cuda()
            # Memory already allocated before the forward pass, in MB.
            start = torch.cuda.memory_allocated() / 1024 / 1024
            self.output = self.net(input_tmp, mask_tmp)
            # Peak memory allocated so far, in MB.
            end_max = torch.cuda.max_memory_allocated() / 1024 / 1024

            print("Max_memory:", (end_max - start))

Is there anything wrong with the above? I also found that if I enable gradients, the memory cost is about 20 GB. So should I run the test without the line "with torch.no_grad():"?
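
For what it's worth, a minimal sketch comparing the two settings (the helper name and the generic net(image, mask) call signature are assumptions, not this repository's evaluation code):

    import torch

    def peak_memory_mb(net, size=2048, use_no_grad=True):
        # Reset the peak counter so the measurement only covers this forward pass.
        torch.cuda.reset_peak_memory_stats()
        image = torch.randn(1, 3, size, size).cuda()
        mask = torch.randn(1, 1, size, size).cuda()
        if use_no_grad:
            with torch.no_grad():  # inference: activations are not kept for backward
                net(image, mask)
        else:
            net(image, mask)       # gradients enabled: activations are stored, so memory grows
        return torch.cuda.max_memory_allocated() / 1024 / 1024

Under this kind of measurement, the no_grad number reflects pure inference memory, while the gradient-enabled number is closer to a training step, which may explain the gap between roughly 2.5 GB and 20 GB.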

Looking forward to your reply, many thanks.

Where to enter the commands

Where should the commands given in the README be entered? I mean the python3 ... commands; should they be typed in Git Bash?

About the training settings

During training, what were the LUT-related hyperparameter settings and the number of epochs, and roughly how long did training take? Also, for the Light-weighted Refinement module, I implemented a simple two-layer convolution, but at test time the images no longer look like normal photographs: the content is unchanged, but the colors are distorted. Could you share more details about this module? Thanks.
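
For context, a rough sketch of one possible lightweight refinement head with attention-based blending, in the spirit of the output formula quoted in the first issue above; the class name, channel counts, and layer layout are assumptions, not the authors' actual module:

    import torch
    import torch.nn as nn

    class LightRefineHead(nn.Module):
        # Hypothetical two-convolution refinement head: predicts harmonized colors and an
        # attention map, then blends the predicted colors with the input composite.
        def __init__(self, in_channels=32, mid_channels=32):
            super().__init__()
            self.conv_1 = nn.Sequential(nn.Conv2d(in_channels, mid_channels, 3, padding=1), nn.ReLU(inplace=True))
            self.conv_2 = nn.Sequential(nn.Conv2d(mid_channels, mid_channels, 3, padding=1), nn.ReLU(inplace=True))
            self.to_rgb = nn.Conv2d(mid_channels, 3, 1)                                     # predicted colors
            self.to_attention = nn.Sequential(nn.Conv2d(mid_channels, 1, 1), nn.Sigmoid())  # blend weights in [0, 1]

        def forward(self, features, image):
            x = self.conv_2(self.conv_1(features))
            attention_map = self.to_attention(x)
            # Keep the original pixels where attention is high; use predicted colors elsewhere.
            return attention_map * image + (1.0 - attention_map) * self.to_rgb(x)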

There is no pre-trained model on 1024×1024 HAdobe5k

Or should I also use the model HAdobe5k_2048.pth? I want to look at data such as running time and memory cost, and I think it should be measured on the same device to make a fair comparison. Is that right?

The download link for the results seems to be broken

Hi,

This is nice work. I would like to use your results for visual comparisons, but the current download link seems to be broken. Would you mind re-sharing the download links for the test results on both HAdobe5K and the 100 real composite images?
