Giter Club home page Giter Club logo

efdm-pytorch's Introduction

[Re] Exact Feature Distribution Matching for Arbitrary Style Transfer and Domain Generalization

The official codes of our ML Reproducibility Challenge 2022 report: [Re] Exact Feature Distribution Matching for Arbitrary Style Transfer and Domain Generalization

Original Paper: Exact Feature Distribution Matching for Arbitrary Style Transfer and Domain Generalization by Yabin Zhang, Minghan Li, Ruihuang Li, Kui Jia, Lei Zhang.

Original Repository: EFDM by YBZh

README file of the original paper are given in OriginalPaper/README.md. The exact steps to reproduce the results can be found in this README file.

The trained weights and training logs are available at this anonymous link

Summary: In this reproducibility study, we present our results and experience during replicating the paper, titled Exact Feature Distribution Matching for Arbitrary Style Transfer and Domain Generalization. In real‐world scenarios, the feature distributions are mostly much more complicated than Gaussian, so only mean and standard deviation may not be fully representative to match them. This paper introduces a novel strategy to exactly match the histograms of image features via the Sort‐Matching algorithm in a computationally feasible way. We were able to reproduce most of the results presented in the original paper both qualitatively and quantitatively.

Scope of Reproducibility: In the scope of this study, we aim to reproduce all the qualitative and quantitative results on two tasks, namely Arbitrary Style Transfer (AST) and Domain Generalization (DG). Moreover, we investigate the capability of forming better style representations by EFDM in another recent study.

Below we show a brief implementation of it in PyTorch:

import torch
def exact_feature_distribution_matching(content, style):
    assert (content.size() == style.size()) ## content and style features should share the same shape
    B, C, W, H = content.size(0), content.size(1), content.size(2), content.size(3)
    _, index_content = torch.sort(content.view(B,C,-1))  ## sort content feature
    value_style, _ = torch.sort(style.view(B,C,-1))      ## sort style feature
    inverse_index = index_content.argsort(-1)
    transferred_content = content.view(B,C,-1) + value_style.gather(-1, inverse_index) - content.view(B,C,-1).detach()
    return transferred_content.view(B, C, W, H)

Reproduction Results:

  • Visual comparison of the reported and our produced results on standard (the first two rows) and photo‐realistic (the last two rows) style transfer.

  • Illustration of style interpolation between a single content image and four style images.

  • Average running time of prior methods and proposed strategy used in AST on a 512px image. Note that the compared methods run on a single Tesla V100, while our measurement has been done on a single RTX 2080Ti.
Method Gatys et al. CMD HM AdaIN EFDM EFDM (R)
Time (s) 25.61 19.84 0.33 0.0038 0.0039 0.0043
  • Illustration of the content‐style trade‐off with different λ values.

  • Comparison of reproduced results of different feature distribution matching strategies applied in the original paper.

  • Ablation on trade‐off between content and style loss terms.

  • DG results of category classification on PACS. (R) refers to our reproduced results
Method Art Cartoon Photo Sketch Average
Leave‐one‐domain‐out generalization
R‐18 w/ MixStyle 83.1±0.8 78.6±0.9 95.9±0.4 74.2±2.7 82.9
R‐18 w/ EFDMix 83.9±0.4 79.4±0.7 96.8±0.4 75.0±0.7 83.9
R‐18 w/ EFDMix (R) 80.6±1.5 78.1±0.6 94.1±0.9 72.3±1.2 81.3
R‐18 w/ EFDMix (R) α = 0.5 80.7±1.8 78.2±1.0 94.2±1.3 71.4±1.9 81.3
R‐18 w/ EFDMix (R) α = 1.0 80.9±1.4 78.1±0.9 94.14±1.3 71.4±2.1 81.1
R‐50 w/ MixStyle 90.3±0.3 82.3±0.7 97.7±0.4 74.7±0.7 86.2
R‐50 w/ EFDMix 90.6±0.3 82.5±0.7 98.1±0.2 76.4±1.2 86.9
R‐50 w/ EFDMix (R) 87.4±1.6 81.8±1.6 94.3±2.2 73.7±1.7 84.3
R‐50 w/ EFDMix (R) α = 0.5 87.6±1.7 81.1±1.3 94.5±1.7 73.9±1.5 84.3
R‐50 w/ EFDMix (R) α = 1.0 87.4±2.1 81.6±1.4 94.7±1.6 74.3±1.6 84.5
Single source generalization
R‐18 w/ MixStyle 61.9±2.2 71.5±0.8 41.2±1.8 32.2±4.1 51.7
R‐18 w/ EFDMix 63.2±2.3 73.9±0.7 42.5±1.8 38.1±3.7 54.4
R‐18 w/ EFDMix (R) 63.5±3.4 72.9±1.2 41.9±1.4 36.3±3.1 53.7
R‐18 w/ EFDMix (R) α = 0.5 63.8±2.4 73.2±0.9 42.5±1.6 37.1±3.0 54.2
R‐18 w/ EFDMix (R) α = 1.0 63.7±3.4 73.2±0.9 41.9±1.75 36.3±2.4 53.7
R‐50 w/ MixStyle 73.2±1.1 74.8±1.1 46.0±2.0 40.6±2.0 58.6
R‐50 w/ EFDMix 75.3±0.9 77.4±0.8 48.0±0.9 44.2±2.4 61.2
R‐50 w/ EFDMix (R) 73.0±2.2 77.2±0.9 48.3±1.2 47.7±2.7 61.6
R‐50 w/ EFDMix (R) α = 0.5 73.8±1.6 77.6±1.3 47.9±1.2 46.7±3.2 61.5
R‐50 w/ EFDMix (R) α = 1.0 73.7±1.2 77.8±0.4 47.9±0.7 46.0±4.2 61.4
  • DG results on person re‐ID task. (R) refers to our reproduced results.
Methods MarKet1501 → GRID GRID → MarKet1501
mAP R1 R5 R10 mAP R1 R5 R10
OSNet + MixStyle 33.8±0.9 24.89±1.6 43.7±2.0 53.1±1.6 4.9±0.2 15.4±1.2 28.4±1.3 35.7±0.9
OSNet + EFDMix 35.5±1.8 26.7±3.3 44.4±0.8 53.6±2.0 6.4±0.2 19.9±0.6 34.4±1.0 42.2±0.8
OSNet + EFDMix (R) 35.0±2.6 25.1±2.3 45.6±4.1 52.0±2.9 6.2±0.7 18.8±1.8 33.6±2.7 41.4±2.9
  • White‐balance correction results of the recent methods and its variant with EFDM on mixed‐illuminant evaluation set.
Method MSE↓ ∆E 2000↓
Mean Q1 Q2 Q3 Mean Q1 Q2 Q3
StyleWB 822.77 572.52 840.67 1025.26 11.65 10.63 11.86 13.02
SR + AdaIN 818.99 527.34 875.56 1049.03 11.01 8.64 11.41 12.31
SR + EFDM 761.05 513.96 818.39 969.33 10.16 8.75 9.81 11.69

In our report, We have reproduced the experiments done on two selected tasks, and compared their results with the reported results. Although our experimental results are not identical to the reported ones, we can validate the claims made by the original study according to these results. The source code for reproducing all experiments can be found in EFDM/ArbitraryStyleTransfer, EFDM/DomainGeneralization/imcls, and EFDM/DomainGeneralization/reid, respectively. Our reproduced results can be found in EFDM/ArbitraryStyleTransfer/reproduced_results and EFDM/DomainGeneralization/reproduced_results.

What was easy: The given code in the original repository was easy to follow, and it was well‐written in general. The authors designed the documentation and the source code in a way that anyone who has fundamental knowledge of Python could run the experiments, or even generate their own stylized image from any content.

What was difficult: We would like to add the reproduced outputs by Histogram Matching (HM) along with the others, however the training of HM was based on CPU and the estimated time to complete a single training was around 15 days in our setup. Consequently, we could not include the reproduced outputs by HM to this report. Moreover, it could not be possible to add t‐SNE visualizations to this report, as in the original paper, due to the lack of clarity in the documentation of its script.

To cite the original paper and the reproduction paper in your publications, please use the following bibtex entries:

@inproceedings{zhang2021exact,
  title={Exact Feature Distribution Matching for Arbitrary Style Transfer and Domain Generalization},
  author={Zhang, Yabin and Li, Minghan and Li, Ruihuang and Jia, Kui and Zhang, Lei},
  booktitle={CVPR},
  year={2022}
}

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.