freedom's People

Contributors

vvictoryuki

freedom's Issues

Plan on releasing code

Hi,

Thank you very much for the paper, it is super interesting.
Do you know when you plan to release the code?

Best,

weird output

Hi,

Thanks for the interesting work! I tried to re-implement your method using faceID as guidance. With a larger guidance weight, I got this weird output, but with a small weight, the result did not match the given condition.

Have you observed anything similar? Any suggestions would be appreciated. Thank you!

(screenshot of the output attached)
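Not the authors, but this looks like the usual guidance-scale trade-off. A toy sketch (names are hypothetical, not FreeDoM's actual code) of why both extremes fail:

```python
# Toy sketch of energy-guided sampling (hypothetical names, not the
# authors' code). Each step follows the denoiser, then subtracts the
# scaled energy gradient pulling x toward the face-ID condition.
def guided_update(x, denoiser_update, energy_grad, scale):
    """One guided step: too large a scale pushes x off the data
    manifold (weird artifacts); too small leaves the ID unmatched."""
    return x + denoiser_update - scale * energy_grad
```

With `scale` chosen per-step (e.g. decayed over time) the two failure modes can sometimes be balanced better than with one global weight.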

Error(s) in loading state_dict for ControlLDM

Thank you for your impressive work.
When I use the "FreeDoM-CN-style/faceID" example and run python pose2image.py --seed 1234 --timesteps 100 --prompt "young man, realitic photo" --pose_ref "./test_imgs/pose4.jpg" --id_ref "./test_imgs/id3.png", the following errors occurred:

Loaded model config from [./models/cldm_v15.yaml]
Loaded state_dict from [./models/control_sd15_openpose.pth]
Traceback (most recent call last):
  File "/home/user2/models/FreeDoM-main/CN/pose2image.py", line 31, in <module>
    model.load_state_dict(load_state_dict('./models/control_sd15_openpose.pth', location='cuda'))
  File "/home/user2/anaconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2152, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for ControlLDM:
        Unexpected key(s) in state_dict: "cond_stage_model.transformer.text_model.embeddings.position_ids". 

It seems to fail to load the SD pre-trained model parameters, so I tried moving the model to CUDA before loading the parameters:

model = create_model('./models/cldm_v15.yaml').cpu()
model = model.cuda()
model.load_state_dict(load_state_dict('./models/control_sd15_openpose.pth', location='cuda'))

but then another error occurred:

Loaded model config from [./models/cldm_v15.yaml]
Loaded state_dict from [./models/control_sd15_openpose.pth]
Traceback (most recent call last):
  File "/home/user2/models/FreeDoM-main/CN/pose2image.py", line 35, in <module>
    ddim_sampler = DDIMSampler(model, add_condition_mode="face_id", ref_path=args.pose_ref, add_ref_path=args.id_ref, no_freedom=args.no_freedom)
  File "/home/user2/models/FreeDoM-main/CN/cldm/ddim_hacked.py", line 128, in __init__
    self.idloss = IDLoss(ref_path=add_ref_path).cuda()
  File "/home/user2/models/FreeDoM-main/CN/cldm/arcface/model.py", line 12, in __init__
    self.facenet.load_state_dict(torch.load("cldm/arcface/model_ir_se50.pth"))
  File "/home/user2/anaconda3/lib/python3.10/site-packages/torch/serialization.py", line 1028, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/home/user2/anaconda3/lib/python3.10/site-packages/torch/serialization.py", line 1231, in _legacy_load
    return legacy_load(f)
  File "/home/user2/anaconda3/lib/python3.10/site-packages/torch/serialization.py", line 1117, in legacy_load
    tar.extract('storages', path=tmpdir)
  File "/home/user2/anaconda3/lib/python3.10/tarfile.py", line 2081, in extract
    tarinfo = self.getmember(member)
  File "/home/user2/anaconda3/lib/python3.10/tarfile.py", line 1803, in getmember
    raise KeyError("filename %r not found" % name)
KeyError: "filename 'storages' not found"

I don't know how to solve this. Am I using the wrong pre-trained model? I downloaded it from lllyasviel/ControlNet at main (huggingface.co).
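Not affiliated with the authors, but two hedged notes in case they help: the `position_ids` mismatch typically comes from a newer `transformers` version that no longer registers that buffer, and the later `KeyError: 'storages'` often indicates a truncated download (e.g. a git-lfs pointer file instead of the real `model_ir_se50.pth`). A sketch of one common workaround for the first error (the helper name is mine):

```python
# Sketch: drop checkpoint keys the current model no longer expects
# before calling model.load_state_dict (alternatively, pass
# strict=False to load_state_dict). This mirrors a common fix for the
# transformers position_ids mismatch.
def strip_unexpected(state_dict, suffixes=("position_ids",)):
    return {k: v for k, v in state_dict.items()
            if not k.endswith(suffixes)}
```

For the second error, re-downloading `model_ir_se50.pth` and checking its file size against the source is usually the first thing to try.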

A question on Eq.(1) in the paper

Thanks for sharing the great work! I'm not sure where this formula comes from; I've checked the cited paper "Denoising diffusion probabilistic models", which presents a different formulation. It's also different from the Langevin dynamics sampling formula. Could you please clarify?

some questions about the coefficient "rho"

Hello, I am interested in the code you posted; thank you for sharing. What puzzles me is that there is not much discussion of the scale factor in the paper.
In SD_style, rho appears to be a learning rate associated with both "grad" and the classifier-guidance correction, as shown below:

(screenshot attached)

However, in faceID, rho is equal to at.sqrt(), as follows:

(screenshot attached)

So how exactly should rho be set, and is there some mathematical theory to support it? Thank you!
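For comparison, here is the faceID choice written out as toy code (my own reading of the snippets above; the function name and interpretation are assumptions, not the authors' statement):

```python
import math

# Assumed sketch: in the faceID branch, the guidance step size rho is
# tied to the noise schedule via sqrt(alpha_bar_t), so guidance is
# strongest early (alpha_bar_t near 1) and fades as t -> 0, whereas the
# SD_style branch appears to treat rho as a tunable learning rate.
def rho_face_id(alpha_bar_t):
    return math.sqrt(alpha_bar_t)
```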

Question on Eq. 4

Hi, thanks for sharing your results!

I'm afraid I did not really get how you derived Eq. 4: if I'm not mistaken, assuming p(c ∣ xₜ) ∝ exp(−λ ℰ(c, xₜ)),
∇ log p(c ∣ xₜ) = − λ ∇ ℰ(c, xₜ) + λ 𝔼[ ∇ ℰ(c, xₜ) ],
where the gradient ∇ is w.r.t. xₜ and the expectation 𝔼 is over p(c ∣ xₜ).

Why have you decided to ignore the second term? Thank you in advance!
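For context, one way the decomposition above can arise (a sketch, assuming the Boltzmann form of the conditional and that the gradient is of the log-probability):

```latex
p(c \mid x_t) = \frac{\exp\!\big(-\lambda\,\mathcal{E}(c, x_t)\big)}{Z(x_t)},
\qquad
Z(x_t) = \int \exp\!\big(-\lambda\,\mathcal{E}(c', x_t)\big)\,\mathrm{d}c',
\\[4pt]
\nabla_{x_t} \log p(c \mid x_t)
  = -\lambda\,\nabla_{x_t}\mathcal{E}(c, x_t) - \nabla_{x_t}\log Z(x_t)
  = -\lambda\,\nabla_{x_t}\mathcal{E}(c, x_t)
    + \lambda\,\mathbb{E}_{p(c' \mid x_t)}\!\big[\nabla_{x_t}\mathcal{E}(c', x_t)\big].
```

The second term comes entirely from the partition function, which is presumably why approximations that drop it need justifying.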

error in loading ckpt file

Thank you so much for your work; it's very inspiring!
I ran into a problem when running run.sh; the error is [Errno 2] No such file or directory: './models/control_sd15_scribble.pth', and the associated code is as follows:

/data/0shared/yangling/zheming/FreeDoM/CN/scribble2image.py:29 in <module>

   28 model = create_model('./models/cldm_v15.yaml').cpu()
❱  29 model.load_state_dict(load_state_dict('./models/control_sd15_scribble.pth', location='cu
   30 model = model.cuda()

I have no idea where to download this; I hope you can help me with it. Thank you so much!
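In case it helps: the checkpoint appears to live in the `models/` folder of the `lllyasviel/ControlNet` Hugging Face repo (an assumption based on where the other ControlNet weights in this thread are hosted, so verify the path), and something like the following should fetch it:

```shell
# Assumed location of the scribble checkpoint on the Hugging Face Hub
# (lllyasviel/ControlNet); check the repo's file listing before relying
# on this exact path.
REPO="lllyasviel/ControlNet"
FILE="models/control_sd15_scribble.pth"
URL="https://huggingface.co/${REPO}/resolve/main/${FILE}"
echo "$URL"
# wget -O ./models/control_sd15_scribble.pth "$URL"
```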

what does eta = 0.5 mean?

Thanks for sharing the great work! I was reading the project code, and one thing that confused me is what eta = 0.5 (at line 302 of denoising.py) means. I compared this code with Algorithm 1 and found it differs.
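Not the author, but eta here is most likely the standard stochasticity parameter of the DDIM sampler: eta = 0 is deterministic DDIM, eta = 1 recovers DDPM-like ancestral sampling, and 0.5 injects partial noise at each step. A sketch of the usual sigma formula (assuming that is what the code implements):

```python
import math

# Standard DDIM noise scale: sigma_t controls how much fresh noise is
# injected at each reverse step; eta linearly interpolates it between
# 0 (deterministic DDIM) and the DDPM value (eta = 1).
def ddim_sigma(alpha_bar_t, alpha_bar_prev, eta):
    return (eta
            * math.sqrt((1 - alpha_bar_prev) / (1 - alpha_bar_t))
            * math.sqrt(1 - alpha_bar_t / alpha_bar_prev))
```

That would explain a deviation from Algorithm 1 if the algorithm is stated for the deterministic (eta = 0) case.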

colab demo

Could someone make a Colab notebook? I am eager to try this.

error when running styled image conditioning demo

Thank you so much for your code! The work is quite inspiring.
I ran into a problem when running the styled-image conditioning demo (SD_style/run.sh) and got the following error:

[Errno 2] No such file or directory: '/workspace/stable-diffusion/intermediates/1_1.png'

I have no idea how to work it out, hope you can help me with it! Thanks a lot!
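One guess (not a confirmed fix): the script seems to write intermediate images to a hard-coded directory that does not exist on other machines, so creating it, or patching the path in the SD_style code, may be enough:

```shell
# The absolute path comes from the error message; adjust the variable
# (or edit the script) to a directory that exists on your machine.
INTERMEDIATES_DIR="./intermediates"   # e.g. /workspace/stable-diffusion/intermediates
mkdir -p "$INTERMEDIATES_DIR"
```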

Error when running the txt2img file under SD_style: the model fails to load

Traceback (most recent call last):
    safety_feature_extractor = AutoFeatureExtractor.from_pretrained(safety_model_id)
  File "D:\Python\Anaconda\envs\Sd-style\lib\site-packages\transformers\models\auto\feature_extraction_auto.py", line 270, in from_pretrained
    config_dict, _ = FeatureExtractionMixin.get_feature_extractor_dict(pretrained_model_name_or_path, **kwargs)
  File "D:\Python\Anaconda\envs\Sd-style\lib\site-packages\transformers\feature_extraction_utils.py", line 443, in get_feature_extractor_dict
    raise EnvironmentError(
OSError: Can't load feature extractor for 'CompVis/stable-diffusion-safety-checker'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'CompVis/stable-diffusion-safety-checker' is the correct path to a directory containing a preprocessor_config.json file

hyper-parameters used in algorithm 2

Hi, thanks for the great work!

Just curious, what are the hyper-parameters used in Algorithm 2 (image below)? For example, how do you set the learning rate and repeat count for each time step? I went through the paper but didn't find any details about this. Could you please share them? Thanks!

(screenshot of Algorithm 2 attached)
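While waiting for the authors: here is my reading of that algorithm as toy code, with placeholders exactly where the question lies (the learning rate and repeat count are assumptions, not the paper's settings, and the denoiser/guidance functions are stand-in stubs so the sketch runs):

```python
# Toy outline of guided sampling with per-step guidance repeats.
# lr and repeats are the unknown hyper-parameters being asked about.
def denoise_step(x, t):
    return 0.9 * x            # stub: one reverse-diffusion update

def energy_grad(x, t):
    return x - 1.0            # stub: gradient of the guidance energy

def sample(x, timesteps, repeats=2, lr=0.05):
    for t in timesteps:
        for _ in range(repeats):
            x = denoise_step(x, t)
            x = x - lr * energy_grad(x, t)   # guidance correction
    return x
```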

Questions about qualitative Results

Hi, really awesome work! I read your paper and noticed that in Table 1 you only compare your method with TediGAN. But as you mention in your related work, there are two other, stronger training-required methods: ControlNet and T2I-Adapter. How do the FID and CLIP scores compare against those two works? For T2I-Adapter on the COCO dataset with text + sketch, the FID is 16.78; I also measured the FID of ControlNet on COCO with only 1k images (text + sketch) and got 6.09, yet in Table 1 you report 70.97. So I'm a little confused, since your generated results look very good; judging from the generated figures, I suspect your dataset is not COCO.

We are currently working on training-efficiency and inference algorithms for ControlNet, and if such a training-required process could be replaced, there would be no motivation to keep working on training-efficiency algorithms for ControlNet. Thanks very much!
