csxmli2016 / w-plus-adapter
[CVPR 2024] When StyleGAN Meets Stable Diffusion: a W+ Adapter for Personalized Image Generation
License: Other
The statement at the end of the WPlusAttnProcessor class defines the residual connection. Are you treating the initial hidden_states (the input from the previous block) as the residual, and the hidden_states after the W+ QKV computation as the actual functional branch?
The order of the key statements is as follows:
residual = hidden_states
hidden_states = hidden_states + self.scale * wplus_hidden_states
hidden_states = hidden_states + residual
Your work is excellent, thank you sincerely for your response.
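For what it's worth, the ordering in the snippet above can be sketched as a toy function. This is a hypothetical, heavily simplified stand-in for a diffusers-style attention processor; the real WPlusAttnProcessor operates on tensors and also computes the text cross-attention itself:

```python
# Toy sketch of the residual ordering discussed above (hypothetical and
# simplified; scalars stand in for tensors so the order of operations is clear).

def wplus_attn_step(block_input, attn_out, wplus_out, scale=1.0):
    residual = block_input                             # block input saved first
    hidden_states = attn_out                           # ordinary cross-attention output
    hidden_states = hidden_states + scale * wplus_out  # add the scaled W+ QKV branch
    hidden_states = hidden_states + residual           # skip connection back to input
    return hidden_states

# 1.0 + (2.0 + 0.5 * 3.0) = 4.5
print(wplus_attn_step(1.0, 2.0, 3.0, scale=0.5))
```

So the saved `residual` is the block input, while the W+ output only enters through the scaled additive term before the skip connection is applied.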
Awesome work, and thanks for open-sourcing it! As a newcomer, I'd love to know how the attribute directions for these latent-space edits are selected. Looking forward to your reply; this is very important to me. Thank you very much!
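For context, a common way such latent edits are applied (e.g. InterFaceGAN-style, where a direction is the normal of a hyperplane separating two attribute classes in W space) can be sketched as below. This is a generic illustration with random stand-in data, not necessarily this repo's exact procedure:

```python
import numpy as np

# Hypothetical sketch of applying a pre-computed attribute direction to a
# W+ code (18 layers x 512 dims, the e4e/StyleGAN2 convention).
rng = np.random.default_rng(0)
w_plus = rng.standard_normal((18, 512))   # stand-in for an inverted W+ code
direction = rng.standard_normal(512)
direction /= np.linalg.norm(direction)    # unit-norm attribute direction

alpha = 3.0                               # editing strength (sign flips the edit)
w_edited = w_plus + alpha * direction     # broadcast over all 18 layers

print(w_edited.shape)
```

In practice the direction vectors come from a method such as InterFaceGAN (supervised hyperplanes) or GANSpace (PCA of latent codes), and `alpha` is swept to control edit strength.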
Thank you for your work and your response.
If I feed the face segmentation mask (obtained by running a segmentation model on an in-the-wild image) into e4e to get W+, it might improve the editing effect. However, does W+ itself carry other image information? For example, does the W+ vector obtained from the mask affect the image reconstruction quality? And does the reconstruction quality of the diffusion model rely solely on the VAE?
Thank you for your response. When I run train_wild.py, I don't understand how the file read by f = open('./Face/ffhq_wild_names_with_caption.txt', 'r') is supposed to be constructed.
The txt file I created looks like this:
02198_0.png A man wearing a hat and glasses
02201_0.png a young girl wearing a helmet and a purple jacket
Could you provide some examples of the expected format for this txt file?
The error message I always receive is similar to "FileNotFoundError: [Errno 2] No such file or directory: './Face/FFHQ512_e4e_nobg_w_plus/00001_0.png A baby in a blue blan.pth'."
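That FileNotFoundError suggests the whole line, caption included, is being used to build the .pth path, so only the token before the first space should be treated as the image name. A minimal parsing sketch (hypothetical; the repo's actual loader may split differently):

```python
def parse_caption_line(line):
    """Split 'NAME.png caption text' into (name, caption) at the first space."""
    name, _, caption = line.strip().partition(" ")
    return name, caption

name, caption = parse_caption_line("02198_0.png A man wearing a hat and glasses")
print(name)     # 02198_0.png
print(caption)  # A man wearing a hat and glasses
```

With a split like this, the .pth path would be derived from `name` alone rather than from the full line.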
Hello! Thank you very much for your excellent work. As a newcomer, I don't quite understand which script file to run during the inference editing stage. How should I control the direction of attribute editing? Also, my email is [email protected].
Thanks for sharing this amazing work; I am trying to reproduce it. For comparison purposes, could you make the model from the first stage of training publicly available? By the way, the released e4e_ffhq_encode.pt model file is corrupted.
Thank you for your encouragement all along. When I execute the .sh file directly in my virtual environment, the code trains normally. However, when I debug it in VS Code using the JSON file below, I get the error traceback that follows. I would appreciate any advice you can provide.
The following is the JSON file I adapted from the .sh file for debugging:
{
"name": "train_face.py",
"type": "debugpy",
"cwd":"/root/autodl-tmp/w-plus-adapter-main",
"request": "launch",
// "program": "${file}",
"program": "train_face.py",
"args": [
"/root/miniconda3/envs/py38w/bin/python",
"/root/.vscode-server/extensions/ms-python.debugpy-2024.6.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher",
"56159",
"--",
"accelerate",
"launch",
"--config_file=","8_gpu.json",
"--main_process_port=","25655",
"train_face.py",
"--pretrained_model_name_or_path=","./stable-diffusion-v1-5",
"--data_json_file=","None",
"--data_root_path=","None",
"--mixed_precision=","fp16",
"--resolution=","256",
"--train_batch_size=","1",
"--dataloader_num_workers=","1",
"--learning_rate=","1e-04 ",
"--dataloader_num_workers=","1",
"--weight_decay","0.1",
"--output_dir=","./experiments_stage1_${date_now}",
"--save_steps=","50",
],
"console": "integratedTerminal"
}
=================================================
The error traceback message I received is:
(py38w) root@autodl-container-f7da119752-d5d74c8f:/autodl-tmp/w-plus-adapter-main# cd /root/autodl-tmp/w-plus-adapter-main ; /usr/bin/env /root/miniconda3/envs/py38w/bin/python /root/.vscode-server/extensions/ms-python.debugpy-2024.6.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher 56819 -- train_face.py /root/miniconda3/envs/py38w/bin/python /root/.vscode-server/extensions/ms-python.debugpy-2024.6.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher 56159 -- accelerate launch --config_file= 8_gpu.json --main_process_port= 25655 train_face.py --pretrained_model_name_or_path= ./stable-diffusion-v1-5 --data_json_file= None --data_root_path= None --mixed_precision= fp16 --resolution= 256 --train_batch_size= 1 --dataloader_num_workers= 1 --learning_rate= 1e-04\ --dataloader_num_workers= 1 --weight_decay 0.1 --output_dir= ./experiments_stage1_${date_now} --save_steps= 50/autodl-tmp/w-plus-adapter-main#
usage: train_face.py [-h] --pretrained_model_name_or_path PRETRAINED_MODEL_NAME_OR_PATH --data_json_file DATA_JSON_FILE --data_root_path DATA_ROOT_PATH
[--output_dir OUTPUT_DIR] [--logging_dir LOGGING_DIR] [--validation_steps VALIDATION_STEPS] [--resolution RESOLUTION]
[--learning_rate LEARNING_RATE] [--weight_decay WEIGHT_DECAY] [--num_train_epochs NUM_TRAIN_EPOCHS]
[--train_batch_size TRAIN_BATCH_SIZE] [--dataloader_num_workers DATALOADER_NUM_WORKERS] [--save_steps SAVE_STEPS]
[--mixed_precision {no,fp16,bf16}] [--report_to REPORT_TO] [--local_rank LOCAL_RANK]
train_face.py: error: the following arguments are required: --pretrained_model_name_or_path, --data_json_file, --data_root_path
(py38w) root@autodl-container-f7da119752-d5d74c8f:
===================================================
"--data_json_file" and "--data_root_path" are both set to "None" in the .sh file and in train_face.py, so why was the "--pretrained_model_name_or_path" argument not passed through? This confuses me. Thank you for any help you can provide.
Alternatively, could you help me write a correct JSON file for debugging during training?
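In case it helps, here is a hypothetical corrected launch configuration. It assumes accelerate is installed in the selected interpreter so it can be started via "module"; note that argparse accepts "--flag=value" as one string or "--flag", "value" as a pair, but not a dangling "--flag=" followed by a separate value, and that ${date_now} is a shell variable that is not expanded inside JSON:

```json
{
    "name": "train_face.py (accelerate)",
    "type": "debugpy",
    "request": "launch",
    "cwd": "/root/autodl-tmp/w-plus-adapter-main",
    "module": "accelerate.commands.launch",
    "args": [
        "--config_file", "8_gpu.json",
        "--main_process_port", "25655",
        "train_face.py",
        "--pretrained_model_name_or_path", "./stable-diffusion-v1-5",
        "--data_json_file", "None",
        "--data_root_path", "None",
        "--mixed_precision", "fp16",
        "--resolution", "256",
        "--train_batch_size", "1",
        "--dataloader_num_workers", "1",
        "--learning_rate", "1e-04",
        "--weight_decay", "0.1",
        "--output_dir", "./experiments_stage1_debug",
        "--save_steps", "50"
    ],
    "console": "integratedTerminal",
    "justMyCode": false
}
```

The original config also re-embedded the debugpy launcher paths and "accelerate launch" inside "args", which is why the script saw a garbled command line. For breakpoint debugging it is often simpler to run train_face.py directly as "program" with a single-process config, since multi-process launches spawn workers the debugger does not attach to.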
Thank you very much for your excellent work. Could you please share the code or a README for "utilizing BLIP2 to obtain captions"?