csxmli2016 / w-plus-adapter
[CVPR 2024] When StyleGAN Meets Stable Diffusion: a W+ Adapter for Personalized Image Generation
License: Other
The statement at the end of the WPlusAttnProcessor class defines the residual connection. Are you treating the initial hidden_states (the input from the previous block) as the residual, and the hidden_states after the W+ QKV computation as the actual functional branch?
The order of the key statements is as follows:
residual = hidden_states
hidden_states = hidden_states + self.scale * wplus_hidden_states
hidden_states = hidden_states + residual
Your work is excellent, thank you sincerely for your response.
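For what it's worth, the ordering in the snippet above can be sketched as a toy function. This is a hypothetical, heavily simplified stand-in for a diffusers-style attention processor; the real WPlusAttnProcessor operates on tensors and also computes the text cross-attention itself:

```python
# Toy sketch of the residual ordering discussed above (hypothetical and
# simplified; scalars stand in for tensors so the order of operations is clear).

def wplus_attn_step(block_input, attn_out, wplus_out, scale=1.0):
    residual = block_input                             # block input saved first
    hidden_states = attn_out                           # ordinary cross-attention output
    hidden_states = hidden_states + scale * wplus_out  # add the scaled W+ QKV branch
    hidden_states = hidden_states + residual           # skip connection back to input
    return hidden_states

# 1.0 + (2.0 + 0.5 * 3.0) = 4.5
print(wplus_attn_step(1.0, 2.0, 3.0, scale=0.5))
```

So the saved `residual` is the block input, while the W+ output only enters through the scaled additive term before the skip connection is applied.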
Awesome work, and thanks for open-sourcing it! As a newcomer, I'd love to know how the attribute directions for these latent-space edits are selected. Looking forward to your reply; this is very important to me. Thank you very much!
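For context, a common way such latent edits are applied (e.g. InterFaceGAN-style, where a direction is the normal of a hyperplane separating two attribute classes in W space) can be sketched as below. This is a generic illustration with random stand-in data, not necessarily this repo's exact procedure:

```python
import numpy as np

# Hypothetical sketch of applying a pre-computed attribute direction to a
# W+ code (18 layers x 512 dims, the e4e/StyleGAN2 convention).
rng = np.random.default_rng(0)
w_plus = rng.standard_normal((18, 512))   # stand-in for an inverted W+ code
direction = rng.standard_normal(512)
direction /= np.linalg.norm(direction)    # unit-norm attribute direction

alpha = 3.0                               # editing strength (sign flips the edit)
w_edited = w_plus + alpha * direction     # broadcast over all 18 layers

print(w_edited.shape)
```

In practice the direction vectors come from a method such as InterFaceGAN (supervised hyperplanes) or GANSpace (PCA of latent codes), and `alpha` is swept to control edit strength.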
Thank you for your work and your response.
If I feed the face segmentation mask (obtained by running a segmentation model on an in-the-wild image) into e4e to get W+, it might improve the editing effect. However, does W+ itself carry other image information? For example, does the W+ vector obtained from the mask affect the image reconstruction quality? And does the reconstruction quality of the diffusion model rely solely on the VAE?
Thank you for your response. When I run train_wild.py, I don't understand how the file read by f = open('./Face/ffhq_wild_names_with_caption.txt', 'r') is supposed to be constructed.
The txt file I created looks like this:
02198_0.png A man wearing a hat and glasses
02201_0.png a young girl wearing a helmet and a purple jacket
Could you provide some examples of the expected format for this txt file?
The error message I always receive is similar to "FileNotFoundError: [Errno 2] No such file or directory: './Face/FFHQ512_e4e_nobg_w_plus/00001_0.png A baby in a blue blan.pth'."
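That FileNotFoundError suggests the whole line, caption included, is being used to build the .pth path, so only the token before the first space should be treated as the image name. A minimal parsing sketch (hypothetical; the repo's actual loader may split differently):

```python
def parse_caption_line(line):
    """Split 'NAME.png caption text' into (name, caption) at the first space."""
    name, _, caption = line.strip().partition(" ")
    return name, caption

name, caption = parse_caption_line("02198_0.png A man wearing a hat and glasses")
print(name)     # 02198_0.png
print(caption)  # A man wearing a hat and glasses
```

With a split like this, the .pth path would be derived from `name` alone rather than from the full line.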
Hello! Thank you very much for your excellent work. As a newcomer, I don't quite understand which script file to run during the inference editing stage. How should I control the direction of attribute editing? Also, my email is [email protected].
Thanks for sharing this amazing work; I am trying to reproduce it. For comparison purposes, could you make the model from the first stage of training publicly available? By the way, the released e4e_ffhq_encode.pt model file is corrupted.
Thank you for your encouragement all along. When I execute the .sh file directly in my virtual environment, the code trains normally. However, when I debug it in VS Code using the JSON file below, I get the error traceback that follows. I would appreciate any advice you can provide.
The following is the JSON file I adapted from the .sh file for debugging:
{
"name": "train_face.py",
"type": "debugpy",
"cwd":"/root/autodl-tmp/w-plus-adapter-main",
"request": "launch",
// "program": "${file}",
"program": "train_face.py",
"args": [
"/root/miniconda3/envs/py38w/bin/python",
"/root/.vscode-server/extensions/ms-python.debugpy-2024.6.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher",
"56159",
"--",
"accelerate",
"launch",
"--config_file=","8_gpu.json",
"--main_process_port=","25655",
"train_face.py",
"--pretrained_model_name_or_path=","./stable-diffusion-v1-5",
"--data_json_file=","None",
"--data_root_path=","None",
"--mixed_precision=","fp16",
"--resolution=","256",
"--train_batch_size=","1",
"--dataloader_num_workers=","1",
"--learning_rate=","1e-04 ",
"--dataloader_num_workers=","1",
"--weight_decay","0.1",
"--output_dir=","./experiments_stage1_${date_now}",
"--save_steps=","50",
],
"console": "integratedTerminal"
}
=================================================
The error traceback message I received is:
(py38w) root@autodl-container-f7da119752-d5d74c8f:/autodl-tmp/w-plus-adapter-main# cd /root/autodl-tmp/w-plus-adapter-main ; /usr/bin/env /root/miniconda3/envs/py38w/bin/python /root/.vscode-server/extensions/ms-python.debugpy-2024.6.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher 56819 -- train_face.py /root/miniconda3/envs/py38w/bin/python /root/.vscode-server/extensions/ms-python.debugpy-2024.6.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher 56159 -- accelerate launch --config_file= 8_gpu.json --main_process_port= 25655 train_face.py --pretrained_model_name_or_path= ./stable-diffusion-v1-5 --data_json_file= None --data_root_path= None --mixed_precision= fp16 --resolution= 256 --train_batch_size= 1 --dataloader_num_workers= 1 --learning_rate= 1e-04\ --dataloader_num_workers= 1 --weight_decay 0.1 --output_dir= ./experiments_stage1_${date_now} --save_steps= 50/autodl-tmp/w-plus-adapter-main#
usage: train_face.py [-h] --pretrained_model_name_or_path PRETRAINED_MODEL_NAME_OR_PATH --data_json_file DATA_JSON_FILE --data_root_path DATA_ROOT_PATH
[--output_dir OUTPUT_DIR] [--logging_dir LOGGING_DIR] [--validation_steps VALIDATION_STEPS] [--resolution RESOLUTION]
[--learning_rate LEARNING_RATE] [--weight_decay WEIGHT_DECAY] [--num_train_epochs NUM_TRAIN_EPOCHS]
[--train_batch_size TRAIN_BATCH_SIZE] [--dataloader_num_workers DATALOADER_NUM_WORKERS] [--save_steps SAVE_STEPS]
[--mixed_precision {no,fp16,bf16}] [--report_to REPORT_TO] [--local_rank LOCAL_RANK]
train_face.py: error: the following arguments are required: --pretrained_model_name_or_path, --data_json_file, --data_root_path
(py38w) root@autodl-container-f7da119752-d5d74c8f:
===================================================
"--data_json_file" and "--data_root_path" are both set to "None" in the .sh file and in train_face.py, so why was the "--pretrained_model_name_or_path" argument not passed through? This confuses me. Thank you for any help you can provide.
Alternatively, could you help me write a correct JSON file for debugging during training?
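In case it helps, here is a hypothetical corrected launch configuration. It assumes accelerate is installed in the selected interpreter so it can be started via "module"; note that argparse accepts "--flag=value" as one string or "--flag", "value" as a pair, but not a dangling "--flag=" followed by a separate value, and that ${date_now} is a shell variable that is not expanded inside JSON:

```json
{
    "name": "train_face.py (accelerate)",
    "type": "debugpy",
    "request": "launch",
    "cwd": "/root/autodl-tmp/w-plus-adapter-main",
    "module": "accelerate.commands.launch",
    "args": [
        "--config_file", "8_gpu.json",
        "--main_process_port", "25655",
        "train_face.py",
        "--pretrained_model_name_or_path", "./stable-diffusion-v1-5",
        "--data_json_file", "None",
        "--data_root_path", "None",
        "--mixed_precision", "fp16",
        "--resolution", "256",
        "--train_batch_size", "1",
        "--dataloader_num_workers", "1",
        "--learning_rate", "1e-04",
        "--weight_decay", "0.1",
        "--output_dir", "./experiments_stage1_debug",
        "--save_steps", "50"
    ],
    "console": "integratedTerminal",
    "justMyCode": false
}
```

The original config also re-embedded the debugpy launcher paths and "accelerate launch" inside "args", which is why the script saw a garbled command line. For breakpoint debugging it is often simpler to run train_face.py directly as "program" with a single-process config, since multi-process launches spawn workers the debugger does not attach to.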
Thank you very much for your excellent work. Could you please share the code or a README for "utilizing BLIP2 to obtain captions"?