Comments (7)
Hi @hanhan0521, may I ask which config file you use? With the current tsv data file, you do not need to use the anno_list for TikTok fine-tuning.
from disco.
Hello author, I have solved this problem; it was in the train_captition.tsv file.
I now have another question: how do I pre-train on my own dataset? I also have the preprocessed output from OpenPose.
How should I set the following paths? Thank you!
```python
self.total_num_videos = 340
self.anno_path = 'GIT/{:05d}/labels/{:04d}.txt'
self.image_path = '{:05d}/images/{:04d}.png'
self.anno_pose_path = '{:05d}/openpose_json/{:04d}.png.json'
self.ref_mask_path = '{:05d}/masks/{:04d}.png'
self.image_path_web = '{}/{}'
self.ref_image_path_web = '{}/{}'
self.anno_pose_path_web = '{}/openpose_json/{}.json'
self.ref_mask_path_web = '{}/groundsam/{}.mask.jpg'
self.image_paths_list = []
self.ref_image_paths_list = []
self.ref_pose_paths_list = []
self.anno_list = []
self.anno_pose_list = []
self.anno_init_pose_list = []
self.mask_list = []
ft_video_idx = getattr(args, 'ft_idx', '001_1.57_2.17_1x1')  # default elon
if split == 'train':  # for training video
    # video_idx = ['001_1.57_2.17_1x1']  # elon mask 2
    # video_idx = ['007_7.36_7.44_1x1']  # 007
    # video_idx = ['001_1.57_2.17_9x16', '001_11.46_11.54_9x16', '001_5.37_5.44_9x16', '001_8.14_8.27_9x16']  # elon mask 1+2+3+4
    video_idx = [ft_video_idx]
    dataset_prefix = self.args.web_data_root
else:  # for pose video
    video_idx = [335, 137]
    # ref_video_idx = '001_1.57_2.17_1x1'  # 007
    # ref_video_idx = '007_7.36_7.44_1x1'  # 007
    ref_video_idx = ft_video_idx
    dataset_prefix = self.args.tiktok_data_root
```
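For reference, the `{:05d}`/`{:04d}` templates above are resolved with plain Python string formatting against a zero-padded video index and frame index. A quick sketch of how they expand (the indices 137/262 and the root path are illustrative, not taken from the repo):

```python
import os

# Path templates as in the config: 5-digit video folder, 4-digit frame name.
anno_pose_path = '{:05d}/openpose_json/{:04d}.png.json'
image_path = '{:05d}/images/{:04d}.png'

video_idx, frame_idx = 137, 262  # illustrative indices
print(anno_pose_path.format(video_idx, frame_idx))  # 00137/openpose_json/0262.png.json
print(image_path.format(video_idx, frame_idx))      # 00137/images/0262.png

# For your own data, point the dataset root at your directory and keep the
# same <video>/<subfolder>/<frame> layout (hypothetical root shown here):
dataset_prefix = '/path/to/your/dataset'
print(os.path.join(dataset_prefix, image_path.format(video_idx, frame_idx)))
```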
Hi @hanhan0521, for the pre-training you actually do not need the pose. If you want to pre-train on your own data and do not want to use the tsv format, you may need to revise the dataloader code for pre-training so that it uses the raw image/mask data. You can refer to this file (https://github.com/Wangt-CN/DisCo/blob/main/dataset/tiktok_controlnet_t2i_imagevar_combine_mask.py). But note that this file is for fine-tuning, so it still contains pose processing that is not used in pre-training.
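To make the suggestion above concrete, here is a minimal sketch of a raw-file dataloader for pre-training that pairs frames with masks and omits pose entirely. It assumes a `<root>/<video_id>/images/*.png` plus `<root>/<video_id>/masks/*.png` layout; the class and field names are illustrative, not from the repo:

```python
import glob
import os


class RawImageMaskDataset:
    """Pairs each frame with its mask; pose is intentionally omitted for pre-training."""

    def __init__(self, root):
        self.samples = []
        # One subfolder per video, each with images/ and masks/ subdirectories.
        for video_dir in sorted(glob.glob(os.path.join(root, '*'))):
            for img in sorted(glob.glob(os.path.join(video_dir, 'images', '*.png'))):
                mask = os.path.join(video_dir, 'masks', os.path.basename(img))
                if os.path.exists(mask):
                    self.samples.append((img, mask))

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        img_path, mask_path = self.samples[idx]
        # In a real dataloader you would load and transform here, e.g.:
        # image = Image.open(img_path).convert('RGB')
        return {'img_path': img_path, 'mask_path': mask_path}
```

Wrapping this in `torch.utils.data.Dataset` and adding the actual image transforms used by the pre-training config would make it a drop-in replacement for the tsv reader.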
I met the same error:

```
FileNotFoundError: [Errno 2] No such file or directory: 'keli/dataset/TikTok_dataset/GIT/00137/labels/0262.txt'
```

when I tried to run the command (as shown in the Human-Specific Fine-tuning section) with my own settings:
```shell
AZFUSE_USE_FUSE=0 NCCL_ASYNC_ERROR_HANDLING=0 CUDA_VISIBLE_DEVICES=0 \
python finetune_sdm_yaml.py \
    --cf ./config/ref_attn_clip_combine_controlnet_imgspecific_ft/webtan_S256L16_xformers_upsquare.py \
    --pretrained_model ./ft_checkpoint/moretiktok_nocfg/mp_rank_00_model_states.pt \
    --root_dir ./run_test \
    --ft_idx ./finetune_data/001 \
    --log_dir ./exp/human_specific_ft_001/ \
    --do_train \
    --local_train_batch_size 32 \
    --local_eval_batch_size 32 \
    --epochs 20 \
    --deepspeed \
    --eval_step 500 \
    --save_step 500 \
    --gradient_accumulate_steps 1 \
    --learning_rate 1e-3 \
    --fix_dist_seed \
    --loss_target "noise" \
    --unet_unfreeze_type "crossattn" \
    --refer_sdvae \
    --ref_null_caption False \
    --combine_clip_local \
    --combine_use_mask \
    --conds "poses" "masks" \
    --freeze_pose True \
    --freeze_background False \
    --ft_iters 500 \
    --ft_one_ref_image False \
    --strong_aug_stage1 True \
    --strong_rand_stage2 True
```
The file structure of the models and the human-specific dataset is:

```
Disco
├── ft_checkpoint
│   └── moretiktok_nocfg
│       └── mp_rank_00_model_states.pt
├── run_test
│   └── diffusers
│       └── sd-image-variations-diffusers
│           ├── feature_extractor
│           ├── image_encoder
│           ├── safety_checker
│           ├── scheduler
│           ├── unet
│           └── vae
├── finetune_data
│   └── 001                  # human specific dataset
│       ├── grounded_sam     # ---|
│       ├── openpose_json    #    |--> preprocessed data, following the steps in
│       ├── openpose_vis     # ---|    https://github.com/Wangt-CN/DisCo/blob/main/PREPRO.md
│       ├── 0001.png
│       ├── 0002.png
│       └── ......
├── keli
│   └── dataset
│       └── TikTok_dataset
│           ├── 00001
│           │   ├── densepose
│           │   ├── images
│           │   └── masks
│           ├── 00002
│           └── .....
└── .....
```
How do I run human-specific fine-tuning with my own dataset?
How can I get the GIT directory that is referenced in all the ./dataset/tiktok_controlnet_t2i_imagevar_combine_specifcimg*.py files?
Hi @quqixun, thanks a lot for reporting the confusion.
- First, please check whether the config file indicates which data python file to use, e.g., https://github.com/Wangt-CN/DisCo/blob/main/dataset/tiktok_controlnet_t2i_imagevar_combine_specifcimg_web_upsquare.py.
- The data in the GIT folder only contains the frame names and is only used in validation (https://github.com/Wangt-CN/DisCo/blob/main/dataset/tiktok_controlnet_t2i_imagevar_combine_specifcimg_web_upsquare.py#L192). For our experiment, we just use TikTok data for validation. I am trying to upload the GIT folder to share on GitHub.
- Since the GIT files only contain the frame names, a quick workaround is to delete the use of anno_path and the related variables, and then revise that part to provide customized frame names.
- Due to the flexibility of the current human-specific fine-tuning pipeline, there may be many possible workarounds. Sorry for the confusion; I will write a brief intro on easily adapting the pipeline to user data.
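The workaround above (dropping anno_path and supplying frame names directly) can be sketched as follows; it derives the frame names by listing the images folder instead of reading the GIT label files. The function name and the assumed `images/*.png` layout are illustrative, not from the repo:

```python
import glob
import os


def list_frame_names(video_dir):
    """Return sorted frame names (e.g. ['0001', '0002', ...]) by listing
    <video_dir>/images/*.png, replacing the GIT/.../labels lookup."""
    return sorted(
        os.path.splitext(os.path.basename(p))[0]
        for p in glob.glob(os.path.join(video_dir, 'images', '*.png'))
    )
```

In the dataloader, the list returned here would be used wherever the contents of the GIT label files were consumed.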
Hello author, I would like to use your code to drive a target character with skeletal keypoints from another source. But when I run the /dataset/tiktok_controlnet_t2i_imagevar_combine_mask.py file, the cond folder I get contains the poses of the dataset's own video, and the gt folder is still the original human image; the character is not actually driven. Why is that?
@hanhan0521 Hi, actually I cannot fully understand your question. But from the code you give, it seems that you set the anno_pose_json to the TikTok data. This file contains the human skeleton annotations. BTW, which stage are you trying to run?