
Comments (7)

Wangt-CN avatar Wangt-CN commented on June 6, 2024

Hi @hanhan0521, may I ask which config file you use? With the current tsv data files, you do not need to use the anno_list for TikTok fine-tuning.

from disco.

hanhan0521 avatar hanhan0521 commented on June 6, 2024

Hello author, I have solved this problem; it was in the train_caption.tsv file.
I now have another question: how do I pre-train on my own dataset? I already have the preprocessed output from OpenPose.
How should I set the following paths? Thank you!

    self.total_num_videos = 340
    self.anno_path = 'GIT/{:05d}/labels/{:04d}.txt'
    self.image_path = '{:05d}/images/{:04d}.png'
    self.anno_pose_path = '{:05d}/openpose_json/{:04d}.png.json'
    self.ref_mask_path = '{:05d}/masks/{:04d}.png'

    self.image_path_web = '{}/{}'
    self.ref_image_path_web = '{}/{}'
    self.anno_pose_path_web = '{}/openpose_json/{}.json'
    self.ref_mask_path_web = '{}/groundsam/{}.mask.jpg'

    self.image_paths_list = []
    self.ref_image_paths_list = []
    self.ref_pose_paths_list = []
    self.anno_list = []
    self.anno_pose_list = []
    self.anno_init_pose_list = []
    self.mask_list = []

    ft_video_idx = getattr(args, 'ft_idx', '001_1.57_2.17_1x1') # default elon
    if split == 'train': # for training video
        # video_idx = ['001_1.57_2.17_1x1'] # elon mask 2
        # video_idx = ['007_7.36_7.44_1x1'] # 007
        # video_idx = ['001_1.57_2.17_9x16', '001_11.46_11.54_9x16', '001_5.37_5.44_9x16', '001_8.14_8.27_9x16']  # elon mask 1+2+3+4
        video_idx = [ft_video_idx] # 007
        dataset_prefix = self.args.web_data_root
    else: # for pose video
        video_idx = [335, 137]
        # ref_video_idx = '001_1.57_2.17_1x1' # 007
        # ref_video_idx = '007_7.36_7.44_1x1' # 007
        ref_video_idx = ft_video_idx # 007
        dataset_prefix = self.args.tiktok_data_root
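To illustrate how the templated paths above resolve, here is a small sketch. The video/frame indices are made up for the example and are not from the DisCo repo:

```python
# Hypothetical illustration of the path templates quoted above; the
# indices here are invented for the example, not taken from the dataset.
video_idx, frame_idx = 1, 12

anno_path_tmpl = 'GIT/{:05d}/labels/{:04d}.txt'
image_path_tmpl = '{:05d}/images/{:04d}.png'
anno_pose_path_tmpl = '{:05d}/openpose_json/{:04d}.png.json'

# Video folders are zero-padded to five digits, frame files to four.
anno_path = anno_path_tmpl.format(video_idx, frame_idx)
image_path = image_path_tmpl.format(video_idx, frame_idx)
pose_path = anno_pose_path_tmpl.format(video_idx, frame_idx)
print(anno_path)   # GIT/00001/labels/0012.txt
print(image_path)  # 00001/images/0012.png
```

So for a custom dataset, the practical point is to either match this zero-padded folder/file layout or edit the templates themselves.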


Wangt-CN avatar Wangt-CN commented on June 6, 2024

Hi @hanhan0521, for pre-training you actually do not need the pose. If you want to pretrain on your own data and do not want to use the tsv format, you may need to revise the pre-training dataloader code to use the raw image/mask data. You can refer to this file (https://github.com/Wangt-CN/DisCo/blob/main/dataset/tiktok_controlnet_t2i_imagevar_combine_mask.py). But note that this file is for fine-tuning, so it still includes pose handling that is not used in pre-training.
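A minimal sketch of what such a revised pre-training loader could do: pair raw images with their masks and skip pose entirely. This is not DisCo's actual dataloader; `pair_image_mask` is a hypothetical helper, and the `<stem>.mask.jpg` naming follows the web-data template quoted earlier in this thread:

```python
# Hypothetical helper for a pose-free pre-training loader: map each raw
# image to its mask by shared filename stem. Not DisCo's real code.
from pathlib import Path

def pair_image_mask(image_paths, mask_dir):
    """Return (image, mask) path pairs; masks are assumed to be named
    '<stem>.mask.jpg' inside mask_dir, like the groundsam outputs."""
    pairs = []
    for img in image_paths:
        stem = Path(img).stem
        pairs.append((img, str(Path(mask_dir) / f'{stem}.mask.jpg')))
    return pairs

pairs = pair_image_mask(['data/0001.png', 'data/0002.png'], 'data/groundsam')
print(pairs[0])  # ('data/0001.png', 'data/groundsam/0001.mask.jpg')
```

A real loader would then read and transform each pair in `__getitem__`, with no pose branch at all.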


quqixun avatar quqixun commented on June 6, 2024

I met the same error

FileNotFoundError: [Errno 2] No such file or directory: 'keli/dataset/TikTok_dataset/GIT/00137/labels/0262.txt'

when I tried to run the command (shown in the Human-Specific Fine-tuning section) with my own settings:

AZFUSE_USE_FUSE=0 NCCL_ASYNC_ERROR_HANDLING=0 CUDA_VISIBLE_DEVICES=0 \
python finetune_sdm_yaml.py                                          \
    --cf                        ./config/ref_attn_clip_combine_controlnet_imgspecific_ft/webtan_S256L16_xformers_upsquare.py \
    --pretrained_model          ./ft_checkpoint/moretiktok_nocfg/mp_rank_00_model_states.pt \
    --root_dir                  ./run_test                   \
    --ft_idx                    ./finetune_data/001          \
    --log_dir                   ./exp/human_specific_ft_001/ \
    --do_train                                               \
    --local_train_batch_size    32                           \
    --local_eval_batch_size     32                           \
    --epochs                    20                           \
    --deepspeed                                              \
    --eval_step                 500                          \
    --save_step                 500                          \
    --gradient_accumulate_steps 1                            \
    --learning_rate             1e-3                         \
    --fix_dist_seed                                          \
    --loss_target               "noise"                      \
    --unet_unfreeze_type        "crossattn"                  \
    --refer_sdvae                                            \
    --ref_null_caption          False                        \
    --combine_clip_local                                     \
    --combine_use_mask                                       \
    --conds                     "poses" "masks"              \
    --freeze_pose               True                         \
    --freeze_background         False                        \
    --ft_iters                  500                          \
    --ft_one_ref_image          False                        \
    --strong_aug_stage1         True                         \
    --strong_rand_stage2        True

The file structure of models and specific human dataset is:

Disco
├── ft_checkpoint
│   └── moretiktok_nocfg
│       └── mp_rank_00_model_states.pt
├── run_test
│   └── diffusers
│       └── sd-image-variations-diffusers
│           ├── feature_extractor
│           ├── image_encoder
│           ├── safety_checker
│           ├── scheduler
│           ├── unet
│           └── vae
├── finetune_data
│   └── 001                # human specific dataset
│       ├── grounded_sam   # ---|
│       ├── openpose_json  #    |--> preprocessed data, following the steps in
│       ├── openpose_vis   # ---|    https://github.com/Wangt-CN/DisCo/blob/main/PREPRO.md
│       ├── 0001.png
│       ├── 0002.png
│       └── ......
├── keli
│   └── dataset
│       └── TikTok_dataset
│           ├── 00001
│           │   ├── densepose
│           │   ├── images
│           │   └── masks
│           ├── 00002
│           └── .....
└── .....

How do I run human-specific fine-tuning with my own dataset?
How do I get the GIT directory that is referenced in all the ./dataset/tiktok_controlnet_t2i_imagevar_combine_specifcimg*.py files?


Wangt-CN avatar Wangt-CN commented on June 6, 2024

Hi @quqixun, thanks a lot for reporting the confusion.

  1. First, please check that the config file points to the intended dataset python file, e.g., https://github.com/Wangt-CN/DisCo/blob/main/dataset/tiktok_controlnet_t2i_imagevar_combine_specifcimg_web_upsquare.py.
  2. The data in the GIT folder only contains the frame names, and it is only used in validation (https://github.com/Wangt-CN/DisCo/blob/main/dataset/tiktok_controlnet_t2i_imagevar_combine_specifcimg_web_upsquare.py#L192). For our experiment, we just use TikTok data for validation. I am trying to upload the GIT folder to share on GitHub.
  3. Since the GIT files only contain frame names, a quick work-around is to delete the use of anno_path and the related variables, and then revise this part to provide customized frame names.
  4. Given the flexibility of the current human-specific fine-tuning pipeline, there may be many possible work-arounds. Sorry for the confusion; I will write a brief intro for easily adapting the pipeline to user data.
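A hedged sketch of the work-around in step 3: rather than reading frame names from per-frame GIT label files, enumerate them directly. `make_frame_names` is a hypothetical helper; the zero-padding widths follow the path templates quoted earlier in this thread, and the frame count is an assumption you would set for your own video:

```python
# Hypothetical replacement for reading GIT/<video>/labels/<frame>.txt:
# enumerate the frame image paths directly for one video.
def make_frame_names(video_idx, num_frames):
    """Return image paths like '00137/images/0001.png', assuming
    5-digit video folders, 4-digit frame files, and 1-based frames."""
    return ['{:05d}/images/{:04d}.png'.format(video_idx, i + 1)
            for i in range(num_frames)]

frames = make_frame_names(137, 3)
print(frames[0])  # 00137/images/0001.png
```

The resulting list could then be assigned to the variable that the validation code currently fills from the GIT label files.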


hanhan0521 avatar hanhan0521 commented on June 6, 2024

Hello author, I would like to use your code to drive the target character with a set of skeletal keypoints. But when I run the /dataset/tiktok_controlnet_t2i_imagevar_combine_mask.py file, the cond folder I get contains the poses of the dataset itself, and the gt folder still contains the original human images; the target is not being driven. Why is that?

[attached image: gt Dance_00001_0001.png / cond Dance_00001_0001.png]


Wangt-CN avatar Wangt-CN commented on June 6, 2024

@hanhan0521 Hi, actually I cannot understand your question. But from the code you shared, it seems that you set the anno_pose_json to the TikTok data; this file contains the skeleton annotations of the human. BTW, which stage are you trying?

