Comments (13)
Hi @yanerzidefanbaba, cloning the project with git lfs may help. The Baidu Netdisk link for the checkpoints is here:
https://pan.baidu.com/s/1Zlr309OcsDuz5FaULQJYWQ passwd:qp6v
from mtia.
Thanks for your reply and your exciting work. However, the model I trained performs much worse than the one you provided. I think this may be caused by the loss weights: in the original vox.yaml, generator_gan, discriminator_gan, feature_matching, bg_fg_mask and fg_mask_concentration are all set to 0. Could you please provide the correct values for these parameters?
The provided config file is the correct setting. Could you share some training details from your experiments, such as the loss logs?
Thanks a ton. My perceptual loss is 115, equivariance_value loss is 0.2249 and equivariance_jacobian is 0.4228. Are these far from your results?
Yes, the result is abnormal. After the last training epoch on the VoxCeleb dataset, the perceptual loss is usually around 80 and the equivariance loss is around 0.1. Currently I'm not sure of the reason. What is your training environment, and did you make any custom changes to the code?
I have decided to train it again because I'm not sure whether I changed the training environment. As far as I remember, I didn't change anything, except that my data is in .mp4 format. Does that affect the results?
That's OK. No, it doesn't affect the results, only the training speed, because reading the mp4 format is I/O-limited. If you change the way the data is read, make sure that the images/videos are normalized to [0, 1].
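The [0, 1] normalization mentioned above can be sketched as follows. This is only a minimal illustration, assuming the frames decoded from an .mp4 arrive as uint8 arrays (for example of shape (T, H, W, 3)); the helper name `normalize_frames` is hypothetical and is not part of the repository.

```python
import numpy as np


def normalize_frames(frames: np.ndarray) -> np.ndarray:
    """Convert decoded video frames to float32 values in [0, 1].

    Hypothetical helper for illustration; not taken from the repository.
    """
    frames = np.asarray(frames)
    if frames.dtype == np.uint8:
        # 8-bit frames from a video decoder span 0..255.
        return frames.astype(np.float32) / 255.0
    frames = frames.astype(np.float32)
    # Float input is assumed to be normalized already; fail loudly if it is not.
    if frames.size and (frames.min() < 0.0 or frames.max() > 1.0):
        raise ValueError("float frames are expected to lie in [0, 1]")
    return frames
```

Calling this once in the dataset's loading path, right after decoding, is one way to rule out a scale mismatch (0..255 versus 0..1) as the cause of unusually high losses.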
Thanks for the tips, I will try again.
Is it possible to host the checkpoints somewhere else? I'm having an impossible time downloading them through Baidu (or git).
@benmcmahan Try Google Drive: https://drive.google.com/file/d/1Fv1ts026-6BwOUTexV1KxwW54pQiK1rY/view?usp=sharing
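Hi, I still can't reproduce the results. The perceptual loss still starts to converge at around 110 and equivariance_value is at 0.22, at around epoch 70 with repeat num=2. My config file is as follows: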
dataset_params:
  root_dir: /run/media/root/2/vox
  frame_shape: [256, 256, 3]
  id_sampling: True
  pairs_list:
  augmentation_params:
    flip_param:
      horizontal_flip: True
      time_flip: True
    jitter_param:
      brightness: 0.1
      contrast: 0.1
      saturation: 0.1
      hue: 0.1
model_params:
  use_bg_predictor: False
  common_params:
    num_kp: 10
    num_channels: 3
    estimate_jacobian: True
  generator_params:
    block_expansion: 64
    max_features: 512
    num_down_blocks: 2
    num_bottleneck_blocks: 6
    estimate_occlusion_map: True
    skips: True
    dense_motion_params:
      block_expansion: 64
      max_features: 1024
      num_blocks: 5
      scale_factor: 0.25
  discriminator_params:
    scales: [1]
    block_expansion: 32
    max_features: 512
    num_blocks: 4
    sn: True
train_params:
  num_epochs: 200
  num_repeats: 1
  epoch_milestones: [60, 90]
  lr_generator: 2.0e-4
  lr_discriminator: 2.0e-4
  lr_kp_detector: 2.0e-4
  lr_bg_predictor: 2.0e-4
  batch_size: 10
  scales: [1, 0.5, 0.25, 0.125]
  clip_generator_grad: False
  clip_kp_detector_grad: True
  clip: 1
  checkpoint_freq: 5
  transform_params:
    sigma_affine: 0.05
    sigma_tps: 0.005
    points_tps: 5
  loss_weights:
    generator_gan: 0
    discriminator_gan: 0
    feature_matching: [0, 0, 0, 0]
    perceptual: [10, 10, 10, 10, 10]
    equivariance_value: 10
    equivariance_jacobian: 10
    bg_fg_mask: 0
    fg_mask_concentration: 0
reconstruction_params:
  num_videos: 1000
  format: '.mp4'
animate_params:
  num_pairs: 50
  format: '.mp4'
  normalization_params:
    adapt_movement_scale: False
    use_relative_movement: True
    use_relative_jacobian: True
visualizer_params:
  kp_size: 5
  draw_border: True
  colormap: 'gist_rainbow'
MODEL:
  # default
  TAG_PER_JOINT: True
  HIDDEN_HEATMAP_DIM: -1
  MULTI_TRANSFORMER_DEPTH: [12, 12]
  MULTI_TRANSFORMER_HEADS: [16, 16]
  MULTI_DIM: [48, 48]
  NUM_BRANCHES: 1
  BASE_CHANNEL: 32
  # default
  ESTIMATE_JACOBIAN: True
  TEMPERATURE: 0.1
  DATA_PREPROCESS: False
  FIX_IMG2MOTION_ATTENTION: False
  INIT_WEIGHTS: False
  NAME: pose_tokenpose_b
  NUM_JOINTS: 10
  PRETRAINED: ''
  TARGET_TYPE: gaussian
  TRANSFORMER_DEPTH: 12
  TRANSFORMER_HEADS: 8
  TRANSFORMER_MLP_RATIO: 3
  POS_EMBEDDING_TYPE: 'sine-full'
  INIT: true
  DIM: 192 # 443
  PATCH_SIZE:
  - 4
  - 4
  IMAGE_SIZE:
  - 256
  - 256
  HEATMAP_SIZE:
  - 64
  - 64
  SIGMA: 2
  EXTRA:
    PRETRAINED_LAYERS:
    - 'conv1'
    - 'bn1'
    - 'conv2'
    - 'bn2'
    - 'layer1'
    - 'transition1'
    - 'stage2'
    - 'transition2'
    - 'stage3'
    FINAL_CONV_KERNEL: 1
    STAGE2:
      NUM_MODULES: 1
      NUM_BRANCHES: 2
      BLOCK: BASIC
      NUM_BLOCKS:
      - 4
      - 4
      NUM_CHANNELS:
      - 32
      - 64
      FUSE_METHOD: SUM
    STAGE3:
      NUM_MODULES: 4
      NUM_BRANCHES: 3
      BLOCK: BASIC
      NUM_BLOCKS:
      - 4
      - 4
      - 4
      NUM_CHANNELS:
      - 32
      - 64
      - 128
      FUSE_METHOD: SUM
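As for the code, I changed only two things: I replaced nn.DataParallelWithCallback in train.py because my PyTorch version is different, and I modified the dataset code slightly because my data is in mp4 format. Nothing else was changed. I don't know where the problem is. Have I just not trained for enough epochs?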
@yanerzidefanbaba It seems the model has not been trained enough. "num_repeats" is set to 150 by default; with it set to 1, training for 60 epochs is even less than training for 1 epoch with the default config. Because the learning rate is decayed by a factor of 10 at epochs 60 and 90, the losses may still appear to be converging.
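The arithmetic behind this reply can be sketched as below. It is a rough illustration, assuming that num_repeats means each training video is visited that many times per epoch and that the milestone decay behaves like a step schedule with a factor of 0.1; both function names are hypothetical and are not taken from the repository.

```python
def dataset_passes(num_epochs: int, num_repeats: int) -> int:
    """Number of passes over the training videos (assumed: epochs x repeats)."""
    return num_epochs * num_repeats


def lr_at_epoch(base_lr: float, epoch: int, milestones=(60, 90), gamma: float = 0.1) -> float:
    """Step-decayed learning rate: multiplied by gamma at each milestone reached."""
    reached = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** reached


# 60 epochs with num_repeats=1 versus a single epoch with the default of 150.
print(dataset_passes(60, 1))    # 60
print(dataset_passes(1, 150))   # 150

# From epoch 60 the learning rate is already ten times smaller than at the start,
# so the loss curve can flatten even though the model has seen little data.
print(lr_at_epoch(2.0e-4, 59), lr_at_epoch(2.0e-4, 70))
```

Under these assumptions, the posted run at epoch 60 has made fewer passes over the data than one default epoch, while already training at the reduced learning rate.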
@benmcmahan Try Google Drive: https://drive.google.com/file/d/1Fv1ts026-6BwOUTexV1KxwW54pQiK1rY/view?usp=sharing
Thanks, I've requested access.