shenhanqian / speechdrivestemplates
[ICCV 2021] The official repo for the paper "Speech Drives Templates: Co-Speech Gesture Synthesis with Learned Templates".
License: MIT License
Can a mesh in obj format be used for driving?
"To ease later research, we pack our processed data including 2d human pose sequences and corresponding audio clips."
Hello, I downloaded the dataset from the link you provided, but I found there are no audio files, only npz files.
Should I generate the audio files myself? I want to use Luo's data to train the model.
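One quick way to see what the downloaded archives actually hold is to list their contents. An `.npz` file is a zip archive of `.npy` arrays, so the array names can be read without loading any data. The following is a minimal sketch; the helper name `npz_array_names` is hypothetical, and whether the packed files contain an audio array at all is not stated in this thread.

```python
import zipfile


def npz_array_names(path_or_file):
    """List the array names stored in an .npz archive without loading them."""
    with zipfile.ZipFile(path_or_file) as archive:
        return sorted(
            name[:-len(".npy")]           # strip the per-array file extension
            for name in archive.namelist()
            if name.endswith(".npy")
        )
```

If one of the listed names looks like an audio array, it can then be loaded with `numpy.load` and inspected further.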
I saw that the comparison with the baselines in the paper shows very good results. Would it be possible to provide the code for the baselines?
When will the source code be released? Thanks!
First of all, thanks to the author for replying to my email and suggesting some solutions. Now that the problem is solved, I am opening this issue for reference.
When I tried to reproduce the demo, I ran the following command:
python main.py --config_file configs/voice2pose_sdt_bp.yaml --tag luo --demo_input audio1.wav --checkpoint voice2pose_sdt_bp-luo-ep100.pth DATASET.SPEAKER luo
The following error occurred:
FileNotFoundError: [WinError 2] The system cannot find the file specified
I tried many methods, but still could not get the ffmpeg.concat function to run correctly. In the end, my solution was as follows (give up on ffmpeg.concat and use other methods):
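The poster's own snippet is not included in this listing. As a separate illustration, and not the poster's actual fix, one possible alternative is to call the ffmpeg executable directly to combine the rendered video with the input audio. On Windows, `[WinError 2]` is often raised because the `ffmpeg` executable itself is not on `PATH`, so the sketch checks for it first. The function names and file paths are hypothetical.

```python
import shutil
import subprocess


def build_mux_command(video_path, audio_path, output_path):
    """Build an ffmpeg command that muxes one video and one audio file."""
    return [
        "ffmpeg",
        "-y",                # overwrite the output file if it exists
        "-i", video_path,    # input 0: the silent rendered video
        "-i", audio_path,    # input 1: the driving audio
        "-map", "0:v:0",     # take the video stream from input 0
        "-map", "1:a:0",     # take the audio stream from input 1
        "-c:v", "copy",      # keep the video stream unchanged
        "-c:a", "aac",       # encode the audio as AAC
        "-shortest",         # stop at the end of the shorter stream
        output_path,
    ]


def mux(video_path, audio_path, output_path):
    # Fail with a clear message if the executable cannot be found.
    if shutil.which("ffmpeg") is None:
        raise FileNotFoundError("ffmpeg executable not found on PATH")
    subprocess.run(
        build_mux_command(video_path, audio_path, output_path), check=True
    )
```

Calling `mux("poses.mp4", "audio1.wav", "demo.mp4")` would then write the combined file, assuming ffmpeg is installed and both inputs exist.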
Hi, how language-dependent do you think the model is? Or do you think it depends more on the sound of the audio? Thank you for the checkpoints, I managed to make it work :)
I'm trying to train on the Xing processed_data from scratch using DDP:
SYS.DISTRIBUTED True
SYS.WORLD_SIZE 4 (4 GPUs)
RuntimeError: Expected to have finished reduction in the prior iteration before starting a new one. This error indicates that your module has parameters that were not used in producing loss. You can enable unused parameter detection by (1) passing the keyword argument `find_unused_parameters=True` to `torch.nn.parallel.DistributedDataParallel`; (2) making sure all `forward` function outputs participate in calculating loss. If you already have done the above two steps, then the distributed data parallel module wasn't able to locate the output tensors in the return value of your module's `forward` function. Please include the loss function and the structure of the return value of `forward` of your module when reporting this issue (e.g. list, dict, iterable).
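To find out which parameters the error message refers to, one option is to run a single forward and backward pass on one GPU without DDP and list the trainable parameters that received no gradient. The sketch below is a generic diagnostic, not code from this repository; the helper name `unused_parameter_names` is hypothetical, and it only assumes objects with `requires_grad` and `grad` attributes, as PyTorch parameters have.

```python
def unused_parameter_names(named_parameters):
    """Return names of trainable parameters that received no gradient.

    Intended to be called right after loss.backward() on the plain
    (non-DDP) model, passing model.named_parameters().
    """
    return [
        name
        for name, param in named_parameters
        if param.requires_grad and param.grad is None
    ]
```

Any names it reports either need to take part in the loss or be tolerated by constructing the wrapper with `find_unused_parameters=True`, as the message itself suggests. Where this repository constructs its DDP wrapper is not shown here.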
Hello, I read your great paper recently. I don't know much about the co-speech gesture generation task, so I have some questions and would like to ask for your advice:
Are there procedures/steps/scripts for training the model on a custom dataset?
Do you think it would be possible to use this model with 3D coordinates as input and output?
Your model is fascinating and I would like to test it. Could you please provide the checkpoint file?
Hello! We are very interested in the method you proposed. We would like to see how it performs, but could not find where to download the weights. Could you provide them? Thank you very much!
I tried the pose2pose training, but the loss never seems to converge, and the pose reconstruction is incorrect. Is anything wrong?
Thanks for your awesome work!
When generating keypoints with OpenPose, the output is a JSON file, but your code expects an npy file.
2_1_gen_kpts.py is not complete yet. Are there instructions for reshaping the keypoints as required by your script?
As the title says.
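As a starting point for such a conversion, the sketch below regroups one frame of OpenPose JSON output into rows of `[x, y, confidence]`. The four key names are the ones OpenPose writes; the helper name `frame_keypoints` is hypothetical, and the ordering used here (body, face, left hand, right hand) is an assumption rather than the layout this repository's scripts require.

```python
import json

# Keys written by OpenPose's JSON output; each holds a flat [x, y, c, ...] list.
OPENPOSE_KEYS = (
    "pose_keypoints_2d",
    "face_keypoints_2d",
    "hand_left_keypoints_2d",
    "hand_right_keypoints_2d",
)


def frame_keypoints(json_text, person_index=0):
    """Return one person's keypoints as a list of [x, y, confidence] rows."""
    people = json.loads(json_text)["people"]
    if len(people) <= person_index:
        return []  # no detection in this frame
    person = people[person_index]
    rows = []
    for key in OPENPOSE_KEYS:
        flat = person.get(key, [])
        # Regroup the flat list into (x, y, confidence) triples.
        rows.extend(flat[i:i + 3] for i in range(0, len(flat), 3))
    return rows
```

Stacking the per-frame results with `numpy.asarray` and writing them with `numpy.save` would produce an npy file, which would still need to be rearranged to whatever shape the training code expects.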
Awesome work.
Could you share the source video data of Luo and Xing? I found they are not in the Speech2Gesture dataset.
Thanks!
How do you suggest creating the SPEECH2GESTURE dataset?
Do we need a csv file and a folder with the images?
Could you give some suggestions?
I have a few questions regarding the dataset processing pipeline.
By the way, there is an error in 3_2_split_train_val_test.py:
it names the validation samples "val", while the model searches for records labeled "dev".
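Until the script is fixed, an already generated split file could be relabeled after the fact. This is a minimal sketch that assumes the split is stored as a CSV with one column holding the split label; the column name `dataset` and the helper name `relabel_split` are hypothetical and should be adapted to the real file.

```python
import csv
import io


def relabel_split(csv_text, column="dataset", old="val", new="dev"):
    """Return csv_text with every `old` label in `column` replaced by `new`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=reader.fieldnames, lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        if row[column] == old:
            row[column] = new
        writer.writerow(row)
    return out.getvalue()
```

Only exact matches in the chosen column are changed, so other fields that happen to contain "val" are left alone.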
python main.py --config_file configs/voice2pose_sdt_bp.yaml
--tag oliver
--demo_input demo_audio.wav
--checkpoint
DATASET.SPEAKER oliver
I followed this script and generation succeeded, but the result is a video of keypoints rather than a video of a real person. The audio is matched, but no real-person video was synthesized. Could you explain why?
I ran the following command to test the VAE method demo:
python main.py --config_file configs/voice2pose_sdt_vae.yaml --tag luo --demo_input audio1.wav --checkpoint checkpoints/voice2pose_sdt_vae-luo-ep100.pth DATASET.SPEAKER luo
Then the following error occurred:
Traceback (most recent call last):
  File "main.py", line 73, in <module>
    main()
  File "main.py", line 69, in main
    run(args, cfg)
  File "main.py", line 45, in run
    pipeline.demo(cfg, exp_tag, args.checkpoint, args.demo_input)
  File "D:\SpeechDrivesTemplates\core\pipelines\trainer.py", line 462, in demo
    self.base_path = self.setup_experiment(False, exp_tag, checkpoint=checkpoint, demo_input=demo_input)
  File "D:\SpeechDrivesTemplates\core\pipelines\trainer.py", line 221, in setup_experiment
    self.setup_model(self.cfg, state_dict=checkpoint['model_state_dict'])
  File "D:\SpeechDrivesTemplates\core\pipelines\voice2pose.py", line 221, in setup_model
    self.model = Voice2PoseModel(cfg, state_dict, self.num_train_samples, self.get_rank()).cuda()
  File "D:\SpeechDrivesTemplates\core\pipelines\voice2pose.py", line 48, in __init__
    raise RuntimeError('External code not provide.')
RuntimeError: External code not provide.