korrawe / guided-motion-diffusion
Hi,
Thanks for the great work! I tried training the motion model with relative data, and at some point during training the loss became NaN. To clarify, I'm interested in training GMD's motion model on the original relative representation of the HumanML3D dataset, so I set train_args to card.motion_rel_unet_adagn_xl in train_gmd.py. Note that I am able to train the motion model with the absolute-root representation without issues (when train_args=card.motion_abs_unet_adagn_xl).
Have you faced a similar problem before? I'd appreciate any insights into how to address this issue.
Thanks,
Setareh
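Below is a minimal, generic PyTorch sketch of the usual NaN triage for cases like this (anomaly detection, input checks, gradient clipping). All names are hypothetical; this is not GMD's actual training loop, and if the run uses fp16, retrying in fp32 is another standard check.
import torch

# Hypothetical training-step wrapper; none of these names come from the GMD
# codebase, they only illustrate the usual NaN triage steps.
torch.autograd.set_detect_anomaly(True)  # reports the op that first yields NaN

def training_step(model, batch, optimizer, max_grad_norm=1.0):
    # Rule out bad inputs first: relative features can blow up after
    # normalization if some dimension has near-zero std.
    assert torch.isfinite(batch).all(), "non-finite values in the input batch"

    loss = model(batch)  # stand-in for the actual diffusion training loss
    optimizer.zero_grad()
    loss.backward()

    # Clip gradients; exploding gradients are a common source of NaN losses.
    grad_norm = torch.nn.utils.clip_grad_norm_(model.parameters(), max_grad_norm)
    if not torch.isfinite(grad_norm):
        optimizer.zero_grad()  # skip the update rather than poison the weights
        return None
    optimizer.step()
    return loss.item()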
Hi authors,
I would first like to express my appreciation for your interesting work.
I am writing this issue to check whether my understanding of your emphasis projection contribution is correct, and I would really appreciate it if you could spend your valuable time answering my questions.
Firstly, since the dense guidance part in Section 4.2 of your paper is for densifying the sparse signal, does this mean that, as long as I already have a dense signal (e.g., a full trajectory on every frame), I can leverage Section 4.1 only?
Secondly, if I understand correctly, it seems that Section 4.1 of your paper happens only during the sampling (inference) period rather than requiring retraining of an existing motion diffusion model. Can I ask whether this understanding is correct?
Many thanks in advance for your help.
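For readers who want the gist of Section 4.1, here is a rough sketch of an emphasis projection in the spirit of the paper: scale the emphasized (root/trajectory) dimensions by a factor c, then mix all dimensions with a random invertible matrix. The exact matrix construction and factor are defined in the paper; everything here (c=10, the scaled-Gaussian entries) is an assumption, loosely suggested by the "proj10" checkpoint names.
import torch

def emphasis_project(x, emph_idx, c=10.0, seed=0):
    # x: [..., d] motion features; emph_idx: indices of dimensions to emphasize.
    d = x.shape[-1]
    g = torch.Generator().manual_seed(seed)
    A = torch.randn(d, d, generator=g) / d ** 0.5  # stand-in random projection
    w = torch.ones(d)
    w[emph_idx] = c       # boost the emphasized dimensions before mixing
    return (x * w) @ A.T  # x' = A diag(w) x, batched over leading dims

# Working in the projected space also needs the inverse map,
# x = diag(1/w) A^{-1} x', which is why A must be (numerically) invertible.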
Thanks for your great work. I used your provided calculate_skating_ratio function to calculate the foot skating ratio of Real, and the result is around 0.05, but the OmniControl paper reports that the foot skating ratio of Real is 0.00. Can you provide the calculation results for Real?
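For reference on why two numbers can disagree, here is a generic foot-skating-ratio sketch. It is not the repo's calculate_skating_ratio; the fps, thresholds, and up-axis convention are all assumptions, and different threshold choices alone can explain a 0.05-vs-0.00 gap.
import numpy as np

def skating_ratio(foot_pos, fps=20, h_thresh=0.05, v_thresh=0.5):
    # foot_pos: [T, 2, 3] world positions of the two feet; z-up assumed here.
    vel = (foot_pos[1:] - foot_pos[:-1]) * fps           # [T-1, 2, 3], m/s
    horiz_speed = np.linalg.norm(vel[..., :2], axis=-1)  # [T-1, 2]
    on_ground = foot_pos[1:, :, 2] < h_thresh            # foot near the floor
    skating = on_ground & (horiz_speed > v_thresh)       # grounded yet sliding
    return skating.any(axis=1).mean()                    # fraction of frames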
Hi Karunratanakul,
Thanks for the amazing work! I have some parameter questions about your code.
I found that in the args of your provided checkpoint, num_frames=60. Does this mean that in the experiments in the paper, the maximum number of frames is 60? But in the given code, the default is num_frames=224 (I directly ran python -m train.train_trajectory, and in my saved args.json, num_frames=224). I noticed that in https://github.com/korrawe/guided-motion-diffusion/blob/d4a38acf4256eac195741533e894a289d7a47c15/utils/parser_util.py#L119C16-L119C16 you set the default num_frames=60; however, changing this default does not change the final args. I tried other parameters, such as batch_size in TrainingOptions, and changing them does not change the final args either. Could you please check this? I have tried a lot but could not figure out where the training default values come from. I hope you can help with this.
Thanks in advance
Bests,
Yiqun
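One pattern that would explain this behavior, given that train_gmd.py sets train_args from a card preset (card.motion_abs_unet_adagn_xl and friends): the preset object overwrites the argparse defaults after parsing, so edits to the defaults in parser_util.py never reach the final args. This is only a sketch of that pattern under that assumption, with illustrative names, not the repo's actual code.
from dataclasses import dataclass, asdict
from argparse import ArgumentParser, Namespace

@dataclass
class MotionCard:
    num_frames: int = 224   # preset value wins over the parser default
    batch_size: int = 64

def load_train_args(card: MotionCard) -> Namespace:
    parser = ArgumentParser()
    parser.add_argument("--num_frames", type=int, default=60)  # shadowed by the card
    parser.add_argument("--batch_size", type=int, default=32)
    args = parser.parse_args([])
    # Overwrite the parsed defaults with the preset; this is why editing
    # parser_util.py defaults would not change the saved args.json.
    for k, v in asdict(card).items():
        setattr(args, k, v)
    return args

print(load_train_args(MotionCard()))  # -> num_frames=224, batch_size=64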
Hi,
I'm currently facing an issue while attempting to test "Motion Synthesis". When I run the command:
python -m sample.generate --model_path ./save/unet_adazero_xl_x0_abs_proj10_fp16_clipwd_224/model000500000.pt --text_prompt "a person is walking while raising both hands" --guidance_mode kps
I encounter the following error:
File "/home/user/guided-motion-diffusion/utils/model_util.py", line 110, in get_model_args
'train_keypoint_mask': args.train_keypoint_mask,
AttributeError: 'Namespace' object has no attribute 'train_keypoint_mask'
Could you please provide guidance on how to resolve this issue?
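A possible local workaround while waiting for an official fix, assuming the error comes from checkpoints whose args.json predates the train_keypoint_mask option: backfill the attribute with the "none" sentinel that the rest of the code compares against (that default value is itself an assumption).
from argparse import Namespace

def patch_missing_args(args: Namespace) -> Namespace:
    # Backfill options that older args.json files were saved without.
    if not hasattr(args, "train_keypoint_mask"):
        args.train_keypoint_mask = "none"
    return args

# Usage sketch: call this on the loaded args before
# utils/model_util.py's get_model_args builds the model kwargs.
args = patch_missing_args(Namespace())
print(args.train_keypoint_mask)  # -> none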
Hi, thanks for your great work!
But as described in your README, it seems that I can only output mp4 files or obj files. Is it possible to output skeleton files in formats such as BVH?
Looking forward to your reply.
Hi again @korrawe!
Can you please help me map the conditioned metrics as reported in the paper (specifically: Traj err, Loc err, and Avg err) to those reported in the eval script?
PS - Traj diversity seems to be missing (from ./save/unet_adazero_xl_x0_abs_proj10_fp16_clipwd_224/eval_humanml_cond_unet_adazero_xl_x0_abs_proj10_fp16_clipwd_224_000500000_gscale2.5_wo_mm.log); how can I calculate it as well?
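On the PS: assuming Traj diversity mirrors the standard Diversity metric (mean distance over random pairs of samples) but is computed on root trajectories rather than motion-feature embeddings, a sketch would look like the following. Whether the paper computes it exactly this way is an assumption.
import numpy as np

def traj_diversity(trajs, num_pairs=32, seed=0):
    # trajs: [N, T, 2] ground-plane root trajectories of N generated motions.
    rng = np.random.default_rng(seed)
    i = rng.integers(0, len(trajs), num_pairs)
    j = rng.integers(0, len(trajs), num_pairs)
    # Mean over random pairs of the per-frame Euclidean distances.
    return np.linalg.norm(trajs[i] - trajs[j], axis=-1).mean()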
Thanks for open-sourcing your excellent work!
I have a question about the evaluation speed. In the original MDM (https://github.com/GuyTevet/motion-diffusion-model), the evaluation takes about 20 hours, with every replication taking about 30 minutes.
However, in your evaluation log, the evaluation is very fast: every replication takes only about half a minute. I tried running the provided evaluation code, but the speed is still slow. Could you please tell me how to speed up the evaluation to match the speed shown in your log?
Thanks a lot for your patience!!
The evaluation log of MDM:
==================== Replication 0 ====================
Time: 2022-09-21 11:18:01.921946
---> [ground truth] Matching Score: 2.9829
---> [ground truth] R_precision: (top 1): 0.5127 (top 2): 0.7028 (top 3): 0.7901
---> [vald] Matching Score: 5.4813
---> [vald] R_precision: (top 1): 0.3428 (top 2): 0.5137 (top 3): 0.6328
Time: 2022-09-21 11:18:08.139416
---> [ground truth] FID: 0.0016
---> [vald] FID: 0.5877
Time: 2022-09-21 11:18:12.500365
---> [ground truth] Diversity: 9.7076
---> [vald] Diversity: 9.8887
!!! DONE !!!
==================== Replication 1 ====================
Time: 2022-09-21 11:50:54.935330
---> [ground truth] Matching Score: 2.9990
---> [ground truth] R_precision: (top 1): 0.5213 (top 2): 0.7028 (top 3): 0.7920
---> [vald] Matching Score: 5.6097
---> [vald] R_precision: (top 1): 0.3203 (top 2): 0.5107 (top 3): 0.6260
Time: 2022-09-21 11:51:01.926553
---> [ground truth] FID: 0.0017
---> [vald] FID: 0.5371
Time: 2022-09-21 11:51:06.478357
---> [ground truth] Diversity: 9.2401
---> [vald] Diversity: 9.3920
!!! DONE !!!
The evaluation log provided in your repo:
==================== Replication 0 ====================
Time: 2023-03-07 19:09:19.460262
---> [ground truth] Matching Score: 2.9721
---> [ground truth] R_precision: (top 1): 0.5013 (top 2): 0.7039 (top 3): 0.7974
---> [vald] Skating Ratio: -1.0000
---> [vald] Matching Score: 5.1525
---> [vald] R_precision: (top 1): 0.3887 (top 2): 0.5850 (top 3): 0.6797
Time: 2023-03-07 19:09:24.482545
---> [ground truth] FID: 0.0016
---> [vald] FID: 0.2199
Time: 2023-03-07 19:09:27.632161
---> [ground truth] Diversity: 9.8112
---> [vald] Diversity: 9.7503
!!! DONE !!!
==================== Replication 1 ====================
Time: 2023-03-07 19:09:39.281943
---> [ground truth] Matching Score: 2.9388
---> [ground truth] R_precision: (top 1): 0.5082 (top 2): 0.7043 (top 3): 0.8019
---> [vald] Skating Ratio: -1.0000
---> [vald] Matching Score: 5.2868
---> [vald] R_precision: (top 1): 0.3672 (top 2): 0.5498 (top 3): 0.6455
Time: 2023-03-07 19:09:43.343890
---> [ground truth] FID: 0.0016
---> [vald] FID: 0.2308
Time: 2023-03-07 19:09:46.924741
---> [ground truth] Diversity: 9.5119
---> [vald] Diversity: 10.1362
!!! DONE !!!
Hi @korrawe!
Similarly to #2, when running eval, I get:
Traceback (most recent call last):
File "/disk1/guytevet/guided-motion-diffusion/eval/eval_humanml_condition.py", line 445, in <module>
if args.train_keypoint_mask != "none":
AttributeError: 'Namespace' object has no attribute 'train_keypoint_mask'
What should be the fix in that case?
Thanks,
Guy
In the paper, dense signal propagation uses a denoiser (the existing DPM model) to solve the keyframe location conditioning task. Does the code also follow what the paper does? I can only find a reward model for the conditioning task, but in the README I didn't see where to download a reward model, nor instructions for generating samples with a reward model. Thanks for correcting me if I missed something.
Hi,
I'm looking at the evaluation code, and I'm having a hard time understanding why the absolute data's mean and std are used to denormalize samples from gen_loader, which have the relative representation (if I understand correctly).
Specifically, in comp_v6_model_dataset.py line 481, gt_poses (sampled from gen_loader) is denormalized using self.dataset.std and self.dataset.mean, which are equal to Std_abs_3d.npy and Mean_abs_3d.npy. Shouldn't self.dataset.std_rel and self.dataset.mean_rel be used instead? I tested denormalizing with both sets of stats and then visualized the samples (after converting them back to the global-xyz representation). The sample denormalized with the absolute data's mean and std looks better - it walks in a big circle, whereas the other motion appears to walk in place with lots of sliding - but I'm not sure I understand why.
Thanks,
Setareh
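A small sanity check for this question: load both sets of stats and see which dimensions they disagree on; the motions should only decode correctly with the stats that match their normalization. The relative-stats file names and paths below are assumptions; only Mean_abs_3d.npy and Std_abs_3d.npy are named in the issue.
import numpy as np

mean_abs = np.load("dataset/HumanML3D/Mean_abs_3d.npy")
std_abs = np.load("dataset/HumanML3D/Std_abs_3d.npy")
mean_rel = np.load("dataset/HumanML3D/Mean.npy")  # assumed relative-stats file
std_rel = np.load("dataset/HumanML3D/Std.npy")    # assumed relative-stats file

def denormalize(x, mean, std):
    # Inverse of the usual (x - mean) / std normalization.
    return x * std + mean

# If gt_poses were really normalized with the relative stats, decoding with
# the absolute stats would distort exactly the dimensions where they differ:
print(np.abs(mean_abs - mean_rel).argsort()[-5:])  # most-mismatched mean dims
print(np.abs(std_abs - std_rel).argsort()[-5:])    # most-mismatched std dims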
Hi @korrawe, great work!
(1) It seems that when sampling from the model, you avoid using the averaged model (guided-motion-diffusion/sample/generate.py, line 175 in e54268d). Is that true? If so, why?
(2) During training, do you update the optimized model (self.model) to be the averaged model (self.model_avg)? If so, where?
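For context, the usual EMA (averaged-model) pattern in diffusion training looks like the sketch below. Whether GMD's trainer follows it exactly is an assumption, but it matches the self.model / self.model_avg naming: the optimized model is never overwritten, and the averaged weights are typically loaded only for evaluation or sampling.
import copy
import torch

class EMA:
    def __init__(self, model: torch.nn.Module, decay: float = 0.9999):
        self.decay = decay
        self.model_avg = copy.deepcopy(model).eval()
        for p in self.model_avg.parameters():
            p.requires_grad_(False)

    @torch.no_grad()
    def update(self, model: torch.nn.Module):
        # model_avg <- decay * model_avg + (1 - decay) * model
        for p_avg, p in zip(self.model_avg.parameters(), model.parameters()):
            p_avg.lerp_(p, 1.0 - self.decay)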
I am trying to understand the code used in GMD. One experiment I have done (in the default unconditioned generation mode) is to replace the model-generated motion with one of the existing motions in the dataset, in order to visualize the dataset's motions. Specifically, I am running this code in generate.py:
motion = torch.Tensor(np.load('./dataset/HumanML3D/new_joint_vecs_abs_3d/000036.npy'))  # loads a 263-dim motion from the HumanML3D dataset
sample = [motion.unsqueeze(0).unsqueeze(1).repeat(10, 1, 1, 1).permute(0, 3, 1, 2) * 10]  # reshapes the motion so it can be processed by sample_to_motion
and then the script proceeds to call sample_to_motion on this sample and render it.
However, when the motion is rendered, the joints rotate and move, but the (x, y) position of the body does not move at all, as in the example below.
So my question is: what pre-/post-processing is applied to the motions after they are generated by the model (aside from the std/mean adjustment and the inverse random-matrix projection)? And why does calling sample_to_motion on one of the existing files in the dataset produce a motion with no movement in the (x, y) direction?
Thank you!
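A hedged guess at the mismatch: the model's samples live in normalized (and, for proj10 checkpoints, emphasis-projected) space, while the .npy files under new_joint_vecs_abs_3d are raw features, so a raw motion would have to be pushed forward through the same transforms before it can survive sample_to_motion's inverse ones. The stats-file names below are assumptions.
import numpy as np
import torch

mean = np.load("./dataset/HumanML3D/Mean_abs_3d.npy")  # assumed stats files
std = np.load("./dataset/HumanML3D/Std_abs_3d.npy")

raw = np.load("./dataset/HumanML3D/new_joint_vecs_abs_3d/000036.npy")  # [T, 263]
normed = (raw - mean) / std  # forward-normalize before entering the pipeline

motion = torch.from_numpy(normed).float()
sample = [motion.unsqueeze(0).unsqueeze(1).repeat(10, 1, 1, 1).permute(0, 3, 1, 2)]
# The flat "* 10" from the snippet above is dropped here: if it was meant to
# undo the proj10 emphasis, it would have to touch only the emphasized (root)
# dimensions via the actual inverse projection, not every channel.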
Hi there,
I'm interested in your project and wish to use GMD to generate scenes in the Isaac Gym simulator, which requires translation, rotation, and the other theta/beta parameters as input. I am now able to get the rotation and all theta parameters correct, but there is a translation gap between the meshes I reconstruct and those you provide.
(gray: .obj from GMD; colorful: .obj reconstructed from parameters; visualizer: open3d)
I've noticed that you do something in model/rotation2xyz.py,
# the first translation root at the origin
x_translations = x_translations - x_translations[:, :, [0]]
but after following it, there is still a tiny gap :(
Could you please provide some hints/information about the exact translation of the models?
Thanks,
Lofen Chen
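For anyone hitting the same gap, here is a minimal sketch of applying the quoted zeroing step on the SMPL side before comparing meshes (variable names are illustrative).
import numpy as np

def align_first_frame(trans: np.ndarray) -> np.ndarray:
    # trans: [T, 3] per-frame SMPL translation; mirrors rotation2xyz.py by
    # moving the first frame's root to the origin.
    return trans - trans[0:1]

# If a tiny offset remains after this, a common remaining culprit is the
# pelvis offset: SMPL's trans parameter is not the pelvis joint position,
# so the constant joint-0 offset of the rest pose (which depends on beta)
# may also need subtracting.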