
Comments (2)

maxin-cn commented on August 22, 2024
  • decoder.conv_in.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated
  • decoder.conv_in.weight: found shape torch.Size([512, 4, 3, 3]) in the checkpoint and torch.Size([64, 4, 3, 3]) in the model instantiated
  • decoder.conv_norm_out.bias: found shape torch.Size([128]) in the checkpoint and torch.Size([64]) in the model instantiated
  • decoder.conv_norm_out.weight: found shape torch.Size([128]) in the checkpoint and torch.Size([64]) in the model instantiated
  • decoder.conv_out.weight: found shape torch.Size([3, 128, 3, 3]) in the checkpoint and torch.Size([3, 64, 3, 3]) in the model instantiated
  • decoder.mid_block.attentions.0.group_norm.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated
  • decoder.mid_block.attentions.0.group_norm.weight: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated
  • decoder.mid_block.attentions.0.to_k.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated
  • decoder.mid_block.attentions.0.to_k.weight: found shape torch.Size([512, 512]) in the checkpoint and torch.Size([64, 64]) in the model instantiated
  • decoder.mid_block.attentions.0.to_out.0.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated
  • decoder.mid_block.attentions.0.to_out.0.weight: found shape torch.Size([512, 512]) in the checkpoint and torch.Size([64, 64]) in the model instantiated
  • decoder.mid_block.attentions.0.to_q.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated
  • decoder.mid_block.attentions.0.to_q.weight: found shape torch.Size([512, 512]) in the checkpoint and torch.Size([64, 64]) in the model instantiated
  • decoder.mid_block.attentions.0.to_v.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated
  • decoder.mid_block.attentions.0.to_v.weight: found shape torch.Size([512, 512]) in the checkpoint and torch.Size([64, 64]) in the model instantiated
  • decoder.mid_block.resnets.0.conv1.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated
  • decoder.mid_block.resnets.0.conv1.weight: found shape torch.Size([512, 512, 3, 3]) in the checkpoint and torch.Size([64, 64, 3, 3]) in the model instantiated
  • decoder.mid_block.resnets.0.conv2.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated
  • decoder.mid_block.resnets.0.conv2.weight: found shape torch.Size([512, 512, 3, 3]) in the checkpoint and torch.Size([64, 64, 3, 3]) in the model instantiated
  • decoder.mid_block.resnets.0.norm1.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated
  • decoder.mid_block.resnets.0.norm1.weight: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated
  • decoder.mid_block.resnets.0.norm2.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated
  • decoder.mid_block.resnets.0.norm2.weight: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated
  • decoder.mid_block.resnets.1.conv1.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated
  • decoder.mid_block.resnets.1.conv1.weight: found shape torch.Size([512, 512, 3, 3]) in the checkpoint and torch.Size([64, 64, 3, 3]) in the model instantiated
  • decoder.mid_block.resnets.1.conv2.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated
  • decoder.mid_block.resnets.1.conv2.weight: found shape torch.Size([512, 512, 3, 3]) in the checkpoint and torch.Size([64, 64, 3, 3]) in the model instantiated
  • decoder.mid_block.resnets.1.norm1.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated
  • decoder.mid_block.resnets.1.norm1.weight: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated
  • decoder.mid_block.resnets.1.norm2.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated
  • decoder.mid_block.resnets.1.norm2.weight: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated
  • decoder.up_blocks.0.resnets.0.conv1.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated
  • decoder.up_blocks.0.resnets.0.conv1.weight: found shape torch.Size([512, 512, 3, 3]) in the checkpoint and torch.Size([64, 64, 3, 3]) in the model instantiated
  • decoder.up_blocks.0.resnets.0.conv2.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated
  • decoder.up_blocks.0.resnets.0.conv2.weight: found shape torch.Size([512, 512, 3, 3]) in the checkpoint and torch.Size([64, 64, 3, 3]) in the model instantiated
  • decoder.up_blocks.0.resnets.0.norm1.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated
  • decoder.up_blocks.0.resnets.0.norm1.weight: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated
  • decoder.up_blocks.0.resnets.0.norm2.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated
  • decoder.up_blocks.0.resnets.0.norm2.weight: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated
  • decoder.up_blocks.0.resnets.1.conv1.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated
  • decoder.up_blocks.0.resnets.1.conv1.weight: found shape torch.Size([512, 512, 3, 3]) in the checkpoint and torch.Size([64, 64, 3, 3]) in the model instantiated
  • decoder.up_blocks.0.resnets.1.conv2.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated
  • decoder.up_blocks.0.resnets.1.conv2.weight: found shape torch.Size([512, 512, 3, 3]) in the checkpoint and torch.Size([64, 64, 3, 3]) in the model instantiated
  • decoder.up_blocks.0.resnets.1.norm1.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated
  • decoder.up_blocks.0.resnets.1.norm1.weight: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated
  • decoder.up_blocks.0.resnets.1.norm2.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated
  • decoder.up_blocks.0.resnets.1.norm2.weight: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated
  • encoder.conv_in.bias: found shape torch.Size([128]) in the checkpoint and torch.Size([64]) in the model instantiated
  • encoder.conv_in.weight: found shape torch.Size([128, 3, 3, 3]) in the checkpoint and torch.Size([64, 3, 3, 3]) in the model instantiated
  • encoder.conv_norm_out.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated
  • encoder.conv_norm_out.weight: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated
  • encoder.conv_out.weight: found shape torch.Size([8, 512, 3, 3]) in the checkpoint and torch.Size([8, 64, 3, 3]) in the model instantiated
  • encoder.down_blocks.0.resnets.0.conv1.bias: found shape torch.Size([128]) in the checkpoint and torch.Size([64]) in the model instantiated
  • encoder.down_blocks.0.resnets.0.conv1.weight: found shape torch.Size([128, 128, 3, 3]) in the checkpoint and torch.Size([64, 64, 3, 3]) in the model instantiated
  • encoder.down_blocks.0.resnets.0.conv2.bias: found shape torch.Size([128]) in the checkpoint and torch.Size([64]) in the model instantiated
  • encoder.down_blocks.0.resnets.0.conv2.weight: found shape torch.Size([128, 128, 3, 3]) in the checkpoint and torch.Size([64, 64, 3, 3]) in the model instantiated
  • encoder.down_blocks.0.resnets.0.norm1.bias: found shape torch.Size([128]) in the checkpoint and torch.Size([64]) in the model instantiated
  • encoder.down_blocks.0.resnets.0.norm1.weight: found shape torch.Size([128]) in the checkpoint and torch.Size([64]) in the model instantiated
  • encoder.down_blocks.0.resnets.0.norm2.bias: found shape torch.Size([128]) in the checkpoint and torch.Size([64]) in the model instantiated
  • encoder.down_blocks.0.resnets.0.norm2.weight: found shape torch.Size([128]) in the checkpoint and torch.Size([64]) in the model instantiated
  • encoder.mid_block.attentions.0.group_norm.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated
  • encoder.mid_block.attentions.0.group_norm.weight: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated
  • encoder.mid_block.attentions.0.to_k.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated
  • encoder.mid_block.attentions.0.to_k.weight: found shape torch.Size([512, 512]) in the checkpoint and torch.Size([64, 64]) in the model instantiated
  • encoder.mid_block.attentions.0.to_out.0.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated
  • encoder.mid_block.attentions.0.to_out.0.weight: found shape torch.Size([512, 512]) in the checkpoint and torch.Size([64, 64]) in the model instantiated
  • encoder.mid_block.attentions.0.to_q.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated
  • encoder.mid_block.attentions.0.to_q.weight: found shape torch.Size([512, 512]) in the checkpoint and torch.Size([64, 64]) in the model instantiated
  • encoder.mid_block.attentions.0.to_v.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated
  • encoder.mid_block.attentions.0.to_v.weight: found shape torch.Size([512, 512]) in the checkpoint and torch.Size([64, 64]) in the model instantiated
  • encoder.mid_block.resnets.0.conv1.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated
  • encoder.mid_block.resnets.0.conv1.weight: found shape torch.Size([512, 512, 3, 3]) in the checkpoint and torch.Size([64, 64, 3, 3]) in the model instantiated
  • encoder.mid_block.resnets.0.conv2.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated
  • encoder.mid_block.resnets.0.conv2.weight: found shape torch.Size([512, 512, 3, 3]) in the checkpoint and torch.Size([64, 64, 3, 3]) in the model instantiated
  • encoder.mid_block.resnets.0.norm1.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated
  • encoder.mid_block.resnets.0.norm1.weight: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated
  • encoder.mid_block.resnets.0.norm2.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated
  • encoder.mid_block.resnets.0.norm2.weight: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated
  • encoder.mid_block.resnets.1.conv1.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated
  • encoder.mid_block.resnets.1.conv1.weight: found shape torch.Size([512, 512, 3, 3]) in the checkpoint and torch.Size([64, 64, 3, 3]) in the model instantiated
  • encoder.mid_block.resnets.1.conv2.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated
  • encoder.mid_block.resnets.1.conv2.weight: found shape torch.Size([512, 512, 3, 3]) in the checkpoint and torch.Size([64, 64, 3, 3]) in the model instantiated
  • encoder.mid_block.resnets.1.norm1.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated
  • encoder.mid_block.resnets.1.norm1.weight: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated
  • encoder.mid_block.resnets.1.norm2.bias: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated
  • encoder.mid_block.resnets.1.norm2.weight: found shape torch.Size([512]) in the checkpoint and torch.Size([64]) in the model instantiated

When I run `bash sample/t2v.sh`, the shapes in the pre-trained checkpoint do not match the instantiated model. How can I fix this? Thank you!

It looks like you used an incorrect pre-trained checkpoint when loading the VAE. Please check it.
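For context, the 64-channel shapes in the instantiated model match the defaults of diffusers' AutoencoderKL, so a likely cause is that the config.json sitting next to the VAE weights is missing or does not describe them. A minimal diagnostic sketch, assuming a local checkpoint layout like the one in the traceback further down (the path is illustrative, not part of the original report):

    # Hedged sketch: verify that vae/config.json actually describes the
    # checkpoint. The path is illustrative; point it at your own folder.
    import json

    with open("t2v_required_models/vae/config.json") as f:
        cfg = json.load(f)

    # A standard Stable-Diffusion VAE config has
    #   "block_out_channels": [128, 256, 512, 512]
    # If this prints [64] (the AutoencoderKL default), the config does not
    # match the weights and the wrong files were downloaded.
    print(cfg.get("block_out_channels"))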

likeatingcake commented on August 22, 2024
[Quotes the shape-mismatch warnings, the question, and the reply from the comment above.]

(latte) yueyc@super-AS-4124GS-TNR:~/Latte$ bash sample/t2v.sh
Using model!
Traceback (most recent call last):
  File "/home/yueyc/Latte/sample/sample_t2v.py", line 167, in <module>
    main(OmegaConf.load(args.config))
  File "/home/yueyc/Latte/sample/sample_t2v.py", line 38, in main
    vae = AutoencoderKL.from_pretrained(args.pretrained_model_path, subfolder="vae", torch_dtype=torch.float16).to(
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yueyc/anaconda3/envs/latte/lib/python3.11/site-packages/diffusers/models/modeling_utils.py", line 812, in from_pretrained
    unexpected_keys = load_model_dict_into_meta(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/yueyc/anaconda3/envs/latte/lib/python3.11/site-packages/diffusers/models/modeling_utils.py", line 155, in load_model_dict_into_meta
    raise ValueError(
ValueError: Cannot load /home/yueyc/Latte/t2v_required_models/ because decoder.conv_in.bias expected shape tensor(..., device='meta', size=(64,)), but got torch.Size([512]). If you want to instead overwrite randomly initialized weights, please make sure to pass both low_cpu_mem_usage=False and ignore_mismatched_sizes=True. For more information, see also: huggingface/diffusers#1619 (comment) as an example.
Previously, when loading the pre-trained VAE, I passed the two arguments `low_cpu_mem_usage=False` and `ignore_mismatched_sizes=True`. With them, I get the warnings mentioned above; without them, I get the error above.
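Note that passing low_cpu_mem_usage=False and ignore_mismatched_sizes=True does not repair anything: every mismatched tensor keeps its random initialization, so the VAE will decode noise. The fix is to make vae/config.json and the weights agree (re-download the VAE the repo points to), then load without those flags. A hedged sketch using the identifiers from the traceback (the alternative repo id is an assumption, not the repo's documented source):

    # Sketch, not the repo's official fix: load the VAE only after the
    # config and weights in the vae/ subfolder agree.
    import torch
    from diffusers import AutoencoderKL

    vae = AutoencoderKL.from_pretrained(
        "/home/yueyc/Latte/t2v_required_models",  # path from the traceback
        subfolder="vae",
        torch_dtype=torch.float16,
    )

    # As a sanity check, a known-good SD VAE can be loaded instead
    # (hypothetical choice of checkpoint):
    # vae = AutoencoderKL.from_pretrained(
    #     "stabilityai/sd-vae-ft-mse", torch_dtype=torch.float16
    # )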
