Hi,
I am encountering a size-mismatch error while attempting to reload the pre-trained weights (SNIP-10dmax) for inference testing. Despite following the installation instructions carefully to set up the environment and packages, the error arises when I execute the command:
!python /content/drive/MyDrive/LLM/Multimodal-Math-Pretraining-main/train.py --reload_model /content/drive/MyDrive/LLM/Multimodal-Math-Pretraining-main/weights/snip-10dmax.pth
to load the weights. The error message reports a size mismatch in several layers:
RuntimeError: Error(s) in loading state_dict for LinearPointEmbedder:
size mismatch for hidden_layers.0.weight: copying a param with shape torch.Size([2112, 2112]) from checkpoint, the shape in current model is torch.Size([384, 384]).
size mismatch for hidden_layers.0.bias: copying a param with shape torch.Size([2112]) from checkpoint, the shape in current model is torch.Size([384]).
size mismatch for fc.weight: copying a param with shape torch.Size([512, 2112]) from checkpoint, the shape in current model is torch.Size([512, 384]).
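For what it's worth, before filing this I compared the parameter shapes in the checkpoint against those in the freshly constructed model, which made it clear the mismatch is confined to the embedder's input-side dimensions (2112 in the checkpoint vs. 384 in the model). Below is a minimal sketch of the comparison helper I used; the parameter names and shapes are the ones from the traceback above, and with PyTorch the shape dicts would come from something like `{k: tuple(v.shape) for k, v in state_dict.items()}` for the checkpoint and the model respectively (the helper itself is hypothetical, not part of the repository):

```python
def find_shape_mismatches(checkpoint_shapes, model_shapes):
    """Return {param_name: (checkpoint_shape, model_shape)} for parameters
    present in both dicts whose shapes differ.

    Both arguments map parameter names to plain shape tuples, e.g. the
    result of {k: tuple(v.shape) for k, v in state_dict.items()}.
    """
    return {
        name: (checkpoint_shapes[name], model_shapes[name])
        for name in checkpoint_shapes.keys() & model_shapes.keys()
        if checkpoint_shapes[name] != model_shapes[name]
    }


if __name__ == "__main__":
    # Shapes taken from the error message above.
    ckpt = {
        "hidden_layers.0.weight": (2112, 2112),
        "hidden_layers.0.bias": (2112,),
        "fc.weight": (512, 2112),
    }
    model = {
        "hidden_layers.0.weight": (384, 384),
        "hidden_layers.0.bias": (384,),
        "fc.weight": (512, 384),
    }
    for name, (c, m) in sorted(find_shape_mismatches(ckpt, model).items()):
        print(f"{name}: checkpoint {c} vs model {m}")
```

Since the checkpoint is labeled "10dmax", my guess is that it was trained with a larger maximum input dimension than the default configuration in train.py, which would change the embedder's input width, but I have not been able to confirm which argument (if any) controls this.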
This makes me wonder whether there is a discrepancy between the provided pre-trained weights and the current model architecture in the repository, or whether I overlooked a configuration step (for example, an embedder-dimension setting) during setup.
I would greatly appreciate any guidance or suggestions you could offer to resolve this mismatch issue. Thank you for your time and assistance.