Comments (10)

PABannier commented on July 21, 2024

Glad that it worked! 2 things to note:

  1. The code for decoding is still a bit messy; I'll refactor it so that it produces clean output.
  2. The actual output quality depends on your seed. Sometimes I need to run the inference multiple times to get a satisfactory output. I recommend re-running several times with the same prompt.
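
The re-running advice can be scripted. Below is a minimal sketch, assuming the `main` executable has already been built and that it picks a new seed on each launch (the seeds printed in the logs look time-based, but that is a guess). `run_repeatedly` is a hypothetical helper, not part of biogpt.cpp; it only uses the `-p` flag shown in this thread.

```python
import subprocess


def run_repeatedly(cmd, n_runs=3):
    """Run `cmd` n_runs times and return the stdout of each run as a list."""
    outputs = []
    for _ in range(n_runs):
        # check=True raises CalledProcessError if the executable fails.
        result = subprocess.run(cmd, capture_output=True, text=True, check=True)
        outputs.append(result.stdout)
    return outputs


# Example call (assumption: ./main exists in the current directory):
#   for text in run_repeatedly(["./main", "-p", "trastuzumab"]):
#       print(text)
```

Comparing the collected outputs side by side makes it easier to pick the most satisfactory generation.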

In any case, thanks again for your issue! I'll clean up the README so that the instructions are clearer to anyone wanting to try BioGPT.cpp. I'll ping you once I have fixed the decoding part so that you can get cleanly formatted outputs.

from biogpt.cpp.

PABannier commented on July 21, 2024

Hello @jonahkaye !

Thanks for raising this issue. You should run the make command, which generates a main executable.
The command ./main -p "trastuzumab" is correct.

To have more context on your issue, how are you generating the ggml model? Did the conversion process in convert_pt_to_ggml.py go well? Did you try explicitly passing the path to the ggml weights via the -m flag?

jonahkaye commented on July 21, 2024

So I tried using your download script, but it doesn't pull the pytorch_model.bin file, so I downloaded that and the other config files from Hugging Face directly. The conversion then worked well. I tried passing the path both explicitly and not explicitly, and got the error I mentioned above both times.

PABannier commented on July 21, 2024

OK, it does compile and work properly on macOS. However, I'm able to reproduce your issue on an Ubuntu 18 machine. Investigating...

PABannier commented on July 21, 2024

@jonahkaye The problem was not clear at all. Actually, I hadn't made it explicit that one needs the perluniprops data (a set of rules used by the Moses tokenizer) for the executable to work properly.

I pushed the perluniprops data into a ./data subdirectory. Pulling main should normally fix your issue. Could you confirm?

jonahkaye commented on July 21, 2024

Hi @PABannier
That solved the previous issue, but now this happens:

(venv) jonahkaye@Jonahs-MacBook-Air biogpt.cpp % ./main -p "trastuzumab"
main: seed = 1684363965
biogpt_model_load: loading model from './ggml_weights/ggml-model.bin'
biogpt_model_load: n_vocab       = 42384
biogpt_model_load: d_ff          = 4096
biogpt_model_load: d_model       = 1024
biogpt_model_load: n_positions   = 1024
biogpt_model_load: n_head        = 16
biogpt_model_load: n_layer       = 24
biogpt_model_load: ftype         = 0
biogpt_model_load: unknown tensor 'shared.weight' in model file
main: failed to load model from './ggml_weights/ggml-model.bin'

I am not sure what could be causing that. This is what it looks like when I convert to ggml, in case that's helpful:

python convert_pt_to_ggml.py --dir-model ./weights/Pre-trained-BioGPT/ --out-dir ./ggml_weights
Vocab size: 42384
BPE merges size: 40000
Processing variable: shared.weight with shape: (32128, 1024)
Processing variable: encoder.embed_tokens.weight with shape: (32128, 1024)
Processing variable: encoder.block.0.layer.0.SelfAttention.q.weight with shape: (1024, 1024)
Processing variable: encoder.block.0.layer.0.SelfAttention.k.weight with shape: (1024, 1024)
Processing variable: encoder.block.0.layer.0.SelfAttention.v.weight with shape: (1024, 1024)
Processing variable: encoder.block.0.layer.0.SelfAttention.o.weight with shape: (1024, 1024)
Processing variable: encoder.block.0.layer.0.SelfAttention.relative_attention_bias.weight with shape: (32, 16)
Processing variable: encoder.block.0.layer.0.layer_norm.weight with shape: (1024,)
Processing variable: encoder.block.0.layer.1.DenseReluDense.wi.weight with shape: (4096, 1024)
Processing variable: encoder.block.0.layer.1.DenseReluDense.wo.weight with shape: (1024, 4096)
Processing variable: encoder.block.0.layer.1.layer_norm.weight with shape: (1024,)
Processing variable: encoder.block.1.layer.0.SelfAttention.q.weight with shape: (1024, 1024)
Processing variable: encoder.block.1.layer.0.SelfAttention.k.weight with shape: (1024, 1024)
Processing variable: encoder.block.1.layer.0.SelfAttention.v.weight with shape: (1024, 1024)
Processing variable: encoder.block.1.layer.0.SelfAttention.o.weight with shape: (1024, 1024)
Processing variable: encoder.block.1.layer.0.layer_norm.weight with shape: (1024,)
Processing variable: encoder.block.1.layer.1.DenseReluDense.wi.weight with shape: (4096, 1024)
Processing variable: encoder.block.1.layer.1.DenseReluDense.wo.weight with shape: (1024, 4096)
Processing variable: encoder.block.1.layer.1.layer_norm.weight with shape: (1024,)
Processing variable: encoder.block.2.layer.0.SelfAttention.q.weight with shape: (1024, 1024)
Processing variable: encoder.block.2.layer.0.SelfAttention.k.weight with shape: (1024, 1024)
Processing variable: encoder.block.2.layer.0.SelfAttention.v.weight with shape: (1024, 1024)
Processing variable: encoder.block.2.layer.0.SelfAttention.o.weight with shape: (1024, 1024)
Processing variable: encoder.block.2.layer.0.layer_norm.weight with shape: (1024,)
Processing variable: encoder.block.2.layer.1.DenseReluDense.wi.weight with shape: (4096, 1024)
Processing variable: encoder.block.2.layer.1.DenseReluDense.wo.weight with shape: (1024, 4096)
Processing variable: encoder.block.2.layer.1.layer_norm.weight with shape: (1024,)
Processing variable: encoder.block.3.layer.0.SelfAttention.q.weight with shape: (1024, 1024)
Processing variable: encoder.block.3.layer.0.SelfAttention.k.weight with shape: (1024, 1024)
Processing variable: encoder.block.3.layer.0.SelfAttention.v.weight with shape: (1024, 1024)
Processing variable: encoder.block.3.layer.0.SelfAttention.o.weight with shape: (1024, 1024)
Processing variable: encoder.block.3.layer.0.layer_norm.weight with shape: (1024,)
Processing variable: encoder.block.3.layer.1.DenseReluDense.wi.weight with shape: (4096, 1024)
Processing variable: encoder.block.3.layer.1.DenseReluDense.wo.weight with shape: (1024, 4096)
Processing variable: encoder.block.3.layer.1.layer_norm.weight with shape: (1024,)
Processing variable: encoder.block.4.layer.0.SelfAttention.q.weight with shape: (1024, 1024)
Processing variable: encoder.block.4.layer.0.SelfAttention.k.weight with shape: (1024, 1024)
Processing variable: encoder.block.4.layer.0.SelfAttention.v.weight with shape: (1024, 1024)
Processing variable: encoder.block.4.layer.0.SelfAttention.o.weight with shape: (1024, 1024)
Processing variable: encoder.block.4.layer.0.layer_norm.weight with shape: (1024,)
Processing variable: encoder.block.4.layer.1.DenseReluDense.wi.weight with shape: (4096, 1024)
Processing variable: encoder.block.4.layer.1.DenseReluDense.wo.weight with shape: (1024, 4096)
Processing variable: encoder.block.4.layer.1.layer_norm.weight with shape: (1024,)
Processing variable: encoder.block.5.layer.0.SelfAttention.q.weight with shape: (1024, 1024)
Processing variable: encoder.block.5.layer.0.SelfAttention.k.weight with shape: (1024, 1024)
Processing variable: encoder.block.5.layer.0.SelfAttention.v.weight with shape: (1024, 1024)
Processing variable: encoder.block.5.layer.0.SelfAttention.o.weight with shape: (1024, 1024)
Processing variable: encoder.block.5.layer.0.layer_norm.weight with shape: (1024,)
Processing variable: encoder.block.5.layer.1.DenseReluDense.wi.weight with shape: (4096, 1024)
Processing variable: encoder.block.5.layer.1.DenseReluDense.wo.weight with shape: (1024, 4096)
Processing variable: encoder.block.5.layer.1.layer_norm.weight with shape: (1024,)
Processing variable: encoder.block.6.layer.0.SelfAttention.q.weight with shape: (1024, 1024)
Processing variable: encoder.block.6.layer.0.SelfAttention.k.weight with shape: (1024, 1024)
Processing variable: encoder.block.6.layer.0.SelfAttention.v.weight with shape: (1024, 1024)
Processing variable: encoder.block.6.layer.0.SelfAttention.o.weight with shape: (1024, 1024)
Processing variable: encoder.block.6.layer.0.layer_norm.weight with shape: (1024,)
Processing variable: encoder.block.6.layer.1.DenseReluDense.wi.weight with shape: (4096, 1024)
Processing variable: encoder.block.6.layer.1.DenseReluDense.wo.weight with shape: (1024, 4096)
Processing variable: encoder.block.6.layer.1.layer_norm.weight with shape: (1024,)
Processing variable: encoder.block.7.layer.0.SelfAttention.q.weight with shape: (1024, 1024)
Processing variable: encoder.block.7.layer.0.SelfAttention.k.weight with shape: (1024, 1024)
Processing variable: encoder.block.7.layer.0.SelfAttention.v.weight with shape: (1024, 1024)
Processing variable: encoder.block.7.layer.0.SelfAttention.o.weight with shape: (1024, 1024)
Processing variable: encoder.block.7.layer.0.layer_norm.weight with shape: (1024,)
Processing variable: encoder.block.7.layer.1.DenseReluDense.wi.weight with shape: (4096, 1024)
Processing variable: encoder.block.7.layer.1.DenseReluDense.wo.weight with shape: (1024, 4096)
Processing variable: encoder.block.7.layer.1.layer_norm.weight with shape: (1024,)
Processing variable: encoder.block.8.layer.0.SelfAttention.q.weight with shape: (1024, 1024)
Processing variable: encoder.block.8.layer.0.SelfAttention.k.weight with shape: (1024, 1024)
Processing variable: encoder.block.8.layer.0.SelfAttention.v.weight with shape: (1024, 1024)
Processing variable: encoder.block.8.layer.0.SelfAttention.o.weight with shape: (1024, 1024)
Processing variable: encoder.block.8.layer.0.layer_norm.weight with shape: (1024,)
Processing variable: encoder.block.8.layer.1.DenseReluDense.wi.weight with shape: (4096, 1024)
Processing variable: encoder.block.8.layer.1.DenseReluDense.wo.weight with shape: (1024, 4096)
Processing variable: encoder.block.8.layer.1.layer_norm.weight with shape: (1024,)
Processing variable: encoder.block.9.layer.0.SelfAttention.q.weight with shape: (1024, 1024)
Processing variable: encoder.block.9.layer.0.SelfAttention.k.weight with shape: (1024, 1024)
Processing variable: encoder.block.9.layer.0.SelfAttention.v.weight with shape: (1024, 1024)
Processing variable: encoder.block.9.layer.0.SelfAttention.o.weight with shape: (1024, 1024)
Processing variable: encoder.block.9.layer.0.layer_norm.weight with shape: (1024,)
Processing variable: encoder.block.9.layer.1.DenseReluDense.wi.weight with shape: (4096, 1024)
Processing variable: encoder.block.9.layer.1.DenseReluDense.wo.weight with shape: (1024, 4096)
Processing variable: encoder.block.9.layer.1.layer_norm.weight with shape: (1024,)
Processing variable: encoder.block.10.layer.0.SelfAttention.q.weight with shape: (1024, 1024)
Processing variable: encoder.block.10.layer.0.SelfAttention.k.weight with shape: (1024, 1024)
Processing variable: encoder.block.10.layer.0.SelfAttention.v.weight with shape: (1024, 1024)
Processing variable: encoder.block.10.layer.0.SelfAttention.o.weight with shape: (1024, 1024)
Processing variable: encoder.block.10.layer.0.layer_norm.weight with shape: (1024,)
Processing variable: encoder.block.10.layer.1.DenseReluDense.wi.weight with shape: (4096, 1024)
Processing variable: encoder.block.10.layer.1.DenseReluDense.wo.weight with shape: (1024, 4096)
Processing variable: encoder.block.10.layer.1.layer_norm.weight with shape: (1024,)
Processing variable: encoder.block.11.layer.0.SelfAttention.q.weight with shape: (1024, 1024)
Processing variable: encoder.block.11.layer.0.SelfAttention.k.weight with shape: (1024, 1024)
Processing variable: encoder.block.11.layer.0.SelfAttention.v.weight with shape: (1024, 1024)
Processing variable: encoder.block.11.layer.0.SelfAttention.o.weight with shape: (1024, 1024)
Processing variable: encoder.block.11.layer.0.layer_norm.weight with shape: (1024,)
Processing variable: encoder.block.11.layer.1.DenseReluDense.wi.weight with shape: (4096, 1024)
Processing variable: encoder.block.11.layer.1.DenseReluDense.wo.weight with shape: (1024, 4096)
Processing variable: encoder.block.11.layer.1.layer_norm.weight with shape: (1024,)
Processing variable: encoder.block.12.layer.0.SelfAttention.q.weight with shape: (1024, 1024)
Processing variable: encoder.block.12.layer.0.SelfAttention.k.weight with shape: (1024, 1024)
Processing variable: encoder.block.12.layer.0.SelfAttention.v.weight with shape: (1024, 1024)
Processing variable: encoder.block.12.layer.0.SelfAttention.o.weight with shape: (1024, 1024)
Processing variable: encoder.block.12.layer.0.layer_norm.weight with shape: (1024,)
Processing variable: encoder.block.12.layer.1.DenseReluDense.wi.weight with shape: (4096, 1024)
Processing variable: encoder.block.12.layer.1.DenseReluDense.wo.weight with shape: (1024, 4096)
Processing variable: encoder.block.12.layer.1.layer_norm.weight with shape: (1024,)
Processing variable: encoder.block.13.layer.0.SelfAttention.q.weight with shape: (1024, 1024)
Processing variable: encoder.block.13.layer.0.SelfAttention.k.weight with shape: (1024, 1024)
Processing variable: encoder.block.13.layer.0.SelfAttention.v.weight with shape: (1024, 1024)
Processing variable: encoder.block.13.layer.0.SelfAttention.o.weight with shape: (1024, 1024)
Processing variable: encoder.block.13.layer.0.layer_norm.weight with shape: (1024,)
Processing variable: encoder.block.13.layer.1.DenseReluDense.wi.weight with shape: (4096, 1024)
Processing variable: encoder.block.13.layer.1.DenseReluDense.wo.weight with shape: (1024, 4096)
Processing variable: encoder.block.13.layer.1.layer_norm.weight with shape: (1024,)
Processing variable: encoder.block.14.layer.0.SelfAttention.q.weight with shape: (1024, 1024)
Processing variable: encoder.block.14.layer.0.SelfAttention.k.weight with shape: (1024, 1024)
Processing variable: encoder.block.14.layer.0.SelfAttention.v.weight with shape: (1024, 1024)
Processing variable: encoder.block.14.layer.0.SelfAttention.o.weight with shape: (1024, 1024)
Processing variable: encoder.block.14.layer.0.layer_norm.weight with shape: (1024,)
Processing variable: encoder.block.14.layer.1.DenseReluDense.wi.weight with shape: (4096, 1024)
Processing variable: encoder.block.14.layer.1.DenseReluDense.wo.weight with shape: (1024, 4096)
Processing variable: encoder.block.14.layer.1.layer_norm.weight with shape: (1024,)
Processing variable: encoder.block.15.layer.0.SelfAttention.q.weight with shape: (1024, 1024)
Processing variable: encoder.block.15.layer.0.SelfAttention.k.weight with shape: (1024, 1024)
Processing variable: encoder.block.15.layer.0.SelfAttention.v.weight with shape: (1024, 1024)
Processing variable: encoder.block.15.layer.0.SelfAttention.o.weight with shape: (1024, 1024)
Processing variable: encoder.block.15.layer.0.layer_norm.weight with shape: (1024,)
Processing variable: encoder.block.15.layer.1.DenseReluDense.wi.weight with shape: (4096, 1024)
Processing variable: encoder.block.15.layer.1.DenseReluDense.wo.weight with shape: (1024, 4096)
Processing variable: encoder.block.15.layer.1.layer_norm.weight with shape: (1024,)
Processing variable: encoder.block.16.layer.0.SelfAttention.q.weight with shape: (1024, 1024)
Processing variable: encoder.block.16.layer.0.SelfAttention.k.weight with shape: (1024, 1024)
Processing variable: encoder.block.16.layer.0.SelfAttention.v.weight with shape: (1024, 1024)
Processing variable: encoder.block.16.layer.0.SelfAttention.o.weight with shape: (1024, 1024)
Processing variable: encoder.block.16.layer.0.layer_norm.weight with shape: (1024,)
Processing variable: encoder.block.16.layer.1.DenseReluDense.wi.weight with shape: (4096, 1024)
Processing variable: encoder.block.16.layer.1.DenseReluDense.wo.weight with shape: (1024, 4096)
Processing variable: encoder.block.16.layer.1.layer_norm.weight with shape: (1024,)
Processing variable: encoder.block.17.layer.0.SelfAttention.q.weight with shape: (1024, 1024)
Processing variable: encoder.block.17.layer.0.SelfAttention.k.weight with shape: (1024, 1024)
Processing variable: encoder.block.17.layer.0.SelfAttention.v.weight with shape: (1024, 1024)
Processing variable: encoder.block.17.layer.0.SelfAttention.o.weight with shape: (1024, 1024)
Processing variable: encoder.block.17.layer.0.layer_norm.weight with shape: (1024,)
Processing variable: encoder.block.17.layer.1.DenseReluDense.wi.weight with shape: (4096, 1024)
Processing variable: encoder.block.17.layer.1.DenseReluDense.wo.weight with shape: (1024, 4096)
Processing variable: encoder.block.17.layer.1.layer_norm.weight with shape: (1024,)
Processing variable: encoder.block.18.layer.0.SelfAttention.q.weight with shape: (1024, 1024)
Processing variable: encoder.block.18.layer.0.SelfAttention.k.weight with shape: (1024, 1024)
Processing variable: encoder.block.18.layer.0.SelfAttention.v.weight with shape: (1024, 1024)
Processing variable: encoder.block.18.layer.0.SelfAttention.o.weight with shape: (1024, 1024)
Processing variable: encoder.block.18.layer.0.layer_norm.weight with shape: (1024,)
Processing variable: encoder.block.18.layer.1.DenseReluDense.wi.weight with shape: (4096, 1024)
Processing variable: encoder.block.18.layer.1.DenseReluDense.wo.weight with shape: (1024, 4096)
Processing variable: encoder.block.18.layer.1.layer_norm.weight with shape: (1024,)
Processing variable: encoder.block.19.layer.0.SelfAttention.q.weight with shape: (1024, 1024)
Processing variable: encoder.block.19.layer.0.SelfAttention.k.weight with shape: (1024, 1024)
Processing variable: encoder.block.19.layer.0.SelfAttention.v.weight with shape: (1024, 1024)
Processing variable: encoder.block.19.layer.0.SelfAttention.o.weight with shape: (1024, 1024)
Processing variable: encoder.block.19.layer.0.layer_norm.weight with shape: (1024,)
Processing variable: encoder.block.19.layer.1.DenseReluDense.wi.weight with shape: (4096, 1024)
Processing variable: encoder.block.19.layer.1.DenseReluDense.wo.weight with shape: (1024, 4096)
Processing variable: encoder.block.19.layer.1.layer_norm.weight with shape: (1024,)
Processing variable: encoder.block.20.layer.0.SelfAttention.q.weight with shape: (1024, 1024)
Processing variable: encoder.block.20.layer.0.SelfAttention.k.weight with shape: (1024, 1024)
Processing variable: encoder.block.20.layer.0.SelfAttention.v.weight with shape: (1024, 1024)
Processing variable: encoder.block.20.layer.0.SelfAttention.o.weight with shape: (1024, 1024)
Processing variable: encoder.block.20.layer.0.layer_norm.weight with shape: (1024,)
Processing variable: encoder.block.20.layer.1.DenseReluDense.wi.weight with shape: (4096, 1024)
Processing variable: encoder.block.20.layer.1.DenseReluDense.wo.weight with shape: (1024, 4096)
Processing variable: encoder.block.20.layer.1.layer_norm.weight with shape: (1024,)
Processing variable: encoder.block.21.layer.0.SelfAttention.q.weight with shape: (1024, 1024)
Processing variable: encoder.block.21.layer.0.SelfAttention.k.weight with shape: (1024, 1024)
Processing variable: encoder.block.21.layer.0.SelfAttention.v.weight with shape: (1024, 1024)
Processing variable: encoder.block.21.layer.0.SelfAttention.o.weight with shape: (1024, 1024)
Processing variable: encoder.block.21.layer.0.layer_norm.weight with shape: (1024,)
Processing variable: encoder.block.21.layer.1.DenseReluDense.wi.weight with shape: (4096, 1024)
Processing variable: encoder.block.21.layer.1.DenseReluDense.wo.weight with shape: (1024, 4096)
Processing variable: encoder.block.21.layer.1.layer_norm.weight with shape: (1024,)
Processing variable: encoder.block.22.layer.0.SelfAttention.q.weight with shape: (1024, 1024)
Processing variable: encoder.block.22.layer.0.SelfAttention.k.weight with shape: (1024, 1024)
Processing variable: encoder.block.22.layer.0.SelfAttention.v.weight with shape: (1024, 1024)
Processing variable: encoder.block.22.layer.0.SelfAttention.o.weight with shape: (1024, 1024)
Processing variable: encoder.block.22.layer.0.layer_norm.weight with shape: (1024,)
Processing variable: encoder.block.22.layer.1.DenseReluDense.wi.weight with shape: (4096, 1024)
Processing variable: encoder.block.22.layer.1.DenseReluDense.wo.weight with shape: (1024, 4096)
Processing variable: encoder.block.22.layer.1.layer_norm.weight with shape: (1024,)
Processing variable: encoder.block.23.layer.0.SelfAttention.q.weight with shape: (1024, 1024)
Processing variable: encoder.block.23.layer.0.SelfAttention.k.weight with shape: (1024, 1024)
Processing variable: encoder.block.23.layer.0.SelfAttention.v.weight with shape: (1024, 1024)
Processing variable: encoder.block.23.layer.0.SelfAttention.o.weight with shape: (1024, 1024)
Processing variable: encoder.block.23.layer.0.layer_norm.weight with shape: (1024,)
Processing variable: encoder.block.23.layer.1.DenseReluDense.wi.weight with shape: (4096, 1024)
Processing variable: encoder.block.23.layer.1.DenseReluDense.wo.weight with shape: (1024, 4096)
Processing variable: encoder.block.23.layer.1.layer_norm.weight with shape: (1024,)
Processing variable: encoder.final_layer_norm.weight with shape: (1024,)
Done.
Thanks!
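
A log like this can be checked mechanically rather than by eye. The sketch below parses the "Processing variable" entries and flags token-embedding tensors whose row count differs from the reported vocab size (here 32128 rows against a vocab of 42384). The function names are hypothetical, and treating such a difference as a warning sign is a heuristic assumption, not something convert_pt_to_ggml.py does itself.

```python
import re

# Patterns match the converter output quoted above, whether it is printed
# one entry per line or flattened into a single run of text.
ENTRY = re.compile(r"Processing\s+variable:\s+(\S+)\s+with\s+shape:\s+\(([^)]*)\)")
VOCAB = re.compile(r"Vocab\s+size:\s+(\d+)")


def parse_conversion_log(text):
    """Return (vocab_size, {tensor_name: shape_tuple}) parsed from the log text."""
    vocab_match = VOCAB.search(text)
    vocab_size = int(vocab_match.group(1)) if vocab_match else None
    shapes = {}
    for name, dims in ENTRY.findall(text):
        # "(1024,)" yields an empty trailing field, which is skipped.
        shapes[name] = tuple(int(d) for d in dims.split(",") if d.strip())
    return vocab_size, shapes


def embedding_mismatches(vocab_size, shapes):
    """List embed_tokens tensors whose first dimension differs from vocab_size."""
    return [
        (name, shape)
        for name, shape in shapes.items()
        if name.endswith("embed_tokens.weight") and shape[0] != vocab_size
    ]
```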

PABannier commented on July 21, 2024

Pass the relative path to your model weights with the -m flag.

./main -m ./my/path/ggml-model.bin -p "trastuzumab"

jonahkaye commented on July 21, 2024

It's loading and finding the model, so I don't think that is the issue...

PABannier commented on July 21, 2024

Yes, my bad. I looked at the log, and the weight file you downloaded is not the correct one. Yours has an encoder and a decoder part, while it should have only a decoder part. The correct files can be downloaded here.
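
The same diagnosis can be made from tensor names alone. This is a minimal sketch, assuming the names come from the converter log or from the keys of the loaded checkpoint; the prefixes are the ones visible in the failing log above, and `encoder_style_tensors` is a hypothetical helper, not part of the repository.

```python
# Prefixes seen in the failing conversion log; the working log instead shows
# names such as "biogpt.embed_tokens.weight" and "biogpt.layers.0...".
SUSPECT_PREFIXES = ("encoder.", "shared.")


def encoder_style_tensors(tensor_names):
    """Return the names that look like they belong to an encoder-style checkpoint."""
    return [name for name in tensor_names if name.startswith(SUSPECT_PREFIXES)]
```

An empty result does not prove the checkpoint is right, but a non-empty one suggests the wrong model was downloaded.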

Make sure you download all these files: merges.txt, pytorch_model.bin, vocab.json and config.json.
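
Before running the conversion, the presence of these four files can be verified with a few lines. The sketch below is an illustration only: `missing_model_files` is a hypothetical helper, and the directory passed in is whatever folder the files were downloaded to.

```python
from pathlib import Path

# The four files listed above.
REQUIRED_FILES = ("merges.txt", "pytorch_model.bin", "vocab.json", "config.json")


def missing_model_files(model_dir):
    """Return the required file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]
```

An empty list means all four files are in place.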

In any case, I'll make the download instructions clearer.

jonahkaye commented on July 21, 2024

Hey @PABannier! So that got it to work. The only issue is that the output content quality seems pretty low right now. For example:

(venv) jonahkaye@Jonahs-MacBook-Air biogpt.cpp % ./main -p "trastuzumab"
main: seed = 1684414205
biogpt_model_load: loading model from './ggml_weights/ggml-model.bin'
biogpt_model_load: n_vocab       = 42384
biogpt_model_load: d_ff          = 4096
biogpt_model_load: d_model       = 1024
biogpt_model_load: n_positions   = 1024
biogpt_model_load: n_head        = 16
biogpt_model_load: n_layer       = 24
biogpt_model_load: ftype         = 0
main: prompt: 'trastuzumab'
main: number of tokens in prompt = 2, first 8 tokens: 2 16503 

trastuzumab - associated cardiac toxicity has not been clearly defined .  Here we examined cardiac function of patients receiving trastuzumab therapy .  METHODS : A retrospective chart review of all patients who received trastuzumab therapy for breast cancer at our institution from 2005 to 2015 was conducted .  Cardiac function was assessed before treatment , at the completion of trastuzumab therapy , and during a follow - up visit .  RESULTS : A total of 45 patients were included in the study .  The median age of the patients was 59 years .  Twenty - five patients ( 56 % ) received concurrent chemotherapy and trastuzumab .  Of the 45 patients , 34 ( 75 % ) received concomitant anthracyclines .  The median dose of trastuzumab received was 6 mg / kg ( range , 5 - 10 mg / kg ) .  After a median follow - up of 12 months , only 3 patients ( 7 % ) showed worsening of cardiac function .  Two patients with grade 1 cardiac dysfunction received dose reduction .  The third patient with grade 1 cardiac dysfunction received no dose reduction .

That doesn't look like the output you were getting, both in terms of the content and in terms of what is being printed out. So I am concerned that I might still not be ending up with the same ggml file as you.

I want to confirm that I am using the right binary. Here is the shasum of the one I am using:
76e0d0e14ed9c1f3fb6837f7a7498f0024b01dd9 weights/Pre-trained-BioGPT/pytorch_model.bin
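
For anyone comparing checksums without the shasum tool, here is a small sketch. It assumes the digest above is SHA-1, which matches its 40 hexadecimal characters and the default algorithm of shasum; `sha1_of_file` is a hypothetical helper name.

```python
import hashlib


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the SHA-1 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Reading in chunks keeps memory use flat even for a multi-gigabyte pytorch_model.bin.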

And also that the ggml conversion looks right:

Vocab size: 42384 BPE merges size: 40000 Processing variable: biogpt.embed_tokens.weight with shape: (42384, 1024) Processing variable: biogpt.embed_positions.weight with shape: (1026, 1024) Processing variable: biogpt.layers.0.self_attn.k_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.0.self_attn.k_proj.bias with shape: (1024,) Processing variable: biogpt.layers.0.self_attn.v_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.0.self_attn.v_proj.bias with shape: (1024,) Processing variable: biogpt.layers.0.self_attn.q_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.0.self_attn.q_proj.bias with shape: (1024,) Processing variable: biogpt.layers.0.self_attn.out_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.0.self_attn.out_proj.bias with shape: (1024,) Processing variable: biogpt.layers.0.self_attn_layer_norm.weight with shape: (1024,) Processing variable: biogpt.layers.0.self_attn_layer_norm.bias with shape: (1024,) Processing variable: biogpt.layers.0.fc1.weight with shape: (4096, 1024) Processing variable: biogpt.layers.0.fc1.bias with shape: (4096,) Processing variable: biogpt.layers.0.fc2.weight with shape: (1024, 4096) Processing variable: biogpt.layers.0.fc2.bias with shape: (1024,) Processing variable: biogpt.layers.0.final_layer_norm.weight with shape: (1024,) Processing variable: biogpt.layers.0.final_layer_norm.bias with shape: (1024,) Processing variable: biogpt.layers.1.self_attn.k_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.1.self_attn.k_proj.bias with shape: (1024,) Processing variable: biogpt.layers.1.self_attn.v_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.1.self_attn.v_proj.bias with shape: (1024,) Processing variable: biogpt.layers.1.self_attn.q_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.1.self_attn.q_proj.bias with shape: (1024,) Processing variable: 
biogpt.layers.1.self_attn.out_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.1.self_attn.out_proj.bias with shape: (1024,) Processing variable: biogpt.layers.1.self_attn_layer_norm.weight with shape: (1024,) Processing variable: biogpt.layers.1.self_attn_layer_norm.bias with shape: (1024,) Processing variable: biogpt.layers.1.fc1.weight with shape: (4096, 1024) Processing variable: biogpt.layers.1.fc1.bias with shape: (4096,) Processing variable: biogpt.layers.1.fc2.weight with shape: (1024, 4096) Processing variable: biogpt.layers.1.fc2.bias with shape: (1024,) Processing variable: biogpt.layers.1.final_layer_norm.weight with shape: (1024,) Processing variable: biogpt.layers.1.final_layer_norm.bias with shape: (1024,) Processing variable: biogpt.layers.2.self_attn.k_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.2.self_attn.k_proj.bias with shape: (1024,) Processing variable: biogpt.layers.2.self_attn.v_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.2.self_attn.v_proj.bias with shape: (1024,) Processing variable: biogpt.layers.2.self_attn.q_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.2.self_attn.q_proj.bias with shape: (1024,) Processing variable: biogpt.layers.2.self_attn.out_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.2.self_attn.out_proj.bias with shape: (1024,) Processing variable: biogpt.layers.2.self_attn_layer_norm.weight with shape: (1024,) Processing variable: biogpt.layers.2.self_attn_layer_norm.bias with shape: (1024,) Processing variable: biogpt.layers.2.fc1.weight with shape: (4096, 1024) Processing variable: biogpt.layers.2.fc1.bias with shape: (4096,) Processing variable: biogpt.layers.2.fc2.weight with shape: (1024, 4096) Processing variable: biogpt.layers.2.fc2.bias with shape: (1024,) Processing variable: biogpt.layers.2.final_layer_norm.weight with shape: (1024,) Processing variable: 
biogpt.layers.2.final_layer_norm.bias with shape: (1024,) Processing variable: biogpt.layers.3.self_attn.k_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.3.self_attn.k_proj.bias with shape: (1024,) Processing variable: biogpt.layers.3.self_attn.v_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.3.self_attn.v_proj.bias with shape: (1024,) Processing variable: biogpt.layers.3.self_attn.q_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.3.self_attn.q_proj.bias with shape: (1024,) Processing variable: biogpt.layers.3.self_attn.out_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.3.self_attn.out_proj.bias with shape: (1024,) Processing variable: biogpt.layers.3.self_attn_layer_norm.weight with shape: (1024,) Processing variable: biogpt.layers.3.self_attn_layer_norm.bias with shape: (1024,) Processing variable: biogpt.layers.3.fc1.weight with shape: (4096, 1024) Processing variable: biogpt.layers.3.fc1.bias with shape: (4096,) Processing variable: biogpt.layers.3.fc2.weight with shape: (1024, 4096) Processing variable: biogpt.layers.3.fc2.bias with shape: (1024,) Processing variable: biogpt.layers.3.final_layer_norm.weight with shape: (1024,) Processing variable: biogpt.layers.3.final_layer_norm.bias with shape: (1024,) Processing variable: biogpt.layers.4.self_attn.k_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.4.self_attn.k_proj.bias with shape: (1024,) Processing variable: biogpt.layers.4.self_attn.v_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.4.self_attn.v_proj.bias with shape: (1024,) Processing variable: biogpt.layers.4.self_attn.q_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.4.self_attn.q_proj.bias with shape: (1024,) Processing variable: biogpt.layers.4.self_attn.out_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.4.self_attn.out_proj.bias with shape: (1024,) 
Processing variable: biogpt.layers.4.self_attn_layer_norm.weight with shape: (1024,) Processing variable: biogpt.layers.4.self_attn_layer_norm.bias with shape: (1024,) Processing variable: biogpt.layers.4.fc1.weight with shape: (4096, 1024) Processing variable: biogpt.layers.4.fc1.bias with shape: (4096,) Processing variable: biogpt.layers.4.fc2.weight with shape: (1024, 4096) Processing variable: biogpt.layers.4.fc2.bias with shape: (1024,) Processing variable: biogpt.layers.4.final_layer_norm.weight with shape: (1024,) Processing variable: biogpt.layers.4.final_layer_norm.bias with shape: (1024,) Processing variable: biogpt.layers.5.self_attn.k_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.5.self_attn.k_proj.bias with shape: (1024,) Processing variable: biogpt.layers.5.self_attn.v_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.5.self_attn.v_proj.bias with shape: (1024,) Processing variable: biogpt.layers.5.self_attn.q_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.5.self_attn.q_proj.bias with shape: (1024,) Processing variable: biogpt.layers.5.self_attn.out_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.5.self_attn.out_proj.bias with shape: (1024,) Processing variable: biogpt.layers.5.self_attn_layer_norm.weight with shape: (1024,) Processing variable: biogpt.layers.5.self_attn_layer_norm.bias with shape: (1024,) Processing variable: biogpt.layers.5.fc1.weight with shape: (4096, 1024) Processing variable: biogpt.layers.5.fc1.bias with shape: (4096,) Processing variable: biogpt.layers.5.fc2.weight with shape: (1024, 4096) Processing variable: biogpt.layers.5.fc2.bias with shape: (1024,) Processing variable: biogpt.layers.5.final_layer_norm.weight with shape: (1024,) Processing variable: biogpt.layers.5.final_layer_norm.bias with shape: (1024,) Processing variable: biogpt.layers.6.self_attn.k_proj.weight with shape: (1024, 1024) Processing variable: 
biogpt.layers.6.self_attn.k_proj.bias with shape: (1024,) Processing variable: biogpt.layers.6.self_attn.v_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.6.self_attn.v_proj.bias with shape: (1024,) Processing variable: biogpt.layers.6.self_attn.q_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.6.self_attn.q_proj.bias with shape: (1024,) Processing variable: biogpt.layers.6.self_attn.out_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.6.self_attn.out_proj.bias with shape: (1024,) Processing variable: biogpt.layers.6.self_attn_layer_norm.weight with shape: (1024,) Processing variable: biogpt.layers.6.self_attn_layer_norm.bias with shape: (1024,) Processing variable: biogpt.layers.6.fc1.weight with shape: (4096, 1024) Processing variable: biogpt.layers.6.fc1.bias with shape: (4096,) Processing variable: biogpt.layers.6.fc2.weight with shape: (1024, 4096) Processing variable: biogpt.layers.6.fc2.bias with shape: (1024,) Processing variable: biogpt.layers.6.final_layer_norm.weight with shape: (1024,) Processing variable: biogpt.layers.6.final_layer_norm.bias with shape: (1024,) Processing variable: biogpt.layers.7.self_attn.k_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.7.self_attn.k_proj.bias with shape: (1024,) Processing variable: biogpt.layers.7.self_attn.v_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.7.self_attn.v_proj.bias with shape: (1024,) Processing variable: biogpt.layers.7.self_attn.q_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.7.self_attn.q_proj.bias with shape: (1024,) Processing variable: biogpt.layers.7.self_attn.out_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.7.self_attn.out_proj.bias with shape: (1024,) Processing variable: biogpt.layers.7.self_attn_layer_norm.weight with shape: (1024,) Processing variable: biogpt.layers.7.self_attn_layer_norm.bias with shape: (1024,) 
Processing variable: biogpt.layers.7.fc1.weight with shape: (4096, 1024) Processing variable: biogpt.layers.7.fc1.bias with shape: (4096,) Processing variable: biogpt.layers.7.fc2.weight with shape: (1024, 4096) Processing variable: biogpt.layers.7.fc2.bias with shape: (1024,) Processing variable: biogpt.layers.7.final_layer_norm.weight with shape: (1024,) Processing variable: biogpt.layers.7.final_layer_norm.bias with shape: (1024,) Processing variable: biogpt.layers.8.self_attn.k_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.8.self_attn.k_proj.bias with shape: (1024,) Processing variable: biogpt.layers.8.self_attn.v_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.8.self_attn.v_proj.bias with shape: (1024,) Processing variable: biogpt.layers.8.self_attn.q_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.8.self_attn.q_proj.bias with shape: (1024,) Processing variable: biogpt.layers.8.self_attn.out_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.8.self_attn.out_proj.bias with shape: (1024,) Processing variable: biogpt.layers.8.self_attn_layer_norm.weight with shape: (1024,) Processing variable: biogpt.layers.8.self_attn_layer_norm.bias with shape: (1024,) Processing variable: biogpt.layers.8.fc1.weight with shape: (4096, 1024) Processing variable: biogpt.layers.8.fc1.bias with shape: (4096,) Processing variable: biogpt.layers.8.fc2.weight with shape: (1024, 4096) Processing variable: biogpt.layers.8.fc2.bias with shape: (1024,) Processing variable: biogpt.layers.8.final_layer_norm.weight with shape: (1024,) Processing variable: biogpt.layers.8.final_layer_norm.bias with shape: (1024,) Processing variable: biogpt.layers.9.self_attn.k_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.9.self_attn.k_proj.bias with shape: (1024,) Processing variable: biogpt.layers.9.self_attn.v_proj.weight with shape: (1024, 1024) Processing variable: 
biogpt.layers.9.self_attn.v_proj.bias with shape: (1024,) Processing variable: biogpt.layers.9.self_attn.q_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.9.self_attn.q_proj.bias with shape: (1024,) Processing variable: biogpt.layers.9.self_attn.out_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.9.self_attn.out_proj.bias with shape: (1024,) Processing variable: biogpt.layers.9.self_attn_layer_norm.weight with shape: (1024,) Processing variable: biogpt.layers.9.self_attn_layer_norm.bias with shape: (1024,) Processing variable: biogpt.layers.9.fc1.weight with shape: (4096, 1024) Processing variable: biogpt.layers.9.fc1.bias with shape: (4096,) Processing variable: biogpt.layers.9.fc2.weight with shape: (1024, 4096) Processing variable: biogpt.layers.9.fc2.bias with shape: (1024,) Processing variable: biogpt.layers.9.final_layer_norm.weight with shape: (1024,) Processing variable: biogpt.layers.9.final_layer_norm.bias with shape: (1024,) Processing variable: biogpt.layers.10.self_attn.k_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.10.self_attn.k_proj.bias with shape: (1024,) Processing variable: biogpt.layers.10.self_attn.v_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.10.self_attn.v_proj.bias with shape: (1024,) Processing variable: biogpt.layers.10.self_attn.q_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.10.self_attn.q_proj.bias with shape: (1024,) Processing variable: biogpt.layers.10.self_attn.out_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.10.self_attn.out_proj.bias with shape: (1024,) Processing variable: biogpt.layers.10.self_attn_layer_norm.weight with shape: (1024,) Processing variable: biogpt.layers.10.self_attn_layer_norm.bias with shape: (1024,) Processing variable: biogpt.layers.10.fc1.weight with shape: (4096, 1024) Processing variable: biogpt.layers.10.fc1.bias with shape: (4096,) Processing 
variable: biogpt.layers.10.fc2.weight with shape: (1024, 4096) Processing variable: biogpt.layers.10.fc2.bias with shape: (1024,) Processing variable: biogpt.layers.10.final_layer_norm.weight with shape: (1024,) Processing variable: biogpt.layers.10.final_layer_norm.bias with shape: (1024,) Processing variable: biogpt.layers.11.self_attn.k_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.11.self_attn.k_proj.bias with shape: (1024,) Processing variable: biogpt.layers.11.self_attn.v_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.11.self_attn.v_proj.bias with shape: (1024,) Processing variable: biogpt.layers.11.self_attn.q_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.11.self_attn.q_proj.bias with shape: (1024,) Processing variable: biogpt.layers.11.self_attn.out_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.11.self_attn.out_proj.bias with shape: (1024,) Processing variable: biogpt.layers.11.self_attn_layer_norm.weight with shape: (1024,) Processing variable: biogpt.layers.11.self_attn_layer_norm.bias with shape: (1024,) Processing variable: biogpt.layers.11.fc1.weight with shape: (4096, 1024) Processing variable: biogpt.layers.11.fc1.bias with shape: (4096,) Processing variable: biogpt.layers.11.fc2.weight with shape: (1024, 4096) Processing variable: biogpt.layers.11.fc2.bias with shape: (1024,) Processing variable: biogpt.layers.11.final_layer_norm.weight with shape: (1024,) Processing variable: biogpt.layers.11.final_layer_norm.bias with shape: (1024,) Processing variable: biogpt.layers.12.self_attn.k_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.12.self_attn.k_proj.bias with shape: (1024,) Processing variable: biogpt.layers.12.self_attn.v_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.12.self_attn.v_proj.bias with shape: (1024,) Processing variable: biogpt.layers.12.self_attn.q_proj.weight with shape: (1024, 1024) 
Processing variable: biogpt.layers.12.self_attn.q_proj.bias with shape: (1024,) Processing variable: biogpt.layers.12.self_attn.out_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.12.self_attn.out_proj.bias with shape: (1024,) Processing variable: biogpt.layers.12.self_attn_layer_norm.weight with shape: (1024,) Processing variable: biogpt.layers.12.self_attn_layer_norm.bias with shape: (1024,) Processing variable: biogpt.layers.12.fc1.weight with shape: (4096, 1024) Processing variable: biogpt.layers.12.fc1.bias with shape: (4096,) Processing variable: biogpt.layers.12.fc2.weight with shape: (1024, 4096) Processing variable: biogpt.layers.12.fc2.bias with shape: (1024,) Processing variable: biogpt.layers.12.final_layer_norm.weight with shape: (1024,) Processing variable: biogpt.layers.12.final_layer_norm.bias with shape: (1024,) Processing variable: biogpt.layers.13.self_attn.k_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.13.self_attn.k_proj.bias with shape: (1024,) Processing variable: biogpt.layers.13.self_attn.v_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.13.self_attn.v_proj.bias with shape: (1024,) Processing variable: biogpt.layers.13.self_attn.q_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.13.self_attn.q_proj.bias with shape: (1024,) Processing variable: biogpt.layers.13.self_attn.out_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.13.self_attn.out_proj.bias with shape: (1024,) Processing variable: biogpt.layers.13.self_attn_layer_norm.weight with shape: (1024,) Processing variable: biogpt.layers.13.self_attn_layer_norm.bias with shape: (1024,) Processing variable: biogpt.layers.13.fc1.weight with shape: (4096, 1024) Processing variable: biogpt.layers.13.fc1.bias with shape: (4096,) Processing variable: biogpt.layers.13.fc2.weight with shape: (1024, 4096) Processing variable: biogpt.layers.13.fc2.bias with shape: (1024,) Processing 
variable: biogpt.layers.13.final_layer_norm.weight with shape: (1024,) Processing variable: biogpt.layers.13.final_layer_norm.bias with shape: (1024,) Processing variable: biogpt.layers.14.self_attn.k_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.14.self_attn.k_proj.bias with shape: (1024,) Processing variable: biogpt.layers.14.self_attn.v_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.14.self_attn.v_proj.bias with shape: (1024,) Processing variable: biogpt.layers.14.self_attn.q_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.14.self_attn.q_proj.bias with shape: (1024,) Processing variable: biogpt.layers.14.self_attn.out_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.14.self_attn.out_proj.bias with shape: (1024,) Processing variable: biogpt.layers.14.self_attn_layer_norm.weight with shape: (1024,) Processing variable: biogpt.layers.14.self_attn_layer_norm.bias with shape: (1024,) Processing variable: biogpt.layers.14.fc1.weight with shape: (4096, 1024) Processing variable: biogpt.layers.14.fc1.bias with shape: (4096,) Processing variable: biogpt.layers.14.fc2.weight with shape: (1024, 4096) Processing variable: biogpt.layers.14.fc2.bias with shape: (1024,) Processing variable: biogpt.layers.14.final_layer_norm.weight with shape: (1024,) Processing variable: biogpt.layers.14.final_layer_norm.bias with shape: (1024,) Processing variable: biogpt.layers.15.self_attn.k_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.15.self_attn.k_proj.bias with shape: (1024,) Processing variable: biogpt.layers.15.self_attn.v_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.15.self_attn.v_proj.bias with shape: (1024,) Processing variable: biogpt.layers.15.self_attn.q_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.15.self_attn.q_proj.bias with shape: (1024,) Processing variable: 
biogpt.layers.15.self_attn.out_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.15.self_attn.out_proj.bias with shape: (1024,) Processing variable: biogpt.layers.15.self_attn_layer_norm.weight with shape: (1024,) Processing variable: biogpt.layers.15.self_attn_layer_norm.bias with shape: (1024,) Processing variable: biogpt.layers.15.fc1.weight with shape: (4096, 1024) Processing variable: biogpt.layers.15.fc1.bias with shape: (4096,) Processing variable: biogpt.layers.15.fc2.weight with shape: (1024, 4096) Processing variable: biogpt.layers.15.fc2.bias with shape: (1024,) Processing variable: biogpt.layers.15.final_layer_norm.weight with shape: (1024,) Processing variable: biogpt.layers.15.final_layer_norm.bias with shape: (1024,) Processing variable: biogpt.layers.16.self_attn.k_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.16.self_attn.k_proj.bias with shape: (1024,) Processing variable: biogpt.layers.16.self_attn.v_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.16.self_attn.v_proj.bias with shape: (1024,) Processing variable: biogpt.layers.16.self_attn.q_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.16.self_attn.q_proj.bias with shape: (1024,) Processing variable: biogpt.layers.16.self_attn.out_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.16.self_attn.out_proj.bias with shape: (1024,) Processing variable: biogpt.layers.16.self_attn_layer_norm.weight with shape: (1024,) Processing variable: biogpt.layers.16.self_attn_layer_norm.bias with shape: (1024,) Processing variable: biogpt.layers.16.fc1.weight with shape: (4096, 1024) Processing variable: biogpt.layers.16.fc1.bias with shape: (4096,) Processing variable: biogpt.layers.16.fc2.weight with shape: (1024, 4096) Processing variable: biogpt.layers.16.fc2.bias with shape: (1024,) Processing variable: biogpt.layers.16.final_layer_norm.weight with shape: (1024,) Processing variable: 
biogpt.layers.16.final_layer_norm.bias with shape: (1024,) Processing variable: biogpt.layers.17.self_attn.k_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.17.self_attn.k_proj.bias with shape: (1024,) Processing variable: biogpt.layers.17.self_attn.v_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.17.self_attn.v_proj.bias with shape: (1024,) Processing variable: biogpt.layers.17.self_attn.q_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.17.self_attn.q_proj.bias with shape: (1024,) Processing variable: biogpt.layers.17.self_attn.out_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.17.self_attn.out_proj.bias with shape: (1024,) Processing variable: biogpt.layers.17.self_attn_layer_norm.weight with shape: (1024,) Processing variable: biogpt.layers.17.self_attn_layer_norm.bias with shape: (1024,) Processing variable: biogpt.layers.17.fc1.weight with shape: (4096, 1024) Processing variable: biogpt.layers.17.fc1.bias with shape: (4096,) Processing variable: biogpt.layers.17.fc2.weight with shape: (1024, 4096) Processing variable: biogpt.layers.17.fc2.bias with shape: (1024,) Processing variable: biogpt.layers.17.final_layer_norm.weight with shape: (1024,) Processing variable: biogpt.layers.17.final_layer_norm.bias with shape: (1024,) Processing variable: biogpt.layers.18.self_attn.k_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.18.self_attn.k_proj.bias with shape: (1024,) Processing variable: biogpt.layers.18.self_attn.v_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.18.self_attn.v_proj.bias with shape: (1024,) Processing variable: biogpt.layers.18.self_attn.q_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.18.self_attn.q_proj.bias with shape: (1024,) Processing variable: biogpt.layers.18.self_attn.out_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.18.self_attn.out_proj.bias 
with shape: (1024,) Processing variable: biogpt.layers.18.self_attn_layer_norm.weight with shape: (1024,) Processing variable: biogpt.layers.18.self_attn_layer_norm.bias with shape: (1024,) Processing variable: biogpt.layers.18.fc1.weight with shape: (4096, 1024) Processing variable: biogpt.layers.18.fc1.bias with shape: (4096,) Processing variable: biogpt.layers.18.fc2.weight with shape: (1024, 4096) Processing variable: biogpt.layers.18.fc2.bias with shape: (1024,) Processing variable: biogpt.layers.18.final_layer_norm.weight with shape: (1024,) Processing variable: biogpt.layers.18.final_layer_norm.bias with shape: (1024,) Processing variable: biogpt.layers.19.self_attn.k_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.19.self_attn.k_proj.bias with shape: (1024,) Processing variable: biogpt.layers.19.self_attn.v_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.19.self_attn.v_proj.bias with shape: (1024,) Processing variable: biogpt.layers.19.self_attn.q_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.19.self_attn.q_proj.bias with shape: (1024,) Processing variable: biogpt.layers.19.self_attn.out_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.19.self_attn.out_proj.bias with shape: (1024,) Processing variable: biogpt.layers.19.self_attn_layer_norm.weight with shape: (1024,) Processing variable: biogpt.layers.19.self_attn_layer_norm.bias with shape: (1024,) Processing variable: biogpt.layers.19.fc1.weight with shape: (4096, 1024) Processing variable: biogpt.layers.19.fc1.bias with shape: (4096,) Processing variable: biogpt.layers.19.fc2.weight with shape: (1024, 4096) Processing variable: biogpt.layers.19.fc2.bias with shape: (1024,) Processing variable: biogpt.layers.19.final_layer_norm.weight with shape: (1024,) Processing variable: biogpt.layers.19.final_layer_norm.bias with shape: (1024,) Processing variable: biogpt.layers.20.self_attn.k_proj.weight with shape: 
(1024, 1024) Processing variable: biogpt.layers.20.self_attn.k_proj.bias with shape: (1024,) Processing variable: biogpt.layers.20.self_attn.v_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.20.self_attn.v_proj.bias with shape: (1024,) Processing variable: biogpt.layers.20.self_attn.q_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.20.self_attn.q_proj.bias with shape: (1024,) Processing variable: biogpt.layers.20.self_attn.out_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.20.self_attn.out_proj.bias with shape: (1024,) Processing variable: biogpt.layers.20.self_attn_layer_norm.weight with shape: (1024,) Processing variable: biogpt.layers.20.self_attn_layer_norm.bias with shape: (1024,) Processing variable: biogpt.layers.20.fc1.weight with shape: (4096, 1024) Processing variable: biogpt.layers.20.fc1.bias with shape: (4096,) Processing variable: biogpt.layers.20.fc2.weight with shape: (1024, 4096) Processing variable: biogpt.layers.20.fc2.bias with shape: (1024,) Processing variable: biogpt.layers.20.final_layer_norm.weight with shape: (1024,) Processing variable: biogpt.layers.20.final_layer_norm.bias with shape: (1024,) Processing variable: biogpt.layers.21.self_attn.k_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.21.self_attn.k_proj.bias with shape: (1024,) Processing variable: biogpt.layers.21.self_attn.v_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.21.self_attn.v_proj.bias with shape: (1024,) Processing variable: biogpt.layers.21.self_attn.q_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.21.self_attn.q_proj.bias with shape: (1024,) Processing variable: biogpt.layers.21.self_attn.out_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.21.self_attn.out_proj.bias with shape: (1024,) Processing variable: biogpt.layers.21.self_attn_layer_norm.weight with shape: (1024,) Processing variable: 
biogpt.layers.21.self_attn_layer_norm.bias with shape: (1024,) Processing variable: biogpt.layers.21.fc1.weight with shape: (4096, 1024) Processing variable: biogpt.layers.21.fc1.bias with shape: (4096,) Processing variable: biogpt.layers.21.fc2.weight with shape: (1024, 4096) Processing variable: biogpt.layers.21.fc2.bias with shape: (1024,) Processing variable: biogpt.layers.21.final_layer_norm.weight with shape: (1024,) Processing variable: biogpt.layers.21.final_layer_norm.bias with shape: (1024,) Processing variable: biogpt.layers.22.self_attn.k_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.22.self_attn.k_proj.bias with shape: (1024,) Processing variable: biogpt.layers.22.self_attn.v_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.22.self_attn.v_proj.bias with shape: (1024,) Processing variable: biogpt.layers.22.self_attn.q_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.22.self_attn.q_proj.bias with shape: (1024,) Processing variable: biogpt.layers.22.self_attn.out_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.22.self_attn.out_proj.bias with shape: (1024,) Processing variable: biogpt.layers.22.self_attn_layer_norm.weight with shape: (1024,) Processing variable: biogpt.layers.22.self_attn_layer_norm.bias with shape: (1024,) Processing variable: biogpt.layers.22.fc1.weight with shape: (4096, 1024) Processing variable: biogpt.layers.22.fc1.bias with shape: (4096,) Processing variable: biogpt.layers.22.fc2.weight with shape: (1024, 4096) Processing variable: biogpt.layers.22.fc2.bias with shape: (1024,) Processing variable: biogpt.layers.22.final_layer_norm.weight with shape: (1024,) Processing variable: biogpt.layers.22.final_layer_norm.bias with shape: (1024,) Processing variable: biogpt.layers.23.self_attn.k_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.23.self_attn.k_proj.bias with shape: (1024,) Processing variable: 
biogpt.layers.23.self_attn.v_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.23.self_attn.v_proj.bias with shape: (1024,) Processing variable: biogpt.layers.23.self_attn.q_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.23.self_attn.q_proj.bias with shape: (1024,) Processing variable: biogpt.layers.23.self_attn.out_proj.weight with shape: (1024, 1024) Processing variable: biogpt.layers.23.self_attn.out_proj.bias with shape: (1024,) Processing variable: biogpt.layers.23.self_attn_layer_norm.weight with shape: (1024,) Processing variable: biogpt.layers.23.self_attn_layer_norm.bias with shape: (1024,) Processing variable: biogpt.layers.23.fc1.weight with shape: (4096, 1024) Processing variable: biogpt.layers.23.fc1.bias with shape: (4096,) Processing variable: biogpt.layers.23.fc2.weight with shape: (1024, 4096) Processing variable: biogpt.layers.23.fc2.bias with shape: (1024,) Processing variable: biogpt.layers.23.final_layer_norm.weight with shape: (1024,) Processing variable: biogpt.layers.23.final_layer_norm.bias with shape: (1024,) Processing variable: biogpt.layer_norm.weight with shape: (1024,) Processing variable: biogpt.layer_norm.bias with shape: (1024,) Processing variable: output_projection.weight with shape: (42384, 1024) Done.

Thanks again for all your help with this!

from biogpt.cpp.
