biogpt.cpp's Introduction

biogpt.cpp

Inference of BioGPT model in pure C/C++.

Description

The main goal of biogpt.cpp is to run the BioGPT model with 4-bit quantization on a MacBook. It relies on the ggml library, which is also used by llama.cpp and whisper.cpp.
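To make the idea of 4-bit quantization concrete, here is a minimal sketch of block-wise quantization. It is an illustration only: the `Block4` struct, the block size of 32, and the function names are assumptions made for this example and do not reproduce ggml's actual data layout or code.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical block layout for illustration; not ggml's actual struct.
constexpr int kBlockSize = 32;

struct Block4 {
    float scale;                         // one scale shared by the whole block
    std::uint8_t packed[kBlockSize / 2]; // two 4-bit values per byte
};

// Quantize floats to 4-bit blocks. The input length must be a multiple of kBlockSize.
std::vector<Block4> quantize_4bit(const std::vector<float>& x) {
    assert(x.size() % kBlockSize == 0);
    std::vector<Block4> out(x.size() / kBlockSize);
    for (std::size_t b = 0; b < out.size(); ++b) {
        const float* src = x.data() + b * kBlockSize;
        float amax = 0.0f;
        for (int i = 0; i < kBlockSize; ++i) amax = std::max(amax, std::fabs(src[i]));
        const float scale = amax / 7.0f;                  // map [-amax, amax] onto [-7, 7]
        const float inv = scale > 0.0f ? 1.0f / scale : 0.0f;
        out[b].scale = scale;
        for (int i = 0; i < kBlockSize; i += 2) {
            // Offset by 8 so each value fits in an unsigned nibble (1..15).
            const int q0 = static_cast<int>(std::lround(src[i] * inv)) + 8;
            const int q1 = static_cast<int>(std::lround(src[i + 1] * inv)) + 8;
            out[b].packed[i / 2] = static_cast<std::uint8_t>(q0 | (q1 << 4));
        }
    }
    return out;
}

// Recover approximate floats from the 4-bit blocks.
std::vector<float> dequantize_4bit(const std::vector<Block4>& blocks) {
    std::vector<float> x(blocks.size() * kBlockSize);
    for (std::size_t b = 0; b < blocks.size(); ++b) {
        for (int i = 0; i < kBlockSize; i += 2) {
            const std::uint8_t byte = blocks[b].packed[i / 2];
            x[b * kBlockSize + i] = (static_cast<int>(byte & 0x0F) - 8) * blocks[b].scale;
            x[b * kBlockSize + i + 1] = (static_cast<int>(byte >> 4) - 8) * blocks[b].scale;
        }
    }
    return x;
}
```

In this sketch each block of 32 weights takes 20 bytes (a 4-byte scale plus 16 packed bytes) instead of 128 bytes as 32-bit floats, and each recovered value differs from the original by at most half the block's scale.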


Here is a typical run using BioGPT:

$ ./main -p "trastuzumab"
main: seed = 1684061910
biogpt_model_load: loading model from './ggml_weights/ggml-model.bin'
biogpt_model_load: n_vocab       = 42384
biogpt_model_load: d_ff          = 4096
biogpt_model_load: d_model       = 1024
biogpt_model_load: n_positions   = 1024
biogpt_model_load: n_head        = 16
biogpt_model_load: n_layer       = 24
biogpt_model_load: f16           = 0
biogpt_model_load: ggml ctx size = 1888.36 MB
biogpt_model_load: memory size =   192.00 MB, n_mem = 24576
biogpt_model_load: model size    = 1488.36 MB
main: prompt: 'Trastuzumab'
main: number of tokens in prompt = 4, first 8 tokens: 2 7548 1171 32924

Trastuzumab (Herceptin) is the first-line treatment for HER2-positive breast cancer and is the only agent approved by the
US Food and Drug Administration for the treatment of HER2-positive metastatic breast cancer. In the US, approximately 20 %
of patients with HER2-positive metastatic breast cancer fail to achieve response to first-line treatment with trastuzumab.
This article discusses the mechanisms of trastuzumab resistance , strategies for overcoming trastuzumab resistance, and the
potential role of other targeted therapies. New treatment options for multiple myeloma. The past 2 years have seen
significant advances in the treatment of multiple myeloma, particularly with the introduction of novel agents, particularly
the proteasome inhibitors and immunomodulatory drugs. These new agents are more effective and are associated with fewer
side effects than the older drugs. Their use has improved survival, with recent clinical trials evaluating combination
therapies with novel agents. However, their role in the treatment of multiple myeloma remains unclear and remains to be
evaluated in future clinical trials.

main: mem per token =  4911704 bytes
main:     load time =   456.57 ms
main:   sample time =    23.32 ms
main:  predict time =  4140.06 ms / 20.39 ms per token
main:    total time =  4672.20 ms
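The `memory size` and `n_mem` values in the log above can be reproduced with a little arithmetic. The sketch below assumes that this memory holds one key tensor and one value tensor for every layer and position, stored as 32-bit floats; that assumption is consistent with the logged numbers but is not taken from the project's code, and `kv_cache_bytes` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstdint>

// Bytes needed to hold keys and values for every layer and position.
// Assumption: two tensors (K and V), each with n_layer * n_positions * d_model elements.
std::int64_t kv_cache_bytes(std::int64_t n_layer, std::int64_t n_positions,
                            std::int64_t d_model, std::int64_t bytes_per_value) {
    assert(n_layer > 0 && n_positions > 0 && d_model > 0 && bytes_per_value > 0);
    const std::int64_t n_mem = n_layer * n_positions; // 24 * 1024 = 24576 in the log
    return 2 * n_mem * d_model * bytes_per_value;
}
```

With the logged hyperparameters (`n_layer = 24`, `n_positions = 1024`, `d_model = 1024`) and 4 bytes per value, this gives 201326592 bytes, which is exactly 192 MB.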

Memory requirements and speed

The inference speeds that I get for the different quantized models on my 16GB MacBook M1 Pro are as follows:

Model      Size    Time / Token
Original   1.5G    20 ms
Q4_0       240M    8 ms
Q4_1       286M    9 ms
Q5_0       265M    10 ms
Q5_1       288M    11 ms
Q8_0       432M    10 ms
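The per-token times above are easier to compare as throughput and relative speedup. The helpers below are a small sketch for that conversion; the function names are hypothetical and the only inputs are the timings listed in the table.

```cpp
#include <cassert>

// Convert a per-token latency in milliseconds to tokens per second.
double tokens_per_second(double ms_per_token) {
    assert(ms_per_token > 0.0);
    return 1000.0 / ms_per_token;
}

// How many times faster the candidate is than the baseline.
double speedup(double baseline_ms, double candidate_ms) {
    assert(baseline_ms > 0.0 && candidate_ms > 0.0);
    return baseline_ms / candidate_ms;
}
```

For example, the original model at 20 ms per token corresponds to 50 tokens per second, Q4_0 at 8 ms corresponds to 125 tokens per second, and the speedup of Q4_0 over the original is 2.5x.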

Usage

Follow these steps to get the code, convert the weights, build the project, and run the BioGPT model.

Get the code

git clone --recursive https://github.com/PABannier/biogpt.cpp.git
cd biogpt.cpp

Prepare data

Download the weights from the Hugging Face BioGPT page and place them in a weights folder. Your weights folder should look something like this:

└── weights
    ├── config.json
    ├── merges.txt
    ├── pytorch_model.bin
    └── vocab.json

Then convert the weights to the ggml format:

python convert.py --dir-model ./weights/ --out-dir ./ggml_weights
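Before running the conversion, it can help to confirm that the weights folder contains the four files listed above. The following is a minimal sketch of such a check; `missing_weight_files` is a hypothetical helper written for this example and is not part of the project.

```cpp
#include <cassert>
#include <fstream>
#include <string>
#include <vector>

// Return the names of the expected weight files that cannot be opened in dir.
// The file list is taken from the folder layout shown in this README.
std::vector<std::string> missing_weight_files(const std::string& dir) {
    assert(!dir.empty());
    static const char* const required[] = {
        "config.json", "merges.txt", "pytorch_model.bin", "vocab.json"};
    std::vector<std::string> missing;
    for (const char* name : required) {
        const std::string path = dir + "/" + name;
        std::ifstream f(path.c_str(), std::ios::binary);
        if (!f.good()) missing.push_back(name); // file absent or unreadable
    }
    return missing;
}
```

An empty result means all four files could be opened. A non-empty result lists what still needs to be downloaded.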

Build

mkdir build && cd build
cmake ..
cmake --build . --config Release

Run

$ ./bin/biogpt -h
usage: ./biogpt [options]

options:
  -h, --help            show this help message and exit
  -s SEED, --seed SEED  RNG seed (default: -1)
  -t N, --threads N     number of threads to use during computation (default: 4)
  -p PROMPT, --prompt PROMPT
                        prompt to start generation with (default: random)
  -l LANG               language of the prompt          (default: )
  -n N, --n_predict N   number of tokens to predict (default: 200)
  --top_k N             top-k sampling (default: 40)
  --top_p N             top-p sampling (default: 0.9)
  --temp N              temperature (default: 0.9)
  -b N, --batch_size N  batch size for prompt processing (default: 8)
  -m FNAME, --model FNAME
                        model path (default: ./ggml_weights/ggml-model.bin)
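The `--temp`, `--top_k`, and `--top_p` options control how the next token is sampled. The sketch below shows one common way these three settings are combined: scale the logits by the temperature, keep the k most likely tokens, then keep the smallest prefix whose cumulative probability reaches p. It is a generic illustration with hypothetical function names, not the project's own sampling code, and the order of the steps in biogpt.cpp may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Return (token id, probability) pairs that survive temperature scaling,
// top-k filtering, and top-p (nucleus) filtering, renormalized to sum to 1.
std::vector<std::pair<int, double>> filter_candidates(const std::vector<float>& logits,
                                                      int top_k, double top_p, double temp) {
    assert(!logits.empty() && top_k > 0 && top_p > 0.0 && temp > 0.0);
    std::vector<std::pair<int, double>> cand(logits.size());
    for (std::size_t i = 0; i < logits.size(); ++i) {
        cand[i] = std::make_pair(static_cast<int>(i), logits[i] / temp);
    }
    // Sort by scaled logit, highest first, and keep at most top_k candidates.
    std::sort(cand.begin(), cand.end(),
              [](const std::pair<int, double>& a, const std::pair<int, double>& b) {
                  return a.second > b.second;
              });
    if (cand.size() > static_cast<std::size_t>(top_k)) cand.resize(top_k);
    // Softmax over the survivors (subtract the max for numerical stability).
    const double max_logit = cand.front().second;
    double sum = 0.0;
    for (auto& c : cand) { c.second = std::exp(c.second - max_logit); sum += c.second; }
    for (auto& c : cand) c.second /= sum;
    // Keep the smallest prefix whose cumulative probability reaches top_p.
    double cum = 0.0;
    std::size_t keep = 0;
    while (keep < cand.size()) {
        cum += cand[keep].second;
        ++keep;
        if (cum >= top_p) break;
    }
    cand.resize(keep);
    for (auto& c : cand) c.second /= cum; // renormalize the kept candidates
    return cand;
}

// Draw one token id from the filtered candidates.
int sample_token(const std::vector<std::pair<int, double>>& cand, std::mt19937& rng) {
    assert(!cand.empty());
    std::vector<double> weights;
    for (const auto& c : cand) weights.push_back(c.second);
    std::discrete_distribution<int> dist(weights.begin(), weights.end());
    return cand[dist(rng)].first;
}
```

With the defaults shown in the help text, a call would look like `filter_candidates(logits, 40, 0.9, 0.9)` followed by `sample_token(candidates, rng)`. Lower temperatures and smaller values of k or p make the output more deterministic.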

biogpt.cpp's People

Contributors

hffqyd, pabannier

biogpt.cpp's Issues

Confused about running the main executable

Thanks for taking the time to build this! Awesome initiative.

So I'm stuck here because I did this:

mkdir build && cd build
cmake ..
cmake --build . --config Release

I go back to the root project folder and I download the weights into a weights folder and run the convert script

python convert.py --dir-model ./weights/ --out-dir ./ggml_weights

and all is well. I get the ggml_weights folder.

This is now my directory structure:

.
├── biogpt.cpp
├── biogpt.h
├── bpe.cpp
├── bpe.h
├── build
│   ├── bin
│   ├── CMakeCache.txt
│   ├── CMakeFiles
│   ├── cmake_install.cmake
│   ├── compile_commands.json
│   ├── examples
│   ├── ggml
│   └── Makefile
├── CMakeLists.txt
├── convert.py
├── data
│   ├── nonbreaking_prefixes
│   └── perluniprops
├── examples
│   ├── CMakeLists.txt
│   ├── main
│   └── quantize
├── ggml
│   ├── build.zig
│   ├── ci
│   ├── cmake
│   ├── CMakeLists.txt
│   ├── examples
│   ├── ggml.pc.in
│   ├── include
│   ├── LICENSE
│   ├── README.md
│   ├── requirements.txt
│   ├── scripts
│   ├── src
│   └── tests
├── ggml_weights
│   └── ggml-model.bin
├── mosestokenizer.cpp
├── mosestokenizer.h
├── README.md
└── weights
    ├── config.json
    ├── merges.txt
    ├── pytorch_model.bin
    ├── README.md
    └── vocab.json

Then I go to

cd build/bin
./main -p "trastuzumab"
terminate called after throwing an instance of 'std::runtime_error'
  what():  Perl Uniprops file not available.
fish: Job 1, './main -p "trastuzumab"' terminated by signal SIGABRT (Abort)

So for some reason the executable doesn't run. It reports that the Perl Uniprops file is missing, even though perluniprops is already in your data folder.

What am I doing wrong?

Compile failed

I am compiling with:

make CC=gcc-11 CPP=g++-11 CXX=g++-11 LD=g++-1

This fails to create the biogpt executable; only the file main is created.

Log output:

I biogpt.cpp build info:
I UNAME_S: Linux
I UNAME_P: x86_64
I UNAME_M: x86_64
I CFLAGS: -I. -O3 -std=c11 -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -pthread -march=native -mtune=native
I CXXFLAGS: -I. -O3 -std=c++11 -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native
I LDFLAGS:
I CC:
I CXX:

gcc-11 -I. -O3 -std=c11 -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -pthread -march=native -mtune=native -c ggml.c -o ggml.o
g++-11 -I. -O3 -std=c++11 -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native -c mosestokenizer.cpp -o mosestokenizer.o
g++-11 -I. -O3 -std=c++11 -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native -c bpe.cpp -o bpe.o
g++-11 -I. -O3 -std=c++11 -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native -c biogpt.cpp -o biogpt.o
biogpt.cpp: In function ‘bool biogpt_model_load(const string&, biogpt_model&, biogpt_vocab&, uint8_t)’:
biogpt.cpp:210:13: warning: C++ designated initializers only available with ‘-std=c++20’ or ‘-std=gnu++20’ [-Wpedantic]
210 | .mem_size = ctx_size,
| ^
biogpt.cpp:211:13: warning: C++ designated initializers only available with ‘-std=c++20’ or ‘-std=gnu++20’ [-Wpedantic]
211 | .mem_buffer = NULL,
| ^
biogpt.cpp:212:13: warning: C++ designated initializers only available with ‘-std=c++20’ or ‘-std=gnu++20’ [-Wpedantic]
212 | .no_alloc = false,
| ^
biogpt.cpp:364:89: warning: format ‘%lld’ expects argument of type ‘long long int’, but argument 5 has type ‘int64_t’ {aka ‘long int’} [-Wformat=]
364 | fprintf(stderr, "%s: tensor '%s' has wrong shape in model file: got [%lld, %lld], expected [%d, %d]\n",
| ~~~^
| |
| long long int
| %ld
365 | func, name.data(), tensor->ne[0], tensor->ne[1], ne[0], ne[1]);
| ~~~~~~~~~~~~~
| |
| int64_t {aka long int}
biogpt.cpp:364:95: warning: format ‘%lld’ expects argument of type ‘long long int’, but argument 6 has type ‘int64_t’ {aka ‘long int’} [-Wformat=]
364 | fprintf(stderr, "%s: tensor '%s' has wrong shape in model file: got [%lld, %lld], expected [%d, %d]\n",
| ~~~^
| |
| long long int
| %ld
365 | func, name.data(), tensor->ne[0], tensor->ne[1], ne[0], ne[1]);
| ~~~~~~~~~~~~~
| |
| int64_t {aka long int}
biogpt.cpp: In function ‘bool biogpt_eval(const biogpt_model&, int, int, const std::vector&, std::vector&, size_t&)’:
biogpt.cpp:596:9: warning: C++ designated initializers only available with ‘-std=c++20’ or ‘-std=gnu++20’ [-Wpedantic]
596 | .mem_size = buf_size,
| ^
biogpt.cpp:597:9: warning: C++ designated initializers only available with ‘-std=c++20’ or ‘-std=gnu++20’ [-Wpedantic]
597 | .mem_buffer = buf,
| ^
biogpt.cpp:598:9: warning: C++ designated initializers only available with ‘-std=c++20’ or ‘-std=gnu++20’ [-Wpedantic]
598 | .no_alloc = false,
| ^
g++-11 -I. -O3 -std=c++11 -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native -c main.cpp -o main.o
g++-11 -I. -O3 -std=c++11 -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native main.o biogpt.o mosestokenizer.o bpe.o ggml.o -o main

runtime_error: Perl Uniprops file not available.

Thanks for your great tool.

I've compiled biogpt and converted the model to ggml successfully, but I cannot run it. When I run ./bin/biogpt -m path/to/model, or even just ./bin/biogpt -h, it throws an error: libc++abi.dylib: terminating with uncaught exception of type std::runtime_error: Perl Uniprops file not available.

I checked that the perluniprops folder is in the data directory.

I am using macOS 10.15 on an Intel CPU. What should I do to fix this?

Thanks for your help.

No biogpt executable after running make

Hi Pierre,

Awesome idea to do this project. I am trying to get it running, but there is no biogpt executable being generated after running make; there is just the main executable. If I try to run ./main -p "trastuzumab" I get an error:

libc++abi: terminating with uncaught exception of type char const*
zsh: abort ./main -p "trastuzumab"

Was wondering what the right way to run this is.

Thanks!
