
mpt-30b-inference's People

Contributors

abacaj


mpt-30b-inference's Issues

Please help

Good afternoon. First of all, thank you for your work; it is very interesting. I am a Java developer, so I don't have much expertise in Python. When I run the example from your YouTube video, I get an error at the end.

Please also tell me, if it is not too much trouble, how difficult it would be to add REST communication via POST request here.

Below is the description of the error:

(mpt30_final) D:\Develop\NeuronNetwork\Mpt30\mpt_30B_inference>python inference.py
Fetching 1 files: 100%|██████████████████████████████████████████████████████████████████████████| 1/1 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "D:\Develop\NeuronNetwork\Mpt30\mpt_30B_inference\inference.py", line 49, in <module>
    llm = AutoModelForCausalLM.from_pretrained(
  File "C:\Users\j0sch\miniconda3\envs\mpt30_final\lib\site-packages\ctransformers\hub.py", line 157, in from_pretrained
    return LLM(
  File "C:\Users\j0sch\miniconda3\envs\mpt30_final\lib\site-packages\ctransformers\llm.py", line 203, in __init__
    if not Path(model_path).is_file():
  File "C:\Users\j0sch\miniconda3\envs\mpt30_final\lib\pathlib.py", line 958, in __new__
    self = cls._from_parts(args)
  File "C:\Users\j0sch\miniconda3\envs\mpt30_final\lib\pathlib.py", line 592, in _from_parts
    drv, root, parts = self._parse_args(args)
  File "C:\Users\j0sch\miniconda3\envs\mpt30_final\lib\pathlib.py", line 576, in _parse_args
    a = os.fspath(a)
TypeError: expected str, bytes or os.PathLike object, not NoneType
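
The TypeError means from_pretrained resolved model_path to None, i.e. the hub helper could not locate the downloaded weights. A workaround reported in the "Wrong os.path" issue below is to pass an explicit local path to the GGML file; a minimal sketch, assuming the file landed under models/ (adjust the path to your download):

    import os
    from ctransformers import AutoModelForCausalLM

    # Point directly at the downloaded GGML file instead of letting the
    # hub resolver guess the path.
    llm = AutoModelForCausalLM.from_pretrained(
        os.path.abspath("models/mpt-30b-chat.ggmlv0.q4_1.bin"),
        model_type="mpt",
    )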

mpt-30b not showing any response

During inference, after the user input, the model waits for a few seconds but does not respond with anything; it just returns empty. I'm using it on a Dell OptiPlex 7070 Micro with an Intel Core i7-9700T (8 cores) and 32 GB RAM.

Core dumped

Hello, I got the error below after executing "python inference.py". How can I fix it? Thanks.

Screenshots of the error message and of the memory, disk, VGA, and CPU details were attached (not reproduced here).

OSError: /lib/x86_64-linux-gnu/libm.so.6: version `GLIBC_2.29' not found

Hi,

When trying to run inference, I got the following error message:

Downloading (…)feaf43e4/config.json: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1.24k/1.24k [00:00<00:00, 8.64MB/s]
Fetching 1 files: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  2.38it/s]
Traceback (most recent call last):
  File "/home/matthieu/Deployment/mpt-30B-inference/inference.py", line 49, in <module>
    llm = AutoModelForCausalLM.from_pretrained(
  File "/home/matthieu/anaconda3/envs/mpt_30b_cpu/lib/python3.10/site-packages/ctransformers/hub.py", line 157, in from_pretrained
    return LLM(
  File "/home/matthieu/anaconda3/envs/mpt_30b_cpu/lib/python3.10/site-packages/ctransformers/llm.py", line 206, in __init__
    self._lib = load_library(lib)
  File "/home/matthieu/anaconda3/envs/mpt_30b_cpu/lib/python3.10/site-packages/ctransformers/llm.py", line 102, in load_library
    lib = CDLL(path)
  File "/home/matthieu/anaconda3/envs/mpt_30b_cpu/lib/python3.10/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /lib/x86_64-linux-gnu/libm.so.6: version `GLIBC_2.29' not found (required by /home/matthieu/anaconda3/envs/mpt_30b_cpu/lib/python3.10/site-packages/ctransformers/lib/avx2/libctransformers.so)

I use an AMD Ryzen Threadripper 3960X (24 cores / 48 threads) on Ubuntu 18.04 LTS.

Thanks for any help!
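
A note on the environment: Ubuntu 18.04 ships glibc 2.27, older than the 2.29 required by the prebuilt AVX2 library bundled with ctransformers. A quick way to confirm the installed version from Python (platform.libc_ver is in the standard library):

    import platform

    # Reports the glibc the interpreter runs against,
    # e.g. ('glibc', '2.27') on Ubuntu 18.04.
    lib, version = platform.libc_ver()
    print(lib, version)

If the version is below 2.29, common ways around this are upgrading to a newer distribution release or building ctransformers from source against the local toolchain.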

Wrong os.path

Hello! Kudos for your nice work for the open-source LLM community. I'm learning a lot from your findings.

I've tried to run the latest commit on a GCP VM with 32 GB of RAM.

And I've found this error:

(env) (base) sergiomoreno@production:~/mpt-30B-inference$ python inference.py
/home/sergiomoreno/mpt-30B-inference/models/mpt-30b-chat.ggmlv0.q4_1.bin
Fetching 1 files: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 7752.87it/s]
Traceback (most recent call last):
  File "/home/sergiomoreno/mpt-30B-inference/inference.py", line 51, in <module>
    llm = AutoModelForCausalLM.from_pretrained(
  File "/home/sergiomoreno/mpt-30B-inference/env/lib/python3.10/site-packages/ctransformers/hub.py", line 157, in from_pretrained
    return LLM(
  File "/home/sergiomoreno/mpt-30B-inference/env/lib/python3.10/site-packages/ctransformers/llm.py", line 203, in __init__
    if not Path(model_path).is_file():
  File "/home/sergiomoreno/miniconda3/lib/python3.10/pathlib.py", line 960, in __new__
    self = cls._from_parts(args)
  File "/home/sergiomoreno/miniconda3/lib/python3.10/pathlib.py", line 594, in _from_parts
    drv, root, parts = self._parse_args(args)
  File "/home/sergiomoreno/miniconda3/lib/python3.10/pathlib.py", line 578, in _parse_args
    a = os.fspath(a)
TypeError: expected str, bytes or os.PathLike object, not NoneType

After looking at the file structure, I saw that the expected folder didn't exist, and I had to replace the path with this:

    llm = AutoModelForCausalLM.from_pretrained(
        os.path.abspath("models/models--TheBloke--mpt-30B-chat-GGML/snapshots/60df632f84e8b99fa7aeadf314467152be55adf4/mpt-30b-chat.ggmlv0.q4_1.bin"),
        model_type="mpt",
    )

Is it something expected or can we improve it somehow?
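
One way to avoid hard-coding the snapshot hash is to search the models directory for the downloaded weights at startup; a minimal sketch, assuming the layout shown above:

    import glob
    import os

    # huggingface_hub stores downloads under models--<org>--<repo>/snapshots/<hash>/,
    # so glob for the weights file instead of hard-coding the hash.
    matches = glob.glob("models/**/mpt-30b-chat.ggmlv0.q4_1.bin", recursive=True)
    if not matches:
        raise FileNotFoundError("model weights not found under models/")
    model_path = os.path.abspath(matches[0])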

Can we use GPU inference?

Thank you for such a great repo. I was wondering if we can use GPU inference for text generation?
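
For what it's worth, later ctransformers releases expose a gpu_layers argument on from_pretrained for CUDA-enabled builds, but support varies by model type, and whether the mpt GGML backend honors it is an assumption to verify against your installed version. A sketch under that assumption:

    from ctransformers import AutoModelForCausalLM

    # gpu_layers asks the backend to offload that many transformer layers
    # to the GPU (CUDA-enabled builds only); mpt support is not guaranteed.
    llm = AutoModelForCausalLM.from_pretrained(
        "models/mpt-30b-chat.ggmlv0.q4_1.bin",
        model_type="mpt",
        gpu_layers=24,
    )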

MPT No response

I am trying to generate responses from the mpt-30B model based on user input and to build an API with Flask, but I am having trouble: the response is empty for every input I send. I am using a Standard F32s v2 (32 vCPUs, 64 GiB memory) VM with remote access to the server.

from flask import Flask, request, Response
from dataclasses import dataclass, asdict
from ctransformers import AutoModelForCausalLM
import os

app = Flask(__name__)

@dataclass
class GenerationConfig:
    temperature: float
    top_k: int
    top_p: float
    repetition_penalty: float
    max_new_tokens: int
    seed: int
    reset: bool
    stream: bool
    threads: int
    stop: list

def format_prompt(system_prompt: str, user_prompt: str):
    # ChatML-style tags expected by mpt-30b-chat
    system_prompt = f"<|im_start|>system\n{system_prompt}<|im_end|>\n"
    user_prompt = f"<|im_start|>user\n{user_prompt}<|im_end|>\n"
    assistant_prompt = f"<|im_start|>assistant\n"
    return f"{system_prompt}{user_prompt}{assistant_prompt}"

def generate(
    llm: AutoModelForCausalLM,
    generation_config: GenerationConfig,
    system_prompt: str,
    user_input: str,
):
    model_output = llm(
        format_prompt(system_prompt, user_input.strip()),
        **asdict(generation_config),
    )
    print("Model output:", model_output)
    return model_output

@app.route('/generate', methods=['GET', 'POST'])
def generate_response_endpoint():
    if request.method == 'GET':
        user_input = request.args.get('user_input', '')  # input from query parameter
    else:  # POST
        user_input = request.data.decode('utf-8')

    print("Loading model...")
    llm = AutoModelForCausalLM.from_pretrained(
        "/home/azureuser/mpt-30B-inference/models/mpt-30b-chat.ggmlv0.q4_1.bin",
        model_type="mpt",
    )
    print("Model loaded")

    system_prompt = "Reply."

    generation_config = GenerationConfig(
        temperature=0.2,
        top_k=0,
        top_p=0.9,
        repetition_penalty=1.0,
        max_new_tokens=512,
        seed=42,
        reset=False,
        stream=False,
        threads=int(os.cpu_count() / 2),  # adjust for your CPU
        stop=["<|im_end|>", "|<"],
    )

    response = generate(llm, generation_config, system_prompt, user_input.strip())

    return Response(response, content_type='text/plain; charset=utf-8')

if __name__ == "__main__":
    app.run(host='0.0.0.0', port=3002)

import requests

while True:
    user_input = input("You: ")
    if user_input.lower() in ['exit', 'quit']:
        print("Exiting...")
        break

    # The server reads the raw POST body, so send plain text rather than JSON.
    data = user_input + " Respond to this."
    response = requests.post('http://127.0.0.1:3002/generate', data=data.encode('utf-8'))

    if response.status_code == 200:
        assistant_response = response.text
        print("Assistant:", assistant_response)
    else:
        print("Error:", response.status_code)

But the response I am getting is empty, as can be seen below:

You: 3+4
Assistant:
You:

Any idea what may be causing this issue, and what can be done to resolve it?
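
One way to narrow this down is to stream tokens and inspect what the model emits before any stop condition fires; a minimal debugging sketch reusing format_prompt and llm from the code above:

    # Stream tokens so an immediate stop-token match becomes visible.
    prompt = format_prompt("Reply.", "3+4")
    for token in llm(prompt, stream=True, max_new_tokens=64):
        print(repr(token), end=" ", flush=True)

If nothing prints at all, check that the stop list contains no string that matches immediately and that the prompt template uses the ChatML tags the chat model expects.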
