
Comments (8)

abacaj commented on July 17, 2024

For cases like this I recommend Docker because of the environment issues. I have Windows as well; here's how I run it.
Use a container like so:

docker run -it --mount type=volume,source=transformers,target=/transformers python:3.11.4 /bin/bash

Clone the repo:

git clone git@github.com:abacaj/mpt-30B-inference.git

Follow the directions in the readme for the rest: https://github.com/abacaj/mpt-30B-inference#setup. I just ran through this process once again, and it works; I can get the model to generate correctly on my Ryzen/Windows machine:
[screenshot]

Thank you. I've created a conda env, installed the requirements, and manually downloaded two models (q5_1 and q4_1). Any hint on why I get these empty responses? I'd really prefer not to use a container.

Great work, by the way!

Likely has to do with the ctransformers library, since that is how the bindings work from Python -> ggml (though I'm not certain of it).
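As a minimal sketch of where those bindings enter the picture, this is roughly how ctransformers is called from Python. The model path here is a placeholder assumption, and the existence guard means the snippet only attempts a load when the file is actually present:

```python
import os

# Placeholder path -- adjust to wherever the ggml model was downloaded.
MODEL_PATH = "models/mpt-30b-chat.ggmlv0.q4_0.bin"

def load_and_generate(prompt):
    """Load the ggml model through ctransformers and run one completion."""
    if not os.path.exists(MODEL_PATH):
        # Model not downloaded; nothing to bind against.
        return None
    from ctransformers import AutoModelForCausalLM
    llm = AutoModelForCausalLM.from_pretrained(MODEL_PATH, model_type="mpt")
    return llm(prompt)

print(load_and_generate("What is the capital of France?"))
```

If generation returns an empty string from a call like this, the failure is below the Python layer, inside ctransformers/ggml rather than in the chat script.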

from mpt-30b-inference.

mzubairumt commented on July 17, 2024

Issue fixed. Replace the files with the ones from
https://github.com/mzubair31102/llama2.git


rodrigofarias-MECH commented on July 17, 2024

I'm having the same problem. Processing goes to 100% for a few seconds but returns empty answers. It uses around 24 GB of RAM.
I tested in VS Code and in cmd. Same behaviour.
I've tried to debug, but the "generator" variable had no string text inside it at all.

I'm running the mpt-30b-chat.ggmlv0.q5_1.bin model instead of the default q4_0.

PC: Ryzen 5900X and 32 GB RAM.
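One way to confirm that behaviour is to drain the generator into a list before printing, so an empty stream becomes visible instead of silently producing nothing. A minimal sketch — `fake_generate` is a stand-in for the real ctransformers token stream, which is an assumption here:

```python
def fake_generate(tokens):
    """Stand-in for a streaming token generator from the model."""
    for t in tokens:
        yield t

def collect_response(generator):
    """Drain a token generator so an empty response is detectable."""
    pieces = list(generator)
    if not pieces:
        print("warning: generator yielded no tokens (empty response)")
    return "".join(pieces)

print(collect_response(fake_generate(["Par", "is"])))  # normal case -> Paris
print(repr(collect_response(fake_generate([]))))       # the empty case seen here -> ''
```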


abacaj commented on July 17, 2024

For cases like this I recommend Docker because of the environment issues. I have Windows as well; here's how I run it.

Use a container like so:

docker run -it -w /transformers --mount type=volume,source=transformers,target=/transformers python:3.11.4 /bin/bash

Clone the repo:

git clone git@github.com:abacaj/mpt-30B-inference.git

Follow the directions in the readme for the rest: https://github.com/abacaj/mpt-30B-inference#setup.
I just ran through this process once again, and it works; I can get the model to generate correctly on my Ryzen/Windows machine:

[screenshot]


rodrigofarias-MECH commented on July 17, 2024

For cases like this I recommend Docker because of the environment issues. I have Windows as well; here's how I run it.

Use a container like so:

docker run -it --mount type=volume,source=transformers,target=/transformers python:3.11.4 /bin/bash

Clone the repo:

git clone git@github.com:abacaj/mpt-30B-inference.git

Follow the directions in the readme for the rest: https://github.com/abacaj/mpt-30B-inference#setup. I just ran through this process once again, and it works; I can get the model to generate correctly on my Ryzen/Windows machine:

[screenshot]

Thank you.
I've created a conda env, installed the requirements, and manually downloaded two models (q5_1 and q4_1). Any hint on why I get these empty responses? I'd really prefer not to use a container.

Great work, by the way!


mzubairumt commented on July 17, 2024

I have observed that when processing user queries, the CPU usage increases but I do not receive a response:

[user]: What is the capital of France?
[assistant]:
[user]:
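A transcript like that is easier to diagnose if the chat loop flags empty assistant turns instead of printing a blank line. A small sketch — `render_turn` is a hypothetical helper, not part of the repo:

```python
def render_turn(role, text):
    """Format one chat turn, making empty assistant replies explicit."""
    if role == "assistant" and not text.strip():
        # Hypothetical marker for the silent-failure mode reported above.
        return "[assistant]: <empty response from model>"
    return f"[{role}]: {text}"

print(render_turn("user", "What is the capital of France?"))
print(render_turn("assistant", ""))  # the failure mode reported above
```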


mzubairumt commented on July 17, 2024

For cases like this I recommend Docker because of the environment issues. I have Windows as well; here's how I run it.
Use a container like so:

docker run -it --mount type=volume,source=transformers,target=/transformers python:3.11.4 /bin/bash

Clone the repo:

git clone git@github.com:abacaj/mpt-30B-inference.git

Follow the directions in the readme for the rest: https://github.com/abacaj/mpt-30B-inference#setup. I just ran through this process once again, and it works; I can get the model to generate correctly on my Ryzen/Windows machine:
[screenshot]

Thank you. I've created a conda env, installed the requirements, and manually downloaded two models (q5_1 and q4_1). Any hint on why I get these empty responses? I'd really prefer not to use a container.

Great work, by the way!

python3 inference.py
Fetching 1 files: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 3584.88it/s]
GGML_ASSERT: /home/runner/work/ctransformers/ctransformers/models/ggml/ggml.c:4103: ctx->mem_buffer != NULL
Aborted
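That GGML_ASSERT fires when ggml fails to allocate its context buffer, which usually points at insufficient memory rather than a code bug. A back-of-the-envelope estimate of the weight memory alone — the effective bits-per-weight figures are rough assumptions that include quantization scale overhead, not measured values:

```python
# Rough RAM needed just for the weights of a 30B-parameter ggml model.
# Effective bits-per-weight values are approximations, not measured.
PARAMS = 30e9
EFFECTIVE_BITS = {"q4_0": 4.5, "q5_1": 6.0, "q8_0": 8.5}

def weight_gib(quant):
    """Approximate weight memory in GiB for a given quantization."""
    return PARAMS * EFFECTIVE_BITS[quant] / 8 / 2**30

for quant in EFFECTIVE_BITS:
    print(f"{quant}: ~{weight_gib(quant):.1f} GiB")
```

Under these assumptions the q5_1 weights alone come to roughly 21 GiB, which lines up with the ~24 GB usage reported above once context buffers are added; on a 32 GB machine that leaves little headroom, and anything less will abort exactly like this.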


renanfferreira commented on July 17, 2024

I'm also facing this issue on Windows.
However, the main problem is that when I run it in a container it produces very slow responses.

