Comments (9)

mmike87 avatar mmike87 commented on June 2, 2024 4

I watched my GPU usage and it was not touched.

from privategpt.

iker-lluvia avatar iker-lluvia commented on June 2, 2024 3

I can get it to work on Ubuntu 22.04 by installing llama-cpp-python with cuBLAS:
CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python==0.1.48

If installation fails because it doesn't find CUDA, it's probably because you need to add the CUDA install path to the PATH environment variable:
export PATH=/usr/local/cuda/bin${PATH:+:${PATH}}
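As a quick sanity check before retrying the install, you can confirm the CUDA bin directory is actually on PATH and that nvcc is present there. A minimal sketch, assuming the default /usr/local/cuda install prefix:

```python
# Minimal sketch: verify the CUDA bin directory is on PATH and that nvcc
# exists there. Assumes the default /usr/local/cuda install prefix.
import os

def cuda_on_path(path_env: str, cuda_bin: str = "/usr/local/cuda/bin") -> bool:
    """True if cuda_bin appears as an entry in a PATH-style string."""
    return cuda_bin in path_env.split(os.pathsep)

print(cuda_on_path(os.environ.get("PATH", "")))      # depends on your shell
print(cuda_on_path("/usr/bin:/usr/local/cuda/bin"))  # True
print(os.path.exists("/usr/local/cuda/bin/nvcc"))    # True on a machine with CUDA installed
```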

Anyway, it only uses less than 1 GB of VRAM on an RTX 2060 with 6 GB, so I don't know if something is still missing.

shondle avatar shondle commented on June 2, 2024 2

If anyone still can't figure this out, I explained in detail how I got it to work here (issue #217)

iker-lluvia avatar iker-lluvia commented on June 2, 2024 1

Aren't you just emulating the CPU? Idk if there's even a working port for GPU support

It shouldn't. The llama.cpp library can perform BLAS acceleration using the CUDA cores of the Nvidia GPU through cuBLAS. I expect llama-cpp-python to do so as well when installed with cuBLAS.
Is there any fast way to verify that the GPU is being used, other than running nvidia-smi or nvtop?
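One lightweight option besides nvidia-smi/nvtop: llama.cpp prints a `system_info` line when it loads a model, and `BLAS = 1` there indicates the BLAS-accelerated (e.g. cuBLAS) build took effect. A minimal sketch that checks such a line (the sample string is illustrative, not captured from a real run):

```python
import re

def blas_enabled(system_info_line: str) -> bool:
    """True if a llama.cpp system_info line reports BLAS = 1."""
    match = re.search(r"BLAS\s*=\s*(\d)", system_info_line)
    return match is not None and match.group(1) == "1"

# Illustrative startup line; the real one appears in the model-load log.
sample = "system_info: n_threads = 8 | AVX = 1 | AVX2 = 1 | BLAS = 1 | SSE3 = 1"
print(blas_enabled(sample))  # True
```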

su77ungr avatar su77ungr commented on June 2, 2024 1

Nvm, my collaborator found a way, see

walking-octopus avatar walking-octopus commented on June 2, 2024

Chances are, it's already partially using the GPU. As it is now, it's a script linking together LLaMa.cpp embeddings, a Chroma vector DB, and GPT4All. GPT4All might be using PyTorch with the GPU, Chroma is probably already heavily CPU-parallelized, and LLaMa.cpp runs only on the CPU.

It's also worth noting that two LLMs are used with different inference implementations, meaning you may have to load the model twice.

pabl-o-ce avatar pabl-o-ce commented on June 2, 2024

Does this mean that this works only with the CPU?

I currently want to try this.

Also, could you add some info to the README about the hardware requirements?

su77ungr avatar su77ungr commented on June 2, 2024

No, LlamaCpp was designed to use only CPU resources. For GPU support you'd have to use the native LLaMA model from Facebook.

su77ungr avatar su77ungr commented on June 2, 2024

Aren't you just emulating the CPU?
Idk if there's even a working port for GPU support
