Comments (9)
I watched my GPU usage and it was not touched.
from privategpt.
I got it to work on Ubuntu 22.04 by installing llama-cpp-python with cuBLAS:
CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python==0.1.48
If the installation fails because it can't find CUDA, you probably need to add the CUDA install path to the PATH environment variable:
export PATH=/usr/local/cuda/bin${PATH:+:${PATH}}
Anyway, it uses less than 1 GB of VRAM on an RTX 2060 with 6 GB, so I don't know if something is still missing.
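Before re-running the install, it may help to confirm that the CUDA compiler can actually be found. The following is a minimal sketch, not part of privateGPT or llama-cpp-python: the helper names `prepend_to_path` and `find_nvcc` are made up for illustration, and `/usr/local/cuda/bin` is assumed to be the CUDA location, as in the export line above.

```python
import os
import shutil


def prepend_to_path(directory, path):
    """Mirror the shell idiom `dir${PATH:+:${PATH}}`.

    The separator is only added when the existing path is non-empty.
    """
    return directory + os.pathsep + path if path else directory


def find_nvcc(cuda_bin="/usr/local/cuda/bin"):
    """Return the full path to nvcc, or None if it cannot be found.

    The search uses the current PATH with cuda_bin (an assumed default
    location) placed in front of it.
    """
    search_path = prepend_to_path(cuda_bin, os.environ.get("PATH", ""))
    return shutil.which("nvcc", path=search_path)


if __name__ == "__main__":
    nvcc = find_nvcc()
    if nvcc:
        print(f"nvcc found at {nvcc}")
    else:
        print("nvcc not found; check where the CUDA toolkit is installed")
```

If this reports that nvcc is missing even with the assumed directory, the toolkit is probably installed somewhere else, and the export line would need that location instead.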
If anyone still can't figure this out, I explained in detail how I got it to work here (issue #217)
Aren't you just emulating the CPU? Idk if there's even a working port for GPU support
It shouldn't. The llama.cpp library can perform BLAS acceleration using the CUDA cores of the Nvidia GPU through cuBLAS. I expect llama-cpp-python to do so as well when installing it with cuBLAS.
Is there any fast way to verify that the GPU is being used, other than running nvidia-smi or nvtop?
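One way to make that check repeatable is to read the memory figures from nvidia-smi in a script and compare them before and after the model loads. This is only a sketch under assumptions: it still relies on nvidia-smi being installed, and the function names `parse_memory` and `gpu_memory_usage` are hypothetical. It uses nvidia-smi's `--query-gpu` and `--format=csv,noheader,nounits` options, which print one line per GPU with values in MiB.

```python
import subprocess

# Ask nvidia-smi for used and total memory per GPU as bare CSV numbers (MiB).
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_memory(csv_text):
    """Turn lines such as '812, 6144' into a list of (used, total) tuples."""
    usage = []
    for line in csv_text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        usage.append((used, total))
    return usage


def gpu_memory_usage():
    """Run nvidia-smi and return (used_mib, total_mib) for each GPU."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_memory(result.stdout)


if __name__ == "__main__":
    try:
        for index, (used, total) in enumerate(gpu_memory_usage()):
            print(f"GPU {index}: {used} MiB used of {total} MiB")
    except (FileNotFoundError, subprocess.CalledProcessError) as error:
        print(f"Could not query nvidia-smi: {error}")
```

Running it once before starting the model and once while a prompt is being answered gives a rough indication: if the used figure barely changes, the work is most likely still happening on the CPU.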
Nvm my collaborator found a way see
Chances are, it's already partially using the GPU. As it is now, it's a script linking together LLaMa.cpp embeddings, Chroma vector DB, and GPT4All. GPT4All might be using PyTorch with GPU, Chroma is probably already heavily CPU parallelized, and LLaMa.cpp runs only on the CPU.
It's also worth noting that two LLMs are used with different inference implementations, meaning you may have to load the model twice.
Does this mean that this works only with the CPU?
I currently want to try this.
Also, could you add some info to the README about the hardware requirements?
No, LlamaCpp was designed to use only CPU resources. For GPU you'd have to use the native Llama model from Facebook.