Comments (17)
If you are building in an NVIDIA Docker container without an actual GPU, you can use something like this:
CUDA_VERSION=$(/usr/local/cuda/bin/nvcc --version | sed -n 's/^.*release \([0-9]\+\.[0-9]\+\).*$/\1/p')
if [[ ${CUDA_VERSION} == 9.0* ]]; then
export TORCH_CUDA_ARCH_LIST="3.5;5.0;6.0;7.0+PTX"
elif [[ ${CUDA_VERSION} == 9.2* ]]; then
export TORCH_CUDA_ARCH_LIST="3.5;5.0;6.0;6.1;7.0+PTX"
elif [[ ${CUDA_VERSION} == 10.* ]]; then
export TORCH_CUDA_ARCH_LIST="3.5;5.0;6.0;6.1;7.0;7.5+PTX"
elif [[ ${CUDA_VERSION} == 11.0* ]]; then
export TORCH_CUDA_ARCH_LIST="3.5;5.0;6.0;6.1;7.0;7.5;8.0+PTX"
elif [[ ${CUDA_VERSION} == 11.* ]]; then
export TORCH_CUDA_ARCH_LIST="3.5;5.0;6.0;6.1;7.0;7.5;8.0;8.6+PTX"
else
echo "unsupported cuda version."
exit 1
fi
from extension-cpp.
I looked into this issue a bit, since I ran into the same problem in one of my Docker containers.
If you build your code through a setup.py, you should set TORCH_CUDA_ARCH_LIST="YOUR_GPUs_CC+PTX" in the environment of the build command:
TORCH_CUDA_ARCH_LIST="YOUR_GPUs_CC+PTX" python setup.py install
(or, for instance, add an ARG TORCH_CUDA_ARCH_LIST="YOUR_GPUs_CC+PTX" to your Dockerfile)
More information can be found here: https://pytorch.org/docs/stable/cpp_extension.html
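If you drive the build from a Python script rather than a shell, the same idea can be sketched as below. This is only a minimal sketch: build_env and run_build are hypothetical helper names, and "6.1+PTX" in the usage note is an example value, not a recommendation.

```python
import os
import subprocess
import sys


def build_env(arch_list, base_env=None):
    """Return a copy of the environment with TORCH_CUDA_ARCH_LIST set."""
    env = dict(os.environ if base_env is None else base_env)
    env["TORCH_CUDA_ARCH_LIST"] = arch_list
    return env


def run_build(arch_list, setup_script="setup.py"):
    """Run `python setup.py install` with the architecture list in its environment."""
    cmd = [sys.executable, setup_script, "install"]
    # check=True raises CalledProcessError if the build fails.
    return subprocess.run(cmd, env=build_env(arch_list), check=True)
```

Calling run_build("6.1+PTX") from the directory that contains setup.py should be equivalent to the one-line shell command above.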
The solution that worked for me on Linux:
Docker requires access to the CUDA library at build time. To ensure this, make sure that your /etc/docker/daemon.json file looks as follows:
{
"runtimes": {
"nvidia": {
"path": "nvidia-container-runtime",
"runtimeArgs": []
}
},
"default-runtime": "nvidia"
}
If it does not, change it and then restart Docker with
sudo systemctl restart docker
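To check an existing daemon.json before editing it, something like the following sketch may help. The function name nvidia_is_default_runtime is made up for illustration, and the check only looks at the two keys shown in the example file above.

```python
import json


def nvidia_is_default_runtime(daemon_json_text):
    """Return True if the config registers an "nvidia" runtime and makes it the default."""
    config = json.loads(daemon_json_text)
    has_runtime = "nvidia" in config.get("runtimes", {})
    return has_runtime and config.get("default-runtime") == "nvidia"
```

You would typically pass it the contents of /etc/docker/daemon.json, for example open("/etc/docker/daemon.json").read().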
Hello, for anyone visiting this issue, the problem is caused here: https://github.com/pytorch/pytorch/blob/master/torch/utils/cpp_extension.py#L1694
Basically, the arch_list is supposed to be built from the architectures discovered with torch.cuda.get_device_capability(i).
The thing is, when no CUDA card is detected, the function torch.cuda.device_count() returns 0 and thus no architecture is added to that list.
This leads to the last line, which essentially says "add '+PTX' to the name of the last architecture", and which obviously fails when the arch_list is empty.
As such, this problem essentially occurs because torch found no CUDA hardware. Possible reasons and solutions:
- Driver / CUDA mismatch, probably due to a CUDA update. Reboot and the driver will be updated.
- Docker context. See the comments above ( #71 (comment) ).
If there is no way to detect the GPU at build time, but you know which architecture the code should run on, you can set it explicitly with an environment variable, as described in this comment ( #71 (comment) ).
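The failure mode described above can be reproduced without a GPU using a simplified model. This is not the actual PyTorch source, only a sketch of the logic as described in this comment; build_arch_list is a hypothetical name and the capabilities are passed in as plain tuples instead of being queried from torch.

```python
def build_arch_list(capabilities):
    """Simplified model: collect "major.minor" strings, then tag the last one with +PTX."""
    arch_list = []
    for major, minor in capabilities:
        arch = f"{major}.{minor}"
        if arch not in arch_list:
            arch_list.append(arch)
    arch_list = sorted(arch_list)
    # With zero detected devices the list is empty and this line raises IndexError.
    arch_list[-1] += "+PTX"
    return arch_list
```

With one or more capabilities the function returns a list; with an empty input it raises IndexError, which mirrors the error seen when no device is visible at build time.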
Regarding the question of how to find the "YOUR_GPUs_CC+PTX" of your GPU:
If the GPU driver is loaded correctly, run the following statements in a Python console:
>>> import torch
>>> torch.cuda.get_device_capability(0)
(6, 1)
That means TORCH_CUDA_ARCH_LIST="6.1". However, in most cases CUDA is unavailable because the GPU has been specified incorrectly. For example, have you set CUDA_VISIBLE_DEVICES to a GPU that is not available?
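A small sketch that turns the tuple into the string form, and returns None instead of failing when no GPU is visible, could look like this. The helper names capability_to_arch and detect_arch_list are hypothetical; only torch.cuda.is_available() and torch.cuda.get_device_capability() are real PyTorch calls.

```python
def capability_to_arch(capability, ptx=True):
    """Format a (major, minor) tuple as an arch string such as "6.1+PTX"."""
    major, minor = capability
    arch = f"{major}.{minor}"
    return arch + "+PTX" if ptx else arch


def detect_arch_list():
    """Return the arch string for GPU 0, or None if torch or a GPU is unavailable."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return capability_to_arch(torch.cuda.get_device_capability(0))
```

If detect_arch_list() returns None, the GPU is not visible to torch and the value has to be set by hand.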
How to find the "YOUR_GPUs_CC+PTX" of my gpu?
Updated the workaround above to support CUDA 12:
CUDA_VERSION=$(/usr/local/cuda/bin/nvcc --version | sed -n 's/^.*release \([0-9]\+\.[0-9]\+\).*$/\1/p')
if [[ ${CUDA_VERSION} == 9.0* ]]; then
export TORCH_CUDA_ARCH_LIST="3.5;5.0;6.0;7.0+PTX"
elif [[ ${CUDA_VERSION} == 9.2* ]]; then
export TORCH_CUDA_ARCH_LIST="3.5;5.0;6.0;6.1;7.0+PTX"
elif [[ ${CUDA_VERSION} == 10.* ]]; then
export TORCH_CUDA_ARCH_LIST="3.5;5.0;6.0;6.1;7.0;7.5+PTX"
elif [[ ${CUDA_VERSION} == 11.0* ]]; then
export TORCH_CUDA_ARCH_LIST="3.5;5.0;6.0;6.1;7.0;7.5;8.0+PTX"
elif [[ ${CUDA_VERSION} == 11.* ]]; then
export TORCH_CUDA_ARCH_LIST="3.5;5.0;6.0;6.1;7.0;7.5;8.0;8.6+PTX"
elif [[ ${CUDA_VERSION} == 12.* ]]; then
export TORCH_CUDA_ARCH_LIST="5.0;5.2;5.3;6.0;6.1;6.2;7.0;7.2;7.5;8.0;8.6;8.7;8.9;9.0+PTX"
else
echo "unsupported cuda version."
exit 1
fi
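For builds driven from Python, the same lookup can be sketched as follows. The table simply copies the values from the shell script above; parse_cuda_version and arch_list_for are hypothetical names, and the parser only assumes that the nvcc output contains the text "release X.Y", as the sed expression above does.

```python
import re

# (version prefix, arch list) pairs, checked in order like the elif chain above.
ARCH_BY_CUDA_PREFIX = [
    ("9.0", "3.5;5.0;6.0;7.0+PTX"),
    ("9.2", "3.5;5.0;6.0;6.1;7.0+PTX"),
    ("10.", "3.5;5.0;6.0;6.1;7.0;7.5+PTX"),
    ("11.0", "3.5;5.0;6.0;6.1;7.0;7.5;8.0+PTX"),
    ("11.", "3.5;5.0;6.0;6.1;7.0;7.5;8.0;8.6+PTX"),
    ("12.", "5.0;5.2;5.3;6.0;6.1;6.2;7.0;7.2;7.5;8.0;8.6;8.7;8.9;9.0+PTX"),
]


def parse_cuda_version(nvcc_output):
    """Extract "major.minor" from the text printed by `nvcc --version`."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    if match is None:
        raise ValueError("could not find a CUDA release number")
    return match.group(1)


def arch_list_for(cuda_version):
    """Return the arch list for a CUDA version, or raise if it is not in the table."""
    for prefix, arch_list in ARCH_BY_CUDA_PREFIX:
        if cuda_version.startswith(prefix):
            return arch_list
    raise ValueError(f"unsupported cuda version: {cuda_version}")
```

The result can then be placed in os.environ["TORCH_CUDA_ARCH_LIST"] before the extension build starts.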
Have you solved this issue?
I solved this by running:
TORCH_CUDA_ARCH_LIST="6.1+PTX" python setup.py install
for my GTX 1080 Ti. The compute capability 6.1 is the value listed for the 1080 Ti at https://developer.nvidia.com/cuda-gpus
You should find everything you need on this link (go to section CUDA-Enabled NVIDIA Quadro and NVIDIA RTX)
Is torch.cuda.is_available() False? I have had this only when I try to compile with a broken install of pytorch or cuda.
Which cuda and pytorch version did you use?
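To answer both questions in one go, a small report such as the sketch below can be pasted into the issue. cuda_report is a hypothetical helper name; it only reads torch.__version__, torch.version.cuda, torch.cuda.is_available() and torch.cuda.device_count(), and it degrades gracefully when torch is not installed.

```python
def cuda_report():
    """Collect the PyTorch and CUDA facts that matter when diagnosing this error."""
    try:
        import torch
    except ImportError:
        return {"torch_installed": False}
    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # CUDA version PyTorch was built with; None for CPU-only builds.
        "built_with_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
        "device_count": torch.cuda.device_count(),
    }
```

Running print(cuda_report()) inside the build environment shows whether torch can see a GPU there at all.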
I had the same error running in WSL on Windows. The above solutions of setting the TORCH_CUDA_ARCH_LIST environment variable fixed the issue.
How can this problem be solved on the Windows platform? @gaetan-landreau @ClementPinard
I got CUDA working inside of Docker on Windows 10 thanks to the instructions here and a little help from ChatGPT.
The issue is, as @earor-R said, that you can figure out the TORCH_CUDA_ARCH_LIST but the GPU still isn't available during docker build. You can, however, make it available during docker run by adding --gpus=all.
So you can automate the first half of the setup in a Dockerfile like this:
FROM nvidia/cuda:11.7.1-devel-ubuntu22.04
WORKDIR /srv
RUN apt update && apt install -y curl build-essential git
RUN curl -sL "https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh" > /tmp/miniconda.sh
RUN bash /tmp/miniconda.sh -b -p /opt/miniconda
ENV PATH="/opt/miniconda/bin:$PATH"
RUN pip install torch torchvision torchaudio
RUN git clone https://github.com/oobabooga/text-generation-webui .
RUN mkdir /srv/repositories
RUN cd /srv/repositories && git clone https://github.com/oobabooga/GPTQ-for-LLaMa.git -b cuda
Then build it:
docker build . -t oobabooga --progress=plain
Then run it: give the container a name, add --gpus all, and don't add --rm:
docker run --gpus all -it --name temp-container oobabooga /bin/bash
Then, once inside, you can get the compute capability like @earor-R said and finish the install:
python -c 'import torch; print(".".join(map(str, torch.cuda.get_device_capability(0))))'
export TORCH_CUDA_ARCH_LIST="8.6+PTX"
cd /srv/repositories/GPTQ-for-LLaMa && python setup_cuda.py install
Then exit the container and commit it back into an image:
docker commit temp-container oobabooga-run
And then finally you can run it:
docker run -it --gpus=all --rm -p 7860:7860 --mount "type=bind,src=$(wslpath -w text-generation-webui/models),dst=/srv/models,readonly" oobabooga-run python server.py --auto-devices --chat --model=gpt4-x-alpaca-13b-native-4bit-128g --wbits=4 --groupsize=128 --gpu-memory=18 --listen
I wish the build were easier to automate so that this stays maintainable, but that's the best I've got right now.
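You can use the following script to list CUDA architectures:
import torch
torch.cuda.get_arch_list()
You will get something like ['sm_37', 'sm_50', 'sm_60', 'sm_70', 'sm_75', 'sm_80', 'sm_86']
and you will have to convert this into "3.7 5.0 6.0 7.0 7.5 8.0 8.6+PTX". Note that torch.cuda.get_arch_list() returns the architectures your PyTorch build was compiled for, not the architecture of the installed GPU.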
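The conversion step can be sketched as a small function. arch_names_to_list is a hypothetical name, and the separator parameter is there only because both the space-separated and the semicolon-separated forms appear in this thread.

```python
import re


def arch_names_to_list(names, separator=";"):
    """Convert names like "sm_86" into a TORCH_CUDA_ARCH_LIST-style string."""
    versions = []
    for name in names:
        # The last digit is the minor version, everything before it the major.
        match = re.fullmatch(r"sm_(\d+)(\d)[a-z]?", name)
        if match is None:
            continue  # skip entries that are not of the form sm_XY
        versions.append((int(match.group(1)), int(match.group(2))))
    if not versions:
        raise ValueError("no sm_XY entries found")
    archs = [f"{major}.{minor}" for major, minor in sorted(set(versions))]
    archs[-1] += "+PTX"
    return separator.join(archs)
```

For the list shown above, arch_names_to_list(names, separator=" ") gives the space-separated string from this comment.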