Comments (18)
@MKrupauskas would switching to /libnvidia-container/debian10/amd64 as the source of truth for the package be a solution on your end?
Our intent with the official documentation was to make downloading the repository list file work across different distributions, but the .list files would locally refer to the lowest compatible distribution for a given package flavor.
In the Debian case, this is debian10. The motivation for the changes that are causing the breakages is called out in NVIDIA/nvidia-container-toolkit#89 (comment)
from libnvidia-container.
All these user complaints would be solved with a symlink 😉
from libnvidia-container.
@jonathanjsimon it's not quite as simple as that. A symlink duplicates the contents of the target folder at the link location when publishing these repos through GitHub Pages. The reason this optimisation was performed was that the resultant artifact is already too large, causing the Pages deployment to fail, which means that new packages are not available.
We are aware that there may be ways to increase the timeout using custom pages deployments. If you have experience in how to do this, suggestions are welcome.
from libnvidia-container.
While we did work around the issue by pointing our source list to Debian 10, the solution isn't ideal. If the only issues are the artifact size and build timeouts, I think we should address them for the sake of having a Debian repo that matches the repo standard and user expectations.
Could you share some logs showing what exactly times out if we correctly symlink the distribution directories? Looking at the GitHub Actions docs, the steps themselves shouldn't time out for 360m if the default isn't overridden: https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobsjob_idstepstimeout-minutes
from libnvidia-container.
@MKrupauskas I have made the symlink changes to my personal mirror elezar@98ee43d.
The GitHub actions deploying this is here:
A previous action shows the archive size warning:
The following is an example of a deployment that failed due to a timeout, although this was using the "Deploy from branch" pages deployment and not an explicit workflow as we are using now.
from libnvidia-container.
We have updated our repository structure and installation instructions to make use of generic debian packages. The distribution name no longer affects the instructions.
Please see https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html and reopen this issue if there are still problems.
from libnvidia-container.
Hi there,
when using tools like apt-mirror or apt-mirror2, the file Packages is always empty after being downloaded from https://nvidia.github.io/libnvidia-container/stable/deb/amd64/Packages, but it works in a browser. Do you have any idea where to search for a solution?
from libnvidia-container.
@HenriWahl I don't know what apt-mirror expects. This is the file tree as deployed to GitHub Pages: https://github.com/NVIDIA/libnvidia-container/tree/gh-pages/stable/deb/amd64
If there is additional metadata required by the tooling, we could consider adding it.
from libnvidia-container.
@elezar I am not sure what is missing, looks good to me.
The only hint I have is that it works with https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64, so maybe there is some difference.
Edit: yes, there are some differences:
- https://nvidia.github.io/libnvidia-container/stable/deb/amd64/Packages contains no Packages.gz but Packages.xz
- https://nvidia.github.io/libnvidia-container/stable/deb/amd64/Packages has no Release and Release.gpg files
Edit 2: I found that this is an older problem: NVIDIA/nvidia-docker#730
from libnvidia-container.
Those are useful pointers. I will spend some time investigating this.
from libnvidia-container.
I have just tried the following in a clean ubuntu container:
- Installed apt-mirror
- Edited /etc/apt/mirror.list to only reference:
deb https://nvidia.github.io/libnvidia-container/experimental/deb/amd64 /
- When running apt-mirror I then see:
Processing indexes: [Psh: 1: xz: not found
]
- I then installed xz-utils:
apt-get install -y xz-utils
- When I now ran apt-mirror the repo is mirrored:
$ ls /var/spool/apt-mirror/mirror/
nvidia.github.io
- And in the folders themselves:
ls /var/spool/apt-mirror/mirror/nvidia.github.io/libnvidia-container/experimental/deb/amd64/
Packages libnvidia-container-tools_1.15.0~rc.3-1_amd64.deb nvidia-container-toolkit-base_1.14.0~rc.2-1_amd64.deb
Packages.xz libnvidia-container1-dbg_1.14.0~rc.2-1_amd64.deb nvidia-container-toolkit-base_1.15.0~rc.1-1_amd64.deb
libnvidia-container-dev_1.14.0~rc.2-1_amd64.deb libnvidia-container1-dbg_1.15.0~rc.1-1_amd64.deb nvidia-container-toolkit-base_1.15.0~rc.2-1_amd64.deb
libnvidia-container-dev_1.15.0~rc.1-1_amd64.deb libnvidia-container1-dbg_1.15.0~rc.2-1_amd64.deb nvidia-container-toolkit-base_1.15.0~rc.3-1_amd64.deb
libnvidia-container-dev_1.15.0~rc.2-1_amd64.deb libnvidia-container1-dbg_1.15.0~rc.3-1_amd64.deb nvidia-container-toolkit_1.14.0~rc.2-1_amd64.deb
libnvidia-container-dev_1.15.0~rc.3-1_amd64.deb libnvidia-container1_1.14.0~rc.2-1_amd64.deb nvidia-container-toolkit_1.15.0~rc.1-1_amd64.deb
libnvidia-container-tools_1.14.0~rc.2-1_amd64.deb libnvidia-container1_1.15.0~rc.1-1_amd64.deb nvidia-container-toolkit_1.15.0~rc.2-1_amd64.deb
libnvidia-container-tools_1.15.0~rc.1-1_amd64.deb libnvidia-container1_1.15.0~rc.2-1_amd64.deb nvidia-container-toolkit_1.15.0~rc.3-1_amd64.deb
libnvidia-container-tools_1.15.0~rc.2-1_amd64.deb libnvidia-container1_1.15.0~rc.3-1_amd64.deb
Could you confirm that xz-utils is installed on your system?
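A quick way to check for this before running a mirror job is sketched below. The helper names decompressor_for and check_index_tool are hypothetical, and the mapping from file suffix to tool is an assumption about what a mirroring script shells out to, not something taken from apt-mirror itself.

```shell
#!/bin/sh
# Map an index file name to the tool needed to unpack it.
# decompressor_for and check_index_tool are hypothetical helper names.
decompressor_for() {
    case "$1" in
        *.xz)  echo xz ;;
        *.gz)  echo gzip ;;
        *.bz2) echo bzip2 ;;
        *)     echo none ;;
    esac
}

# Report whether the tool for the given index file is on PATH.
check_index_tool() {
    tool=$(decompressor_for "$1")
    if [ "$tool" = none ]; then
        echo "$1: no decompressor needed"
    elif command -v "$tool" >/dev/null 2>&1; then
        echo "$1: $tool found"
    else
        echo "$1: $tool missing"
        return 1
    fi
}
```

Running check_index_tool Packages.xz in the mirror image reports whether xz is available; on Debian and Ubuntu it is provided by the xz-utils package mentioned above.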
from libnvidia-container.
Hi @elezar - thanks for your investigations!
I can confirm that my apt-mirror image did NOT have the package xz-utils installed, but now it works WITH it!
Great job! 👍
from libnvidia-container.
@elezar one thing is left: now the apt command on a client complains that there is no Release file.
I see it is even missing at https://github.com/NVIDIA/libnvidia-container/tree/gh-pages/stable/deb/amd64.
from libnvidia-container.
From the following documentation: https://wiki.debian.org/DebianRepository/Format#Flat_Repository_Format it is unclear whether a Release file is actually required. It seems that either InRelease or Release must be specified.
Can you give more information on what apt commands you're using and what the errors are?
from libnvidia-container.
After an apt update I get this:
Ign:5 https://mirror-apt.local/nvidia-container-toolkit-jammy InRelease
Ign:6 https://mirror-apt.local/nvidia-cuda-jammy InRelease
Err:7 https://mirror-apt.local/nvidia-container-toolkit-jammy Release
404 Not Found [IP: 10.10.10.10 443]
Hit:8 https://mirror-apt.local/nvidia-cuda-jammy Release
Reading package lists... Done
E: The repository 'https://mirror-apt.local/nvidia-container-toolkit-jammy Release' does not have a Release file.
N: Updating from such a repository can't be done securely, and is therefore disabled by default.
InRelease and Release are both tried. Meanwhile, I found that neither of them exists in my local mirror, as in your listing above.
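To see at a glance which metadata files a mirrored flat repository directory actually contains, something like the following could be used. It is a minimal sketch: report_repo_metadata is a hypothetical helper, and the list of file names is just the set mentioned in this thread.

```shell
#!/bin/sh
# List which apt metadata files exist in a flat repository directory.
# report_repo_metadata is a hypothetical helper name.
# $1: path to the directory that holds Packages and the .deb files.
report_repo_metadata() {
    dir=$1
    for name in InRelease Release Release.gpg Packages Packages.gz Packages.xz; do
        if [ -f "$dir/$name" ]; then
            echo "$name present"
        else
            echo "$name missing"
        fi
    done
}
```

For example, report_repo_metadata /var/spool/apt-mirror/mirror/nvidia.github.io/libnvidia-container/experimental/deb/amd64 prints one line per file, which makes a missing Release or InRelease easy to spot.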
from libnvidia-container.
Does:
sudo apt-get update --allow-insecure-repositories
work as expected?
from libnvidia-container.
Yes it does.
The problem seems to be caused by apt-mirror, according to apt-mirror/apt-mirror#156. It seems to miss this file on flat repositories. I will look for it or an alternative next week. Thanks for your commitment!
from libnvidia-container.
I think you can get around this by marking the local mirror as trusted or by ensuring that the public key for our repos is also downloaded. For example, as per our documentation https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html#installing-with-apt:
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&& curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
Note that the lines effectively look like:
deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://nvidia.github.io/libnvidia-container/stable/deb/$(ARCH) /
in this case, and something similar would need to be set up for your mirrors.
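As a rough sketch of what that could look like for a mirror, the helper below prints a source entry in either form. mirror_source_line is a hypothetical name, and the mirror URL in the usage note is the one from the apt output earlier in this thread. Note that the signed-by form still needs a signed Release or InRelease file on the mirror, so for a mirror that lacks them only the trusted form is likely to help.

```shell
#!/bin/sh
# Print a one-line apt source entry for a flat repository.
# mirror_source_line is a hypothetical helper name.
# $1: repository URL, $2: keyring path, or the word "trusted".
mirror_source_line() {
    url=$1
    key=$2
    if [ "$key" = trusted ]; then
        # Skips signature checks for this source; only for mirrors you control.
        echo "deb [trusted=yes] $url /"
    else
        echo "deb [signed-by=$key] $url /"
    fi
}
```

For example, mirror_source_line https://mirror-apt.local/nvidia-container-toolkit-jammy trusted prints a line that can be piped to sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list.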
from libnvidia-container.