Comments (15)
One year later... it's finally automated! We decided to use GitLab CI since it gives us more control over what we can do. Example of a pipeline run: https://gitlab.com/nvidia/cuda/pipelines/5876874
With GitLab CI it will also be possible to add our own machines to run GPU tests on the generated images. We already do this internally.
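For readers who want to replicate this setup, here is a minimal `.gitlab-ci.yml` sketch. The image names, registry URL, and the `gpu` runner tag are all assumptions for illustration, not the actual NVIDIA pipeline configuration:

```yaml
# Hypothetical sketch: build an image, then run a GPU smoke test on a
# self-hosted runner tagged "gpu". Names and paths are placeholders.
stages:
  - build
  - test

build:cuda-7.5-runtime:
  stage: build
  script:
    - docker build -t registry.example.com/cuda:7.5-runtime ubuntu-14.04/cuda/7.5/runtime
    - docker push registry.example.com/cuda:7.5-runtime

test:gpu:
  stage: test
  tags:
    - gpu        # only runners registered with this tag (i.e. with a GPU) pick up the job
  script:
    - docker run --rm registry.example.com/cuda:7.5-runtime nvidia-smi
```

The `tags:` field is what lets you route jobs to your own GPU machines while everything else runs on shared runners.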
Closing, finally.
from nvidia-docker.
Yes, it's definitely on our list!
btw, sometimes it's easier to set up CircleCI and push the build to the Hub instead
Another argument for automated repos: for others who create automated repos that build from your image, it becomes trivial to enable a triggered build within the Docker Hub ecosystem. So when the NVIDIA image updates with fixes, so do users' images. The same could be done using web hooks and API calls, but keeping it simple with the Docker Hub interface makes it pleasant for newer users.
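For comparison, the web-hook route is not much code either. This is a hypothetical example: `USER`, `REPO`, and `TOKEN` are placeholders for a build trigger created in the repo's Docker Hub settings:

```shell
# Hypothetical values; a real token comes from the Hub's "Build Triggers" page.
USER="example"
REPO="cuda"
TOKEN="00000000-0000-0000-0000-000000000000"
TRIGGER_URL="https://registry.hub.docker.com/u/${USER}/${REPO}/trigger/${TOKEN}/"

# A downstream hook (e.g. git post-receive) would rebuild the repo with:
#   curl -s -X POST "${TRIGGER_URL}"
echo "${TRIGGER_URL}"
```

The point of the comment above stands, though: a checkbox in the Hub UI is friendlier than asking every downstream user to wire this up.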
The Phoronix Test Suite comes with OpenCL support, so it could be useful for regression testing of the automated repo: http://www.phoronix.com/scan.php?page=article&item=nvidia-amd-opencl-2015&num=1
@ruffsl For `osrf/ros`, it looks like you also have multiple Dockerfiles with dependencies that mandate a specific build order. How did you set up an automated build with these constraints? All the builds seem to start in parallel, and thus I can't create the `devel` images properly since they depend on the `runtime` images.
@UniqueFool The problem with CI and testing is that I'm not currently aware of an open-source CI solution that would allow us to run GPU tests. We have internal solutions, of course, but they would be more complex to integrate with GitHub. I will continue evaluating the options.
You can just run the build on CI, without testing:
https://circleci.com/docs/docker#deployment-to-a-docker-registry
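For the era of CircleCI that page documents, the build-and-push flow fits in a short `circle.yml`. This is a sketch under assumed names: the image tag and the `$DOCKER_USER`/`$DOCKER_PASS` environment variables are placeholders, not anything from the nvidia-docker repo:

```yaml
# Hypothetical circle.yml (CircleCI 1.0 syntax); names are placeholders.
machine:
  services:
    - docker

dependencies:
  override:
    - docker build -t example/cuda:7.5-runtime ubuntu-14.04/cuda/7.5/runtime

deployment:
  hub:
    branch: master
    commands:
      - docker login -u $DOCKER_USER -p $DOCKER_PASS
      - docker push example/cuda:7.5-runtime
```

The `deployment` section only runs on the named branch, so pushes to the Hub happen for `master` builds and not for every PR.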
Sure, but it would be more convenient to deploy and test with the same solution. But indeed, the short-term solution could be to only automate the builds for now.
You need to build it on CircleCI before testing anyway ;) So building + uploading is a good first step.
@flx42, yes, I've noticed this. Looking at the build details that record the build logs, I see that the builds for each tag were triggered roughly simultaneously, with one of my higher-level tags starting first. I'm rather sure the official repos do not suffer the same shortcoming (although perhaps I've not noticed, thanks to how often the upstream Ubuntu image rebuilds and triggers everything else), but I'm uncertain how to enforce the same build order in a single user repo.
I've asked about this before, but it was suggested to just re-trigger the build until cascading images reach a steady state; I think this is a bit silly. Another approach I first used was to break up my tags into separate repos, as suggested here. This was a bit of a hassle to manage, but it did ensure that a sequential order was followed. Perhaps the cuda runtime and development Docker repos could be separate, but the lack of tag-level (vs repo-level) triggering would hamper further tag-specific builds. Let me dig around; perhaps something has come along since I last looked into this. Pinging @yosifkit or @tianon ?
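If you do end up driving the builds yourself, one workaround sketch (not something the Hub offers) is to state the parent/child relationships explicitly and let `tsort` compute a dependency-respecting order, then build or trigger each tag sequentially. The tag names below are illustrative:

```shell
# Each input line is "parent child"; tsort emits a valid sequential order
# in which no image is built before its parent.
order=$(tsort <<'EOF'
ubuntu 7.5-runtime
7.5-runtime 7.5-devel
ubuntu 7.0-runtime
7.0-runtime 7.0-devel
EOF
)
echo "${order}"
# Every runtime tag is guaranteed to come before the devel tag built on it.
```

A script could then iterate over `${order}` and run `docker build`/`docker push` (or POST a trigger) for each tag in turn.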
I've not seen any change on the Docker Hub that would allow images to depend upon another tag in the same repo. This is one of the reasons that the official images do not use automated builds.
@ruffsl It looks like it's worse than this. When I start my build using a POST request, all the builds start in parallel and then all the `devel` builds immediately fail since they depend on the `runtime` images.
Since all the `runtime` Dockerfiles for 6.5, 7.0, and 7.5 start with these lines:
https://github.com/NVIDIA/nvidia-docker/blob/master/ubuntu-14.04/cuda/7.5/runtime/Dockerfile#L1-L10
my `runtime` images should be able to share the layers for those commands, but since they are built in parallel, that's not the case (except for the `ubuntu` layers, obviously):
$ docker history flx42/cuda:7.0-runtime
IMAGE CREATED CREATED BY SIZE COMMENT
[...]
9a4be293a841 19 minutes ago /bin/sh -c #(nop) ENV CUDA_VERSION=7.0 0 B
7410b9a2414b 19 minutes ago /bin/sh -c apt-key adv --fetch-keys http://de 25.66 kB
bac2ad43afa4 19 minutes ago /bin/sh -c #(nop) ENV NVIDIA_GPGKEY_FPR=889be 0 B
18e862dcdeec 19 minutes ago /bin/sh -c #(nop) ENV NVIDIA_GPGKEY_SUM=bd841 0 B
62e3850cc26d 19 minutes ago /bin/sh -c #(nop) MAINTAINER NVIDIA CORPORATI 0 B
89d5d8e8bafb 2 days ago /bin/sh -c #(nop) CMD ["/bin/bash"] 0 B
e24428725dd6 2 days ago /bin/sh -c sed -i 's/^#\s*\(deb.*universe\)$/ 1.895 kB
1796d1c62d0c 2 days ago /bin/sh -c echo '#!/bin/sh' > /usr/sbin/polic 194.5 kB
0bf056161913 2 days ago /bin/sh -c #(nop) ADD file:9b5ba3935021955492 187.7 MB
$ docker history flx42/cuda:7.5-runtime
IMAGE CREATED CREATED BY SIZE COMMENT
[...]
92aaf1c5e65b 19 minutes ago /bin/sh -c #(nop) ENV CUDA_VERSION=7.5 0 B
83968d3d71cb 19 minutes ago /bin/sh -c apt-key adv --fetch-keys http://de 25.66 kB
ee4242ccf3fd 19 minutes ago /bin/sh -c #(nop) ENV NVIDIA_GPGKEY_FPR=889be 0 B
919a687073ec 19 minutes ago /bin/sh -c #(nop) ENV NVIDIA_GPGKEY_SUM=bd841 0 B
04c48fe576ca 19 minutes ago /bin/sh -c #(nop) MAINTAINER NVIDIA CORPORATI 0 B
89d5d8e8bafb 2 days ago /bin/sh -c #(nop) CMD ["/bin/bash"] 0 B
e24428725dd6 2 days ago /bin/sh -c sed -i 's/^#\s*\(deb.*universe\)$/ 1.895 kB
1796d1c62d0c 2 days ago /bin/sh -c echo '#!/bin/sh' > /usr/sbin/polic 194.5 kB
0bf056161913 2 days ago /bin/sh -c #(nop) ADD file:9b5ba3935021955492 187.7 MB
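For what it's worth, the overlap between the two histories above can be checked mechanically. This is just an illustration using the layer IDs printed above; against a live daemon you would feed `docker history -q <image>` output in directly:

```shell
# Layer IDs copied from the two histories above (top to bottom).
layers_70="9a4be293a841 7410b9a2414b bac2ad43afa4 18e862dcdeec 62e3850cc26d \
89d5d8e8bafb e24428725dd6 1796d1c62d0c 0bf056161913"
layers_75="92aaf1c5e65b 83968d3d71cb ee4242ccf3fd 919a687073ec 04c48fe576ca \
89d5d8e8bafb e24428725dd6 1796d1c62d0c 0bf056161913"

# One ID per line, sorted; uniq -d keeps only IDs present in both lists.
shared=$(printf '%s\n' ${layers_70} ${layers_75} | sort | uniq -d)
echo "${shared}"
# Only the four ubuntu base layers are shared; the five CUDA layers diverge
# even though they were produced by identical Dockerfile lines.
```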
Images `runtime` and `7.5-runtime` use the same Dockerfile, but for the same reason they have no common layers except the ones from `ubuntu`. I didn't find a way to output multiple tags from a single automated build.
In my personal github (https://github.com/flx42/nvidia-docker) I modified the `devel` images to do `FROM flx42/cuda:tag` instead of `FROM cuda:tag`. This should allow me to build my `devel` images with a second POST request, right?
Well, yes, but it's rebuilding all the images, even the `runtime` images. This is costly and it also means that my `runtime` images will get overwritten. My `devel` images will build this time, but they will be built on the older `runtime` images:
$ docker history flx42/cuda:7.5-devel
IMAGE CREATED CREATED BY SIZE COMMENT
[...]
92aaf1c5e65b 30 minutes ago /bin/sh -c #(nop) ENV CUDA_VERSION=7.5 0 B
83968d3d71cb 30 minutes ago /bin/sh -c apt-key adv --fetch-keys http://de 25.66 kB
ee4242ccf3fd 30 minutes ago /bin/sh -c #(nop) ENV NVIDIA_GPGKEY_FPR=889be 0 B
919a687073ec 30 minutes ago /bin/sh -c #(nop) ENV NVIDIA_GPGKEY_SUM=bd841 0 B
04c48fe576ca 30 minutes ago /bin/sh -c #(nop) MAINTAINER NVIDIA CORPORATI 0 B
89d5d8e8bafb 2 days ago /bin/sh -c #(nop) CMD ["/bin/bash"] 0 B
e24428725dd6 2 days ago /bin/sh -c sed -i 's/^#\s*\(deb.*universe\)$/ 1.895 kB
1796d1c62d0c 2 days ago /bin/sh -c echo '#!/bin/sh' > /usr/sbin/polic 194.5 kB
0bf056161913 2 days ago /bin/sh -c #(nop) ADD file:9b5ba3935021955492
The first layers are the same as above for `flx42/cuda:7.5-runtime`. But now `flx42/cuda:7.5-runtime` is different:
$ docker pull flx42/cuda:7.5-runtime
7.5-runtime: Pulling from flx42/cuda
[redownloading everything]
$ docker history flx42/cuda:7.5-runtime
IMAGE CREATED CREATED BY SIZE COMMENT
[...]
d6f056622afd 11 minutes ago /bin/sh -c #(nop) ENV CUDA_VERSION=7.5 0 B
52756da1d17b 11 minutes ago /bin/sh -c apt-key adv --fetch-keys http://de 25.66 kB
b179bdd62a38 11 minutes ago /bin/sh -c #(nop) ENV NVIDIA_GPGKEY_FPR=889be 0 B
27ffaa5d4438 11 minutes ago /bin/sh -c #(nop) ENV NVIDIA_GPGKEY_SUM=bd841 0 B
3cec27432703 11 minutes ago /bin/sh -c #(nop) MAINTAINER NVIDIA CORPORATI 0 B
89d5d8e8bafb 2 days ago /bin/sh -c #(nop) CMD ["/bin/bash"] 0 B
e24428725dd6 2 days ago /bin/sh -c sed -i 's/^#\s*\(deb.*universe\)$/ 1.895 kB
1796d1c62d0c 2 days ago /bin/sh -c echo '#!/bin/sh' > /usr/sbin/polic 194.5 kB
0bf056161913 2 days ago /bin/sh -c #(nop) ADD file:9b5ba3935021955492 187.7 MB
So, we are needlessly duplicating layers that are physically the same. And since everything is rebuilt all the time, the user will have to fetch new layers even when the image they use didn't change.
@flx42, you are correct. Given the limitations of the automated build mechanics, it doesn't currently seem possible to host an automated repo on Docker Hub. As yosifkit mentioned, official images are not built using the same rules, so once the commits to the cuda dockerfiles settle down, that would be a nice channel for distributing images updated from upstream sources.
This all takes the wind out of the sails for CI-testing the master branch, but I suppose sheerun's or UniqueFool's suggestions, along with the Makefiles you've already written, would work well to automate pushing current images to the NVIDIA org repo for public review on triggered events from the master branch.
Let's give up on the Docker Hub automated repo for now. CI remains an option so I will not close this issue yet.
@flx42, on a side note, you may want to keep these links around or put them in the readme/wiki somewhere for others (at least for the ubuntu tags):
I just added something similar for the official ros repo and found it a nice method for visually verifying parent image lineage.