Comments (15)
One year later... it's finally automated! We decided to use GitLab CI since it gives us more control over what we can do. Example of a pipeline run: https://gitlab.com/nvidia/cuda/pipelines/5876874
With GitLab CI it will also be possible to add our own machines to run GPU tests on the generated images. We already do this internally.
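For readers who want to replicate this setup, here is a minimal `.gitlab-ci.yml` sketch. The image names, registry URL, and the `gpu` runner tag are all assumptions for illustration, not the actual NVIDIA pipeline configuration:

```yaml
# Hypothetical sketch: build an image, then run a GPU smoke test on a
# self-hosted runner tagged "gpu". Names and paths are placeholders.
stages:
  - build
  - test

build:cuda-7.5-runtime:
  stage: build
  script:
    - docker build -t registry.example.com/cuda:7.5-runtime ubuntu-14.04/cuda/7.5/runtime
    - docker push registry.example.com/cuda:7.5-runtime

test:gpu:
  stage: test
  tags:
    - gpu        # only runners registered with this tag (i.e. with a GPU) pick up the job
  script:
    - docker run --rm registry.example.com/cuda:7.5-runtime nvidia-smi
```

The `tags:` field is what lets you route jobs to your own GPU machines while everything else runs on shared runners.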
Closing, finally.
from nvidia-docker.
Yes, it's definitely on our list!
btw, sometimes it's easier to set up CircleCI and push the build to the Hub instead
Another argument for automated repos: for others who create automated repos that build from your image, it becomes trivial to enable a triggered build within the Docker Hub ecosystem. So when the NVIDIA image updates with fixes, so do users' images. The same could be done using web hooks and API calls, but keeping it simple with the Docker Hub interface makes it pleasant for newer users.
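For comparison, the web-hook route is not much code either. This is a hypothetical example: `USER`, `REPO`, and `TOKEN` are placeholders for a build trigger created in the repo's Docker Hub settings:

```shell
# Hypothetical values; a real token comes from the Hub's "Build Triggers" page.
USER="example"
REPO="cuda"
TOKEN="00000000-0000-0000-0000-000000000000"
TRIGGER_URL="https://registry.hub.docker.com/u/${USER}/${REPO}/trigger/${TOKEN}/"

# A downstream hook (e.g. git post-receive) would rebuild the repo with:
#   curl -s -X POST "${TRIGGER_URL}"
echo "${TRIGGER_URL}"
```

The point of the comment above stands, though: a checkbox in the Hub UI is friendlier than asking every downstream user to wire this up.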
The Phoronix Test Suite comes with OpenCL support, so it could be useful for regression testing of the automated repo: http://www.phoronix.com/scan.php?page=article&item=nvidia-amd-opencl-2015&num=1
@ruffsl For `osrf/ros`, it looks like you also have multiple Dockerfiles with dependencies that mandate a specific build order. How did you set up an automated build with these constraints? All the builds seem to start in parallel, and thus I can't create the `devel` images properly since they depend on the `runtime` images.
@UniqueFool The problem with CI and testing is that I'm not currently aware of an open-source CI solution that would allow us to run GPU tests. We have internal solutions, of course, but they would be more complex to integrate with GitHub. I will continue evaluating the options.
You can just run the build on CI, without testing:
https://circleci.com/docs/docker#deployment-to-a-docker-registry
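For the era of CircleCI that page documents, the build-and-push flow fits in a short `circle.yml`. This is a sketch under assumed names: the image tag and the `$DOCKER_USER`/`$DOCKER_PASS` environment variables are placeholders, not anything from the nvidia-docker repo:

```yaml
# Hypothetical circle.yml (CircleCI 1.0 syntax); names are placeholders.
machine:
  services:
    - docker

dependencies:
  override:
    - docker build -t example/cuda:7.5-runtime ubuntu-14.04/cuda/7.5/runtime

deployment:
  hub:
    branch: master
    commands:
      - docker login -u $DOCKER_USER -p $DOCKER_PASS
      - docker push example/cuda:7.5-runtime
```

The `deployment` section only runs on the named branch, so pushes to the Hub happen for `master` builds and not for every PR.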
Sure, but it would be more convenient to deploy and test with the same solution. But indeed, the short-term solution could be to only automate the builds for now.
You need to build it on CircleCI before testing anyway ;) So building + uploading is a good first step.
@flx42, yes, I've noticed this. Looking at the build details that record the build logs, I see that the builds for each tag were triggered roughly simultaneously, with one of my higher-level tags starting first. I'm rather sure the official repos do not suffer the same shortcoming (although perhaps I've not noticed, thanks to how often the upstream Ubuntu image rebuilds and triggers everything else), but I'm uncertain how to enforce the same build order in a single user repo.
I've asked about this before, but it was suggested to just re-trigger the build until cascading images reach a steady state; I think this is a bit silly. Another approach I first used was to break up my tags into separate repos, as suggested here. This was a bit of a hassle to manage, but it did ensure that a sequential order was followed. Perhaps the cuda runtime and development Docker repos could be separate, but the lack of tag-level (vs repo-level) triggering would hamper further tag-specific builds. Let me dig around; perhaps something has come along since I last looked into this. Pinging @yosifkit or @tianon ?
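If you do end up driving the builds yourself, one workaround sketch (not something the Hub offers) is to state the parent/child relationships explicitly and let `tsort` compute a dependency-respecting order, then build or trigger each tag sequentially. The tag names below are illustrative:

```shell
# Each input line is "parent child"; tsort emits a valid sequential order
# in which no image is built before its parent.
order=$(tsort <<'EOF'
ubuntu 7.5-runtime
7.5-runtime 7.5-devel
ubuntu 7.0-runtime
7.0-runtime 7.0-devel
EOF
)
echo "${order}"
# Every runtime tag is guaranteed to come before the devel tag built on it.
```

A script could then iterate over `${order}` and run `docker build`/`docker push` (or POST a trigger) for each tag in turn.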
I've not seen any change on the Docker Hub that would allow images to depend upon another tag in the same repo. This is one of the reasons that the official images do not use automated builds.
@ruffsl It looks like it's worse than this. When I start my build using a POST request, all the builds start in parallel and then all the `devel` builds immediately fail since they depend on the `runtime` images.
Since all the `runtime` Dockerfiles for 6.5, 7.0, and 7.5 start with these lines:
https://github.com/NVIDIA/nvidia-docker/blob/master/ubuntu-14.04/cuda/7.5/runtime/Dockerfile#L1-L10
my `runtime` images should be able to share the layers for those commands, but since they are built in parallel, that's not the case (except for the `ubuntu` layers, obviously):
$ docker history flx42/cuda:7.0-runtime
IMAGE CREATED CREATED BY SIZE COMMENT
[...]
9a4be293a841 19 minutes ago /bin/sh -c #(nop) ENV CUDA_VERSION=7.0 0 B
7410b9a2414b 19 minutes ago /bin/sh -c apt-key adv --fetch-keys http://de 25.66 kB
bac2ad43afa4 19 minutes ago /bin/sh -c #(nop) ENV NVIDIA_GPGKEY_FPR=889be 0 B
18e862dcdeec 19 minutes ago /bin/sh -c #(nop) ENV NVIDIA_GPGKEY_SUM=bd841 0 B
62e3850cc26d 19 minutes ago /bin/sh -c #(nop) MAINTAINER NVIDIA CORPORATI 0 B
89d5d8e8bafb 2 days ago /bin/sh -c #(nop) CMD ["/bin/bash"] 0 B
e24428725dd6 2 days ago /bin/sh -c sed -i 's/^#\s*\(deb.*universe\)$/ 1.895 kB
1796d1c62d0c 2 days ago /bin/sh -c echo '#!/bin/sh' > /usr/sbin/polic 194.5 kB
0bf056161913 2 days ago /bin/sh -c #(nop) ADD file:9b5ba3935021955492 187.7 MB
$ docker history flx42/cuda:7.5-runtime
IMAGE CREATED CREATED BY SIZE COMMENT
[...]
92aaf1c5e65b 19 minutes ago /bin/sh -c #(nop) ENV CUDA_VERSION=7.5 0 B
83968d3d71cb 19 minutes ago /bin/sh -c apt-key adv --fetch-keys http://de 25.66 kB
ee4242ccf3fd 19 minutes ago /bin/sh -c #(nop) ENV NVIDIA_GPGKEY_FPR=889be 0 B
919a687073ec 19 minutes ago /bin/sh -c #(nop) ENV NVIDIA_GPGKEY_SUM=bd841 0 B
04c48fe576ca 19 minutes ago /bin/sh -c #(nop) MAINTAINER NVIDIA CORPORATI 0 B
89d5d8e8bafb 2 days ago /bin/sh -c #(nop) CMD ["/bin/bash"] 0 B
e24428725dd6 2 days ago /bin/sh -c sed -i 's/^#\s*\(deb.*universe\)$/ 1.895 kB
1796d1c62d0c 2 days ago /bin/sh -c echo '#!/bin/sh' > /usr/sbin/polic 194.5 kB
0bf056161913 2 days ago /bin/sh -c #(nop) ADD file:9b5ba3935021955492 187.7 MB
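For what it's worth, the overlap between the two histories above can be checked mechanically. This is just an illustration using the layer IDs printed above; against a live daemon you would feed `docker history -q <image>` output in directly:

```shell
# Layer IDs copied from the two histories above (top to bottom).
layers_70="9a4be293a841 7410b9a2414b bac2ad43afa4 18e862dcdeec 62e3850cc26d \
89d5d8e8bafb e24428725dd6 1796d1c62d0c 0bf056161913"
layers_75="92aaf1c5e65b 83968d3d71cb ee4242ccf3fd 919a687073ec 04c48fe576ca \
89d5d8e8bafb e24428725dd6 1796d1c62d0c 0bf056161913"

# One ID per line, sorted; uniq -d keeps only IDs present in both lists.
shared=$(printf '%s\n' ${layers_70} ${layers_75} | sort | uniq -d)
echo "${shared}"
# Only the four ubuntu base layers are shared; the five CUDA layers diverge
# even though they were produced by identical Dockerfile lines.
```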
Images `runtime` and `7.5-runtime` use the same Dockerfile, but for the same reason they have no common layers except the ones from `ubuntu`. I didn't find a way to output multiple tags from a single automated build.
In my personal github (https://github.com/flx42/nvidia-docker) I modified the `devel` images to do `FROM flx42/cuda:tag` instead of `FROM cuda:tag`. This should allow me to build my `devel` images with a second POST request, right?
Well, yes, but it's rebuilding all the images, even the `runtime` images. This is costly and it also means that my `runtime` images will get overwritten. My `devel` images will build this time, but they will be built on the older `runtime` images:
$ docker history flx42/cuda:7.5-devel
IMAGE CREATED CREATED BY SIZE COMMENT
[...]
92aaf1c5e65b 30 minutes ago /bin/sh -c #(nop) ENV CUDA_VERSION=7.5 0 B
83968d3d71cb 30 minutes ago /bin/sh -c apt-key adv --fetch-keys http://de 25.66 kB
ee4242ccf3fd 30 minutes ago /bin/sh -c #(nop) ENV NVIDIA_GPGKEY_FPR=889be 0 B
919a687073ec 30 minutes ago /bin/sh -c #(nop) ENV NVIDIA_GPGKEY_SUM=bd841 0 B
04c48fe576ca 30 minutes ago /bin/sh -c #(nop) MAINTAINER NVIDIA CORPORATI 0 B
89d5d8e8bafb 2 days ago /bin/sh -c #(nop) CMD ["/bin/bash"] 0 B
e24428725dd6 2 days ago /bin/sh -c sed -i 's/^#\s*\(deb.*universe\)$/ 1.895 kB
1796d1c62d0c 2 days ago /bin/sh -c echo '#!/bin/sh' > /usr/sbin/polic 194.5 kB
0bf056161913 2 days ago /bin/sh -c #(nop) ADD file:9b5ba3935021955492
The first layers are the same as above for `flx42/cuda:7.5-runtime`. But now `flx42/cuda:7.5-runtime` is different:
$ docker pull flx42/cuda:7.5-runtime
7.5-runtime: Pulling from flx42/cuda
[redownloading everything]
$ docker history flx42/cuda:7.5-runtime
IMAGE CREATED CREATED BY SIZE COMMENT
[...]
d6f056622afd 11 minutes ago /bin/sh -c #(nop) ENV CUDA_VERSION=7.5 0 B
52756da1d17b 11 minutes ago /bin/sh -c apt-key adv --fetch-keys http://de 25.66 kB
b179bdd62a38 11 minutes ago /bin/sh -c #(nop) ENV NVIDIA_GPGKEY_FPR=889be 0 B
27ffaa5d4438 11 minutes ago /bin/sh -c #(nop) ENV NVIDIA_GPGKEY_SUM=bd841 0 B
3cec27432703 11 minutes ago /bin/sh -c #(nop) MAINTAINER NVIDIA CORPORATI 0 B
89d5d8e8bafb 2 days ago /bin/sh -c #(nop) CMD ["/bin/bash"] 0 B
e24428725dd6 2 days ago /bin/sh -c sed -i 's/^#\s*\(deb.*universe\)$/ 1.895 kB
1796d1c62d0c 2 days ago /bin/sh -c echo '#!/bin/sh' > /usr/sbin/polic 194.5 kB
0bf056161913 2 days ago /bin/sh -c #(nop) ADD file:9b5ba3935021955492 187.7 MB
So, we are needlessly duplicating layers that are physically the same. And since everything is rebuilt all the time, the user will have to fetch new layers even when the image they use didn't change.
@flx42, you are correct. Given the limitations of the automated build mechanics, it doesn't currently seem possible to host an automated repo on Docker Hub. As yosifkit mentioned, official images are not built using the same rules, so once the commits to the cuda dockerfiles settle down, that would be a nice channel for distributing images updated from upstream sources.
This all takes the wind out of the sails for CI-testing the master branch, but I suppose sheerun's or UniqueFool's suggestions, along with the Makefiles you've already written, would work well to automate pushing current images to the NVIDIA org repo for public review on triggered events from the master branch.
Let's give up on the Docker Hub automated repo for now. CI remains an option so I will not close this issue yet.
@flx42, on a side note, you may want to keep these links around or put them in the readme/wiki somewhere for others (at least for the ubuntu tags):
I just added something similar for the official ros repo and found it a nice method for visually verifying parent image lineage.