
Comments (15)

flx42 avatar flx42 commented on May 20, 2024

One year after... it's finally automated! We decided to use GitLab CI since it gives us more control over what we can do. Example of a pipeline run: https://gitlab.com/nvidia/cuda/pipelines/5876874
With GitLab CI it will also be possible to add our own machines to run GPU tests on the generated images. We already do this internally.
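For reference, a self-hosted GPU machine can be targeted from a GitLab pipeline by registering the runner with a tag and pinning the test job to it. The sketch below is hypothetical (the job names, paths, and the `gpu` tag are made up for illustration; it is not the actual NVIDIA pipeline configuration):

```yaml
# Hypothetical .gitlab-ci.yml sketch: build on shared runners,
# run GPU tests only on self-hosted runners registered with the "gpu" tag.
stages:
  - build
  - test

build:cuda-7.5:
  stage: build
  script:
    - docker build -t cuda:7.5-runtime ubuntu-14.04/cuda/7.5/runtime

test:cuda-7.5:
  stage: test
  tags:
    - gpu           # only runners registered with this tag pick up the job
  script:
    - docker run --rm cuda:7.5-runtime nvidia-smi
```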
Closing, finally.

from nvidia-docker.

flx42 avatar flx42 commented on May 20, 2024

Yes, it's definitely on our list!


sheerun avatar sheerun commented on May 20, 2024

btw, sometimes it's easier to set up CircleCI and push the build to the Hub instead


ruffsl avatar ruffsl commented on May 20, 2024

Another argument for automated repos: for others who create automated repos that build FROM your image, it becomes trivial to enable a triggered build within the Docker Hub ecosystem. So when the NVIDIA image is updated with fixes, so too are the users' images. The same could be done with web hooks and API calls, but keeping it simple within the Docker Hub interface makes it pleasant for newer users.
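For illustration, such a trigger can also be fired programmatically. This is a hypothetical sketch assuming the trigger URL format Docker Hub exposed at the time; the token and tag name below are placeholders, not real values:

```python
import json

# Hypothetical sketch of a Docker Hub build trigger. The token is a
# placeholder; a real one would come from the repo's "Build Triggers" page.
TOKEN = "00000000-0000-0000-0000-000000000000"
url = "https://registry.hub.docker.com/u/nvidia/cuda/trigger/%s/" % TOKEN

# Restrict the rebuild to a single tag rather than the whole repo.
payload = json.dumps({"docker_tag": "7.5-runtime"})

# The actual call (needs the third-party `requests` package and network):
# import requests
# requests.post(url, data=payload,
#               headers={"Content-Type": "application/json"})

print(payload)  # {"docker_tag": "7.5-runtime"}
```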


UniqueFool avatar UniqueFool commented on May 20, 2024

The Phoronix Test Suite comes with OpenCL support, so it could be useful for regression-testing the automated repo: http://www.phoronix.com/scan.php?page=article&item=nvidia-amd-opencl-2015&num=1


flx42 avatar flx42 commented on May 20, 2024

@ruffsl For osrf/ros it looks like you also have multiple Dockerfiles with dependencies that mandate a specific build order. How did you set up an automated build with these constraints? All the builds seem to start in parallel, so I can't create the devel images properly since they depend on the runtime images.

@UniqueFool The problem with CI and testing is that I'm not currently aware of an open-source CI solution that would allow us to run GPU tests. We have internal solutions, of course, but they would be more complex to integrate with GitHub. I will continue evaluating the options.


sheerun avatar sheerun commented on May 20, 2024

You can just run build on CI, without testing:
https://circleci.com/docs/docker#deployment-to-a-docker-registry
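For illustration, a build-and-push job in CircleCI's (1.0-era) circle.yml might look like the sketch below; the image name, path, and environment variables are hypothetical, and the smoke test is only a stand-in since no GPU is available on the CI machine:

```yaml
# Hypothetical circle.yml: build the image, smoke-test it, push on master.
machine:
  services:
    - docker

dependencies:
  override:
    - docker build -t flx42/cuda:7.5-runtime ubuntu-14.04/cuda/7.5/runtime

test:
  override:
    # No GPU on the CI machine, so only check that the image runs at all.
    - docker run --rm flx42/cuda:7.5-runtime echo OK

deployment:
  hub:
    branch: master
    commands:
      - docker login -u "$DOCKER_USER" -p "$DOCKER_PASS"
      - docker push flx42/cuda:7.5-runtime
```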


flx42 avatar flx42 commented on May 20, 2024

Sure, but it would be more convenient to deploy and test with the same solution. Indeed, the short-term approach could be to automate only the builds for now.


sheerun avatar sheerun commented on May 20, 2024

You need to build it on CircleCI before testing anyway ;) So build + upload is a good first step.


ruffsl avatar ruffsl commented on May 20, 2024

@flx42 , yes, I've noticed this. Looking at the build details recording the build logs, I see the start times for each tag were triggered roughly simultaneously, with one of my higher-level tags starting first. I'm fairly sure the official repos don't suffer the same shortcoming (although perhaps I haven't noticed, thanks to how often the upstream Ubuntu image rebuilds and triggers everything else), but I'm not sure how to enforce the same build order in a single user repo.

I've asked about this before, but was told to just re-trigger the build until the cascading images reach a steady state, which I think is a bit silly. Another approach I tried first was breaking my tags into separate repos, as suggested here. That was a hassle to manage, but it did ensure a sequential order. Perhaps the cuda runtime and devel Docker repos could be separate, but the lack of tag-level (vs repo-level) triggering would hamper further tag-specific builds. Let me dig around; perhaps something has come along since I last looked into this. Pinging @yosifkit or @tianon ?


yosifkit avatar yosifkit commented on May 20, 2024

I've not seen any change on the Docker Hub that would allow images to depend upon another tag in the same repo. This is one of the reasons that the official images do not use automated builds.


flx42 avatar flx42 commented on May 20, 2024

@ruffsl It looks like it's worse than that. When I start my build with a POST request, all the builds start in parallel, and all the devel builds immediately fail since they depend on the runtime images.

Since all the runtime Dockerfiles for 6.5, 7.0 and 7.5 start with these lines:
https://github.com/NVIDIA/nvidia-docker/blob/master/ubuntu-14.04/cuda/7.5/runtime/Dockerfile#L1-L10
My runtime images should be able to share the layers for those commands, but since they are built in parallel, that's not the case (except for the ubuntu layers, obviously):

$ docker history flx42/cuda:7.0-runtime
IMAGE               CREATED             CREATED BY                                      SIZE                COMMENT
[...]             
9a4be293a841        19 minutes ago      /bin/sh -c #(nop) ENV CUDA_VERSION=7.0          0 B                 
7410b9a2414b        19 minutes ago      /bin/sh -c apt-key adv --fetch-keys http://de   25.66 kB            
bac2ad43afa4        19 minutes ago      /bin/sh -c #(nop) ENV NVIDIA_GPGKEY_FPR=889be   0 B                 
18e862dcdeec        19 minutes ago      /bin/sh -c #(nop) ENV NVIDIA_GPGKEY_SUM=bd841   0 B                 
62e3850cc26d        19 minutes ago      /bin/sh -c #(nop) MAINTAINER NVIDIA CORPORATI   0 B                 
89d5d8e8bafb        2 days ago          /bin/sh -c #(nop) CMD ["/bin/bash"]             0 B                 
e24428725dd6        2 days ago          /bin/sh -c sed -i 's/^#\s*\(deb.*universe\)$/   1.895 kB            
1796d1c62d0c        2 days ago          /bin/sh -c echo '#!/bin/sh' > /usr/sbin/polic   194.5 kB            
0bf056161913        2 days ago          /bin/sh -c #(nop) ADD file:9b5ba3935021955492   187.7 MB 

$ docker history flx42/cuda:7.5-runtime
IMAGE               CREATED             CREATED BY                                      SIZE                COMMENT
[...]             
92aaf1c5e65b        19 minutes ago      /bin/sh -c #(nop) ENV CUDA_VERSION=7.5          0 B                 
83968d3d71cb        19 minutes ago      /bin/sh -c apt-key adv --fetch-keys http://de   25.66 kB            
ee4242ccf3fd        19 minutes ago      /bin/sh -c #(nop) ENV NVIDIA_GPGKEY_FPR=889be   0 B                 
919a687073ec        19 minutes ago      /bin/sh -c #(nop) ENV NVIDIA_GPGKEY_SUM=bd841   0 B                 
04c48fe576ca        19 minutes ago      /bin/sh -c #(nop) MAINTAINER NVIDIA CORPORATI   0 B                 
89d5d8e8bafb        2 days ago          /bin/sh -c #(nop) CMD ["/bin/bash"]             0 B                 
e24428725dd6        2 days ago          /bin/sh -c sed -i 's/^#\s*\(deb.*universe\)$/   1.895 kB            
1796d1c62d0c        2 days ago          /bin/sh -c echo '#!/bin/sh' > /usr/sbin/polic   194.5 kB            
0bf056161913        2 days ago          /bin/sh -c #(nop) ADD file:9b5ba3935021955492   187.7 MB 

The runtime and 7.5-runtime images use the same Dockerfile, but for the same reason they share no layers except those from ubuntu. I didn't find a way to output multiple tags from a single automated build.

In my personal GitHub fork (https://github.com/flx42/nvidia-docker) I modified the devel images to use FROM flx42/cuda:tag instead of FROM cuda:tag. This should allow me to build my devel images with a second POST request, right?
Well, yes, but it rebuilds all the images, even the runtime images. This is costly, and it also means that my runtime images get overwritten.
My devel images will build this time, but they are built on top of the older runtime images:

$ docker history flx42/cuda:7.5-devel
IMAGE               CREATED             CREATED BY                                      SIZE                COMMENT
[...]              
92aaf1c5e65b        30 minutes ago      /bin/sh -c #(nop) ENV CUDA_VERSION=7.5          0 B                 
83968d3d71cb        30 minutes ago      /bin/sh -c apt-key adv --fetch-keys http://de   25.66 kB            
ee4242ccf3fd        30 minutes ago      /bin/sh -c #(nop) ENV NVIDIA_GPGKEY_FPR=889be   0 B                 
919a687073ec        30 minutes ago      /bin/sh -c #(nop) ENV NVIDIA_GPGKEY_SUM=bd841   0 B                 
04c48fe576ca        30 minutes ago      /bin/sh -c #(nop) MAINTAINER NVIDIA CORPORATI   0 B                 
89d5d8e8bafb        2 days ago          /bin/sh -c #(nop) CMD ["/bin/bash"]             0 B                 
e24428725dd6        2 days ago          /bin/sh -c sed -i 's/^#\s*\(deb.*universe\)$/   1.895 kB            
1796d1c62d0c        2 days ago          /bin/sh -c echo '#!/bin/sh' > /usr/sbin/polic   194.5 kB            
0bf056161913        2 days ago          /bin/sh -c #(nop) ADD file:9b5ba3935021955492   187.7 MB 

The first layers are the same as above for flx42/cuda:7.5-runtime.
But now flx42/cuda:7.5-runtime is different:

$ docker pull flx42/cuda:7.5-runtime
7.5-runtime: Pulling from flx42/cuda
[redownloading everything]

$ docker history flx42/cuda:7.5-runtime
IMAGE               CREATED             CREATED BY                                      SIZE                COMMENT
[...]               
d6f056622afd        11 minutes ago      /bin/sh -c #(nop) ENV CUDA_VERSION=7.5          0 B                 
52756da1d17b        11 minutes ago      /bin/sh -c apt-key adv --fetch-keys http://de   25.66 kB            
b179bdd62a38        11 minutes ago      /bin/sh -c #(nop) ENV NVIDIA_GPGKEY_FPR=889be   0 B                 
27ffaa5d4438        11 minutes ago      /bin/sh -c #(nop) ENV NVIDIA_GPGKEY_SUM=bd841   0 B                 
3cec27432703        11 minutes ago      /bin/sh -c #(nop) MAINTAINER NVIDIA CORPORATI   0 B                 
89d5d8e8bafb        2 days ago          /bin/sh -c #(nop) CMD ["/bin/bash"]             0 B                 
e24428725dd6        2 days ago          /bin/sh -c sed -i 's/^#\s*\(deb.*universe\)$/   1.895 kB            
1796d1c62d0c        2 days ago          /bin/sh -c echo '#!/bin/sh' > /usr/sbin/polic   194.5 kB            
0bf056161913        2 days ago          /bin/sh -c #(nop) ADD file:9b5ba3935021955492   187.7 MB 

So, we are needlessly duplicating layers that are physically identical. And since everything is rebuilt every time, users have to fetch new layers even when the image they use didn't change.
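The caching behaviour described above can be modelled with a toy sketch (an assumption for illustration, not Docker's actual implementation): a layer is reused only when the builder's local cache already holds an entry for (parent, command), so independent parallel builders always mint fresh layer IDs even for identical Dockerfile prefixes:

```python
import hashlib

def build(commands, cache, salt=""):
    """Simulate an image build: reuse a layer only if this builder's cache
    has an entry for (parent, command); otherwise mint a fresh layer ID
    (salt stands in for per-builder randomness)."""
    parent, layers = "scratch", []
    for cmd in commands:
        key = (parent, cmd)
        if key not in cache:
            cache[key] = hashlib.sha1((parent + cmd + salt).encode()).hexdigest()[:12]
        parent = cache[key]
        layers.append(parent)
    return layers

common = ["MAINTAINER NVIDIA", "ENV NVIDIA_GPGKEY_SUM", "apt-key adv"]

# Sequential builds on one daemon share a cache: common layers are reused.
shared = {}
a = build(common + ["ENV CUDA_VERSION=7.0"], shared)
b = build(common + ["ENV CUDA_VERSION=7.5"], shared)
print(a[:3] == b[:3])  # True: identical prefix layers are shared

# Parallel automated builds each start with an empty cache: no sharing.
c = build(common + ["ENV CUDA_VERSION=7.0"], {}, salt="builder-1")
d = build(common + ["ENV CUDA_VERSION=7.5"], {}, salt="builder-2")
print(c[:3] == d[:3])  # False: physically identical layers get distinct IDs
```

This matches the docker history output above: the ubuntu layers (built once, upstream) are shared, while every layer built by a parallel automated build gets a distinct ID.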


ruffsl avatar ruffsl commented on May 20, 2024

@flx42 , you are correct. Given the limitations of the automated-build mechanics, it doesn't currently seem possible to host an automated repo on Docker Hub. As yosifkit mentioned, official images are not built using the same rules, so once the commits to the cuda Dockerfiles settle down, that would be a nice channel for distributing images kept up to date with upstream sources.

This takes some wind out of the sails for CI-testing the master branch, but I suppose sheerun's or UniqueFool's suggestions, along with the Makefiles you've already written, would work well for automatically pushing current images to the NVIDIA org repo for public review on triggered events on the master branch.


flx42 avatar flx42 commented on May 20, 2024

Let's give up on the Docker Hub automated repo for now. CI remains an option so I will not close this issue yet.


ruffsl avatar ruffsl commented on May 20, 2024

@flx42 , on a side note, you may want to keep these links around, or put them in the readme/wiki somewhere for others (at least for the ubuntu tags):

[image: screenshot of the suggested links for the ubuntu tags]

I just added something similar for the official ros repo and found it a nice method for visually verifying parent-image lineage.

