gpuci-build-environment's Introduction

gpuci-build-environment

This repo is replaced by https://github.com/rapidsai/ci-imgs and is now archived

Overview

This repo contains Docker images used by gpuCI and release images for RAPIDS. Additional gpuCI users also have custom images in this repo.

Below is a flow diagram of how the major gpuCI images relate to each other. Arrows between images imply that the source image is the FROM image for the destination image.

Image Flow Diagram

[Diagram: gpuCI images and relations]

Public Images

The gpuci/miniforge-cuda image is the base layer on which all gpuCI testing and RAPIDS release containers are built. It also serves as a public image for anyone who wants a one-to-one compatible replacement for nvidia/cuda with miniforge installed. In addition, gpuci/miniforge-cuda-driver is provided for ubuntu18.04 and centos7 only; it adds a minimal set of conda build utilities and the NVIDIA driver so that most CUDA code can be built with conda on CPU-only machines.

gpuci/miniforge-cuda

  • Repo location
  • Dockerfile
  • Build arguments
    • Depends on upstream nvidia/cuda combinations
      • CUDA_VER - 9.0, 9.2, 10.0, 10.1, 10.2, 11.0, 11.1, 11.2
      • IMAGE_TYPE - base, runtime, devel
      • LINUX_VER - ubuntu18.04, ubuntu20.04, centos7, rockylinux8
    • Other arguments
      • FROM_IMAGE - nvidia/cuda
  • Base image
    • FROM ${FROM_IMAGE}:${CUDA_VER}-${IMAGE_TYPE}-${LINUX_VER}
      • Default - nvidia/cuda:10.2-devel-ubuntu18.04
  • Purpose
    • Contains CUDA + miniforge installed
    • Replaces nvidia/cuda and enables conda environment
    • Activates the base conda environment on launch
    • Serves as a base image for community using conda and gpuCI users to build their own custom image
  • Tag format - ${CUDA_VER}-${IMAGE_TYPE}-${LINUX_VER}
    • Supports the same options as defined in Build arguments
    • Current tags
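
As a minimal sketch of how the build arguments above combine into the base image reference, the following Python helper mirrors the FROM ${FROM_IMAGE}:${CUDA_VER}-${IMAGE_TYPE}-${LINUX_VER} pattern. The function name base_image is hypothetical and is not part of this repo; the defaults are the ones listed under Base image.

```python
def base_image(from_image="nvidia/cuda", cuda_ver="10.2",
               image_type="devel", linux_ver="ubuntu18.04"):
    """Compose an image reference the same way the FROM line does.

    Hypothetical helper for illustration only; the defaults follow the
    documented default base image.
    """
    return f"{from_image}:{cuda_ver}-{image_type}-{linux_ver}"


if __name__ == "__main__":
    # Documented default
    print(base_image())  # nvidia/cuda:10.2-devel-ubuntu18.04
    # Another combination from the listed build arguments
    print(base_image(cuda_ver="11.2", image_type="runtime", linux_ver="centos7"))
```

Whether a given combination actually exists depends on what upstream nvidia/cuda publishes, so treat the result as a candidate reference rather than a guaranteed image.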

gpuci/miniforge-cuda-driver

  • Repo location
  • Dockerfile
  • Build arguments
    • Depends on upstream nvidia/cuda combinations
      • CUDA_VER - 11.0, 11.1, 11.2
      • IMAGE_TYPE - devel
      • LINUX_VER - ubuntu18.04, centos7
    • Other arguments
      • FROM_IMAGE - gpuci/miniforge-cuda
  • Base image
    • FROM ${FROM_IMAGE}:${CUDA_VER}-${IMAGE_TYPE}-${LINUX_VER}
      • Default - gpuci/miniforge-cuda:11.0-devel-ubuntu18.04
  • Purpose
    • Adds tools needed for conda builds and uploads
    • Installs the NVIDIA driver for CPU-only builds of most CUDA code
    • Activates the base conda environment on launch
  • Tag format - ${CUDA_VER}-devel-${LINUX_VER}
    • Supports the same options as defined in Build arguments
    • Current tags
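
To illustrate the tag format above, here is a small Python sketch that enumerates every tag implied by the listed CUDA_VER and LINUX_VER values. The function name driver_tags is hypothetical, and the sketch assumes every combination is valid; the published "Current tags" list remains the authority on what is actually available.

```python
from itertools import product

# Values copied from the build arguments listed above
CUDA_VERS = ["11.0", "11.1", "11.2"]
LINUX_VERS = ["ubuntu18.04", "centos7"]


def driver_tags(repo="gpuci/miniforge-cuda-driver"):
    """Return all ${CUDA_VER}-devel-${LINUX_VER} tags for the repo.

    Hypothetical helper: it assumes every CUDA/Linux pairing is built,
    which may not hold for the published images.
    """
    return [
        f"{repo}:{cuda}-devel-{linux}"
        for cuda, linux in product(CUDA_VERS, LINUX_VERS)
    ]


if __name__ == "__main__":
    for tag in driver_tags():
        print(tag)
```

With three CUDA versions and two Linux versions, this yields six candidate tags.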

gpuCI Images

The images below are used for conda builds and GPU tests in gpuCI; see the diagram above for more context. They are ordered by their dependencies.

GPU Test Images

The gpuci/rapidsai images serve different purposes based on their IMAGE_TYPE and their RAPIDS_VER version:

gpuci/rapidsai

  • Image types - IMAGE_TYPE
    • devel - used in gpuCI on nodes with the NVIDIA Container Toolkit installed to run tests on GPUs. Also used by the RAPIDS devel release images and as the base for gpuci/rapidsai-driver and gpuci/rapidsai-driver-nightly.
    • runtime - used by the RAPIDS base and runtime release images. RAPIDS base images do not use the base type from gpuci/miniforge-cuda or nvidia/cuda because that type does not include all the files required to run RAPIDS.
  • Versioning - RAPIDS_VER
    • gpuci/rapidsai uses the same versioning as the RAPIDS project
    • The current stable version of RAPIDS tracks the release/stable integration env packages
    • The current nightly version of RAPIDS tracks the nightly integration env packages
  • Dockerfiles
  • Build arguments
    • RAPIDS_VER - Major and minor version to use for packages (e.g. 21.06)
  • Base image
    • FROM gpuci/miniforge-cuda:${CUDA_VER}-${IMAGE_TYPE}-${LINUX_VER}
  • Purpose
    • Provide a common testing base that can be reused by the RAPIDS release images
    • Use the integration env packages to pull consistent versioning information for all of RAPIDS
      • NOTE: These images install the env packages to pull in their dependencies, then remove the env packages themselves once the install completes. Removing them allows the same packages to be installed again later to update the image, and lets PR jobs use the devel image and override dependencies for testing. If the env packages stayed installed, such overrides would cause a conda solve conflict.
  • Tags - ${RAPIDS_VER}-cuda${CUDA_VER}-${IMAGE_TYPE}-${LINUX_VER}-py${PYTHON_VER}
    • Supports these options
      • ${RAPIDS_VER} - Major and minor version of RAPIDS (e.g. 21.06)
      • ${CUDA_VER} - 11.0, 11.2
      • ${IMAGE_TYPE} - base, runtime, devel
      • ${LINUX_VER} - ubuntu18.04, ubuntu20.04, centos7, rockylinux8
      • ${PYTHON_VER} - 3.7, 3.8, 3.9
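
The tag format above can be taken apart mechanically. The following is a minimal Python sketch that parses a tag into its components with a regular expression; parse_tag and TAG_RE are hypothetical names, and the pattern only assumes the shape ${RAPIDS_VER}-cuda${CUDA_VER}-${IMAGE_TYPE}-${LINUX_VER}-py${PYTHON_VER} described in this section.

```python
import re

# One named group per component of the documented tag format.
TAG_RE = re.compile(
    r"^(?P<rapids_ver>\d+\.\d+)"
    r"-cuda(?P<cuda_ver>\d+\.\d+)"
    r"-(?P<image_type>base|runtime|devel)"
    r"-(?P<linux_ver>[a-z]+[\d.]+)"
    r"-py(?P<python_ver>\d+\.\d+)$"
)


def parse_tag(tag):
    """Split a gpuci/rapidsai tag into its parts (hypothetical helper)."""
    match = TAG_RE.match(tag)
    if match is None:
        raise ValueError(f"tag does not follow the documented format: {tag!r}")
    return match.groupdict()


if __name__ == "__main__":
    parts = parse_tag("21.06-cuda11.2-devel-ubuntu20.04-py3.8")
    print(parts["rapids_ver"], parts["cuda_ver"], parts["linux_ver"])
```

The pattern checks only the shape of the tag, not whether the versions are among the supported options listed above.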

conda Build Images

The gpuci/rapidsai-driver images are used to build conda packages on CPU-only machines. They are built from the devel images of gpuci/rapidsai. To enable some of the RAPIDS builds on CPU-only machines, the NVIDIA drivers are force-installed in this container, which provides the files needed for linking during the build steps.

gpuci/rapidsai-driver

  • Versioning - RAPIDS_VER
    • Like gpuci/rapidsai, the gpuci/rapidsai-driver images use the same versioning as the RAPIDS project
    • The current stable version of RAPIDS tracks the release/stable integration env packages
    • The current nightly version of RAPIDS tracks the nightly integration env packages
  • Dockerfile
  • Build arguments
    • FROM_IMAGE - Specifies the repo location; stable/nightly is determined by the value of RAPIDS_VER
    • DRIVER_VER - NVIDIA driver version to install (e.g. 440)
    • CUDA_VER and PYTHON_VER - Take the same arguments as defined in Tags below
    • RAPIDS_VER - This is used to select the FROM_IMAGE
  • Base image
    • FROM gpuci/rapidsai:${RAPIDS_VER}-cuda${CUDA_VER}-devel-ubuntu16.04-py${PYTHON_VERSION}
  • Purpose
    • Installs the NVIDIA driver/libcuda to enable conda builds on CPU-only machines
    • Built for conda builds and only contains the driver install command
    • Maintained as a way to remove the apt-get install overhead that can slow the testing/build process
  • Tags - ${RAPIDS_VER}-cuda${CUDA_VER}-devel-centos7-py${PYTHON_VER}
    • Supports these options
      • ${RAPIDS_VER} - Major and minor version of RAPIDS (e.g. 21.06)
      • ${CUDA_VER} - 11.0, 11.2
      • ${PYTHON_VER} - 3.7, 3.8
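
As a rough sketch of how the build arguments and tag format above fit together, the Python helper below assembles a docker build command line. The function name driver_build_command, the Dockerfile path, and the FROM_IMAGE value are assumptions made for illustration; only the argument names and the tag format come from this section, and the real Dockerfile may expect different values.

```python
import shlex


def driver_build_command(rapids_ver, cuda_ver, python_ver, driver_ver,
                         from_image="gpuci/rapidsai", dockerfile="Dockerfile"):
    """Build an argv list for `docker build` (hypothetical helper).

    The default from_image and dockerfile values are assumptions, not
    taken from the repo.
    """
    tag = (f"gpuci/rapidsai-driver:{rapids_ver}-cuda{cuda_ver}"
           f"-devel-centos7-py{python_ver}")
    build_args = {
        "FROM_IMAGE": from_image,
        "RAPIDS_VER": rapids_ver,
        "CUDA_VER": cuda_ver,
        "PYTHON_VER": python_ver,
        "DRIVER_VER": driver_ver,
    }
    cmd = ["docker", "build", "-t", tag, "-f", dockerfile]
    for name, value in build_args.items():
        cmd += ["--build-arg", f"{name}={value}"]
    cmd.append(".")  # build context: current directory
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(driver_build_command("21.06", "11.2", "3.8", "440")))
```

Returning a list rather than a string keeps each argument separate, so the command can be passed to subprocess.run without shell quoting concerns.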

RAPIDS Images

The RAPIDS release images are based on the gpuci/rapidsai images for stable/release images and on the gpuci/rapidsai-nightly images for nightly images. Scripts and templates for these images are maintained in the build repo.

For a list of available images see the RAPIDS build README.

gpuci-build-environment's People

Contributors

ajschmidt8, aleksficek, ayodeawe, bdice, dantegd, dillon-cullinan, galipremsagar, gmarkall, jakirkham, jjacobelli, jolorunyomi, kkraus14, mike-wendt, msadang, raydouglass, rlratzel, sevagh, teju85, trxcllnt

gpuci-build-environment's Issues

Other CUDA Versions (11.7)

First, thanks for maintaining such a useful set of Docker images. I have found them very useful.

I wanted to ask if I could help add images built on CUDA 11.7. I'd be happy to submit a PR if someone helps me understand what needs changing.

Conda environment dir is owned by root

The conda environment dir is owned by root, but the GPU runners run as a non-root user. This is problematic when CMake tries to install things such as directories because it tries to set permissions on things which may already exist and can't because it doesn't own them.

Should build RHEL 7 gcc toolchains with `--disable-libstdcxx-dual-abi`

Dockerfile.centos7-gcc7 and Dockerfile.centos7-gcc9 should pass the --disable-libstdcxx-dual-abi flag to use the pre-C++11 ABI, which aligns with the RHEL 6/7 toolchain ABI.

From https://gcc.gnu.org/onlinedocs/libstdc++/manual/configure.html:

--disable-libstdcxx-dual-abi
Disable support for the new, C++11-conforming implementations of std::string, std::list etc. so that the library only provides definitions of types using the old ABI (see Dual ABI). This option changes the library ABI.

Add NUMA to build machines

For ucx-py tests we need the NUMA packages libnuma and libnuma-dev. Would it be possible to add these to the build machines used for CI for ucx-py?

cc @jakirkham

conda-build version 3.18.3 is not found

I am trying to build this on my local box as part of addressing review comments from rapidsai/cudf#2132, but I have not been able to build it since conda-build was pinned to 3.18.3 (9857270).

Is there something I am missing?

$ sudo docker build -t test_image -f ./Dockerfile .
...
PackagesNotFoundError: The following packages are not available from current channels:

  - conda-build=3.18.3

Current channels:

  - https://conda.anaconda.org/rapidsai/label/cuda9.2/linux-64
  - https://conda.anaconda.org/rapidsai/label/cuda9.2/noarch
  - https://conda.anaconda.org/rapidsai-nightly/label/cuda9.2/linux-64
  - https://conda.anaconda.org/rapidsai-nightly/label/cuda9.2/noarch
  - https://conda.anaconda.org/nvidia/label/cuda9.2/linux-64
  - https://conda.anaconda.org/nvidia/label/cuda9.2/noarch
  - https://conda.anaconda.org/conda-forge/linux-64
  - https://conda.anaconda.org/conda-forge/noarch
  - https://repo.anaconda.com/pkgs/main/linux-64
  - https://repo.anaconda.com/pkgs/main/noarch
  - https://repo.anaconda.com/pkgs/free/linux-64
  - https://repo.anaconda.com/pkgs/free/noarch
  - https://repo.anaconda.com/pkgs/r/linux-64
  - https://repo.anaconda.com/pkgs/r/noarch

To search for alternate channels that may provide the conda package you're
looking for, navigate to

    https://anaconda.org

and use the search bar at the top of the page.
