moby / buildkit

concurrent, cache-efficient, and Dockerfile-agnostic builder toolkit

Home Page: https://github.com/moby/moby/issues/34227

License: Apache License 2.0

Makefile 0.07% Go 97.66% Shell 0.48% Dockerfile 1.49% Assembly 0.05% HCL 0.18% Ruby 0.07%

buildkit dockerfile docker oci-image oci containers builder go cloud-native golang

buildkit's Introduction

BuildKit

BuildKit is a toolkit for converting source code to build artifacts in an efficient, expressive and repeatable manner.

Key features:

Automatic garbage collection
Extendable frontend formats
Concurrent dependency resolution
Efficient instruction caching
Build cache import/export
Nested build job invocations
Distributable workers
Multiple output formats
Pluggable architecture
Execution without root privileges

Read the proposal from moby/moby#32925

Introductory blog post https://blog.mobyproject.org/introducing-buildkit-17e056cc5317

Join #buildkit channel on Docker Community Slack

Note

If you are visiting this repo for the usage of BuildKit-only Dockerfile features like RUN --mount=type=(bind|cache|tmpfs|secret|ssh), please refer to the Dockerfile reference.

Note

docker build uses Buildx and BuildKit by default since Docker Engine 23.0. You don't need to read this document unless you want to use the full-featured standalone version of BuildKit.

Used by
Quick start
Cache
Metadata
Systemd socket activation
Expose BuildKit as a TCP service
- Load balancing
Containerizing BuildKit
- Podman
- Nerdctl
- Kubernetes
- Daemonless
OpenTelemetry support
Running BuildKit without root privileges
Building multi-platform images
- Configuring buildctl
  - Color Output Controls
  - Number of log lines (for active steps in tty mode)
Contributing

Used by

BuildKit is used by the following projects:

Moby & Docker (DOCKER_BUILDKIT=1 docker build)
img
OpenFaaS Cloud
container build interface
Tekton Pipelines (formerly Knative Build Templates)
the Sanic build tool
vab
Rio
kim
PouchContainer
Docker buildx
Okteto Cloud
Earthly earthfiles
Gitpod
Dagger
envd
Depot
Namespace
Unikraft

Quick start

ℹ️ For Kubernetes deployments, see examples/kubernetes.

BuildKit is composed of the buildkitd daemon and the buildctl client. While the buildctl client is available for Linux, macOS, and Windows, the buildkitd daemon is currently available only for Linux and Windows.

The latest binaries of BuildKit are available here for Linux, macOS, and Windows.

Linux Setup

The buildkitd daemon requires the following components to be installed:

runc or crun
containerd (if you want to use containerd worker)

Starting the buildkitd daemon: You need to run buildkitd as the root user on the host.

$ sudo buildkitd

To run buildkitd as a non-root user, see docs/rootless.md.

The buildkitd daemon supports two worker backends: OCI (runc) and containerd.

By default, the OCI (runc) worker is used. You can set --oci-worker=false --containerd-worker=true to use the containerd worker.

We are open to adding more backends.

To start the buildkitd daemon using systemd socket activation, you can install the buildkit systemd unit files. See Systemd socket activation.

The buildkitd daemon serves its gRPC API on /run/buildkit/buildkitd.sock by default, but you can also use TCP sockets. See Expose BuildKit as a TCP service.

Windows Setup

See instructions and notes at docs/windows.md.

macOS Setup

An unofficial Homebrew formula is available for macOS.

$ brew install buildkit

The Homebrew formula does not contain the daemon (buildkitd).

For example, Lima can be used for launching the daemon inside a Linux VM.

brew install lima
limactl start template://buildkit
export BUILDKIT_HOST="unix://$HOME/.lima/buildkit/sock/buildkitd.sock"

Build from source

To build BuildKit from source, see .github/CONTRIBUTING.md.

For a buildctl reference, see this document.

Exploring LLB

BuildKit builds are based on a binary intermediate format called LLB, which defines the dependency graph for the processes that run as part of your build. tl;dr: LLB is to Dockerfile what LLVM IR is to C.

Marshaled as Protobuf messages
Concurrently executable
Efficiently cacheable
Vendor-neutral (i.e. non-Dockerfile languages can be easily implemented)

See solver/pb/ops.proto for the format definition, and see ./examples/README.md for example LLB applications.

Several high-level languages have been implemented on top of LLB, including the Dockerfile frontend described under Exploring Dockerfiles.

Exploring Dockerfiles

Frontends are components that run inside BuildKit and convert any build definition to LLB. There is a special frontend called gateway (gateway.v0) that allows using any image as a frontend.

During development, the Dockerfile frontend (dockerfile.v0) is also part of the BuildKit repo. In the future, it will be moved out, and Dockerfiles can be built using an external image.

Building a Dockerfile with `buildctl`

buildctl build \
    --frontend=dockerfile.v0 \
    --local context=. \
    --local dockerfile=.
# or
buildctl build \
    --frontend=dockerfile.v0 \
    --local context=. \
    --local dockerfile=. \
    --opt target=foo \
    --opt build-arg:foo=bar

--local exposes local source files from the client to the builder. context and dockerfile are the names under which the Dockerfile frontend looks for the build context and the Dockerfile location.

If the Dockerfile has a different filename, it can be specified with --opt filename=./Dockerfile-alternative.
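As a rough sketch of how these options combine, the helper below only prints a buildctl command line rather than running it. The function name dockerfile_build_cmd is hypothetical, and the sketch assumes the Dockerfile lives in the context directory and that paths contain no spaces.

```shell
# Print (do not run) a buildctl invocation for the Dockerfile frontend.
# dockerfile_build_cmd is a hypothetical helper, not part of BuildKit.
dockerfile_build_cmd() {
    context_dir=$1
    dockerfile_name=${2:-}   # optional: non-default Dockerfile filename

    cmd="buildctl build --frontend=dockerfile.v0"
    cmd="$cmd --local context=$context_dir"
    # Assumption: the Dockerfile sits in the same directory as the context.
    cmd="$cmd --local dockerfile=$context_dir"
    if [ -n "$dockerfile_name" ]; then
        cmd="$cmd --opt filename=$dockerfile_name"
    fi
    printf '%s\n' "$cmd"
}

dockerfile_build_cmd .
dockerfile_build_cmd . ./Dockerfile-alternative
```

Printing the command first makes it easy to review the flags before passing them to a real daemon.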

Building a Dockerfile using external frontend

External versions of the Dockerfile frontend are pushed to https://hub.docker.com/r/docker/dockerfile-upstream and https://hub.docker.com/r/docker/dockerfile and can be used with the gateway frontend. The source for the external frontend is currently located in ./frontend/dockerfile/cmd/dockerfile-frontend but will move out of this repository in the future (#163). For automatic builds from the master branch of this repository, use the docker/dockerfile-upstream:master or docker/dockerfile-upstream:master-labs image.

buildctl build \
    --frontend gateway.v0 \
    --opt source=docker/dockerfile \
    --local context=. \
    --local dockerfile=.

buildctl build \
    --frontend gateway.v0 \
    --opt source=docker/dockerfile \
    --opt context=https://github.com/moby/moby.git \
    --opt build-arg:APT_MIRROR=cdn-fastly.deb.debian.org

Output

By default, the build result and intermediate cache remain only inside BuildKit. To retrieve the result, specify an output.

Image/Registry

buildctl build ... --output type=image,name=docker.io/username/image,push=true

To export the image to multiple registries:

buildctl build ... --output type=image,\"name=docker.io/username/image,docker.io/username2/image2\",push=true
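The escaped quotes above keep the comma-separated image names together as one name field. A minimal sketch of building that value from a list of names follows; image_output_arg is a hypothetical helper, and it assumes push=true is wanted.

```shell
# Join image names into a single --output value for the image exporter.
# image_output_arg is a hypothetical helper, not part of BuildKit.
image_output_arg() {
    joined=
    for name in "$@"; do
        # Append with a comma separator, except before the first name.
        joined=${joined:+$joined,}$name
    done
    # The double quotes are part of the value: they group the name list.
    printf 'type=image,"name=%s",push=true\n' "$joined"
}

image_output_arg docker.io/username/image docker.io/username2/image2
```

When passing the result to buildctl, quote it as a whole (for example `--output "$(image_output_arg ...)"`) so the shell leaves the inner double quotes intact.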

To embed the cache in the image and push both to the registry together, specify --export-cache type=inline. Importing that cache requires type registry, so also specify --import-cache type=registry,ref=.... To export the cache to a local directory, specify --export-cache type=local. Details in Export cache.

buildctl build ... \
  --output type=image,name=docker.io/username/image,push=true \
  --export-cache type=inline \
  --import-cache type=registry,ref=docker.io/username/image

Keys supported by image output:

name=<value>: specify image name(s)
push=true: push after creating the image
push-by-digest=true: push unnamed image
registry.insecure=true: push to insecure HTTP registry
oci-mediatypes=true: use OCI mediatypes in configuration JSON instead of Docker's
unpack=true: unpack image after creation (for use with containerd)
dangling-name-prefix=<value>: name image with prefix@<digest>, used for anonymous images
name-canonical=true: add additional canonical name name@<digest>
compression=<uncompressed|gzip|estargz|zstd>: choose compression type for layers newly created and cached, gzip is default value. estargz should be used with oci-mediatypes=true.
compression-level=<value>: compression level for gzip, estargz (0-9) and zstd (0-22)
rewrite-timestamp=true: rewrite the file timestamps to the SOURCE_DATE_EPOCH value. See docs/build-repro.md for how to specify the SOURCE_DATE_EPOCH value.
force-compression=true: forcefully apply compression option to all layers (including already existing layers)
store=true: store the resulting images in the worker's image store (e.g. containerd) and ensure that the image has all blobs in the content store (default: true). Ignored if the worker doesn't have an image store (e.g. OCI worker).
annotation.<key>=<value>: attach an annotation with the respective key and value to the built image
- Using the extended syntaxes, annotation-<type>.<key>=<value>, annotation[<platform>].<key>=<value> and both combined with annotation-<type>[<platform>].<key>=<value>, allows configuring exactly where to attach the annotation.
- <type> specifies what object to attach to, and can be any of manifest (the default), manifest-descriptor, index and index-descriptor
- <platform> specifies which objects to attach to (by default, all), and is the same key passed into the platform opt, see docs/multi-platform.md.
- See docs/annotations.md for more details.

If credentials are required, buildctl will attempt to read Docker configuration file $DOCKER_CONFIG/config.json. $DOCKER_CONFIG defaults to ~/.docker.
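The lookup described above can be mirrored in a few lines of shell, which is handy for checking which file would be read. This is only a sketch of the documented default; docker_config_file is a hypothetical name.

```shell
# Print the Docker config path that the text above describes:
# $DOCKER_CONFIG/config.json, with $DOCKER_CONFIG defaulting to ~/.docker.
docker_config_file() {
    printf '%s/config.json\n' "${DOCKER_CONFIG:-$HOME/.docker}"
}

cfg=$(docker_config_file)
if [ -r "$cfg" ]; then
    echo "credentials file found: $cfg"
else
    echo "no readable credentials file at: $cfg"
fi
```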

Local directory

The local exporter copies the files directly to the client. This is useful if BuildKit is being used to build something other than container images.

buildctl build ... --output type=local,dest=path/to/output-dir

To export specific files, use multi-stage builds with a scratch stage and copy the needed files into that stage with COPY --from.

...
FROM scratch AS testresult

COPY --from=builder /usr/src/app/testresult.xml .
...

buildctl build ... --opt target=testresult --output type=local,dest=path/to/output-dir

With a multi-platform build, a subfolder matching each target platform will be created in the destination directory:

FROM busybox AS build
ARG TARGETOS
ARG TARGETARCH
RUN mkdir /out && echo foo > /out/hello-$TARGETOS-$TARGETARCH

FROM scratch
COPY --from=build /out /

$ buildctl build \
  --frontend dockerfile.v0 \
  --opt platform=linux/amd64,linux/arm64 \
  --output type=local,dest=./bin/release

$ tree ./bin
./bin/
└── release
    ├── linux_amd64
    │   └── hello-linux-amd64
    └── linux_arm64
        └── hello-linux-arm64

You can set platform-split=false to merge files from all platforms together into the same directory:

$ buildctl build \
  --frontend dockerfile.v0 \
  --opt platform=linux/amd64,linux/arm64 \
  --output type=local,dest=./bin/release,platform-split=false

$ tree ./bin
./bin/
└── release
    ├── hello-linux-amd64
    └── hello-linux-arm64
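Judging from the listings above, each platform subfolder is the platform string with `/` replaced by `_`. The sketch below encodes that observation so a script can predict the output paths; the rule is inferred from the example trees rather than from a specification, and both function names are hypothetical.

```shell
# Map a platform string such as linux/amd64 to its subfolder name.
# Assumption: the name is the platform with "/" replaced by "_".
platform_subdir() {
    printf '%s\n' "$1" | tr '/' '_'
}

# List the subfolders expected under a destination directory for a
# comma-separated platform list (the default, platform-split enabled).
expected_subdirs() {
    dest=$1
    platforms=$2
    old_ifs=$IFS
    IFS=,
    for p in $platforms; do
        printf '%s/%s\n' "$dest" "$(platform_subdir "$p")"
    done
    IFS=$old_ifs
}

expected_subdirs ./bin/release linux/amd64,linux/arm64
```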

The tar exporter is similar to the local exporter but transfers the files through a tarball.

buildctl build ... --output type=tar,dest=out.tar
buildctl build ... --output type=tar > out.tar

Docker tarball

# exported tarball is also compatible with OCI spec
buildctl build ... --output type=docker,name=myimage | docker load

OCI tarball

buildctl build ... --output type=oci,dest=path/to/output.tar
buildctl build ... --output type=oci > output.tar

containerd image store

This output requires the containerd worker.

buildctl build ... --output type=image,name=docker.io/username/image
ctr --namespace=buildkit images ls

To change the containerd namespace, you need to change worker.containerd.namespace in /etc/buildkit/buildkitd.toml.

Cache

To show local build cache (/var/lib/buildkit):

buildctl du -v

To prune local build cache:

buildctl prune

Garbage collection

See ./docs/buildkitd.toml.md.

Export cache

BuildKit supports the following cache exporters:

inline: embed the cache into the image, and push them to the registry together
registry: push the image and the cache separately
local: export to a local directory
gha: export to GitHub Actions cache

In most cases you want to use the inline cache exporter. However, the inline cache exporter supports only min cache mode. To enable max cache mode, push the image and the cache separately by using the registry cache exporter.

inline and registry exporters both store the cache in the registry. For importing the cache, type=registry is sufficient for both, as specifying the cache format is not necessary.
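The choice between the two registry-backed exporters can be sketched as a small helper that prints the matching flags for a requested cache mode. The name cache_flags and its argument order are hypothetical; the flag values follow the options described in this section.

```shell
# Print cache flags for a given mode.
# Usage: cache_flags MODE IMAGE_REF CACHE_REF
#   min -> inline exporter, imported from the image ref itself
#   max -> registry exporter with a separate cache ref
cache_flags() {
    mode=$1
    image_ref=$2
    cache_ref=$3
    case $mode in
        min)
            printf '%s\n' "--export-cache type=inline --import-cache type=registry,ref=$image_ref"
            ;;
        max)
            # inline supports only min mode, so max needs type=registry.
            printf '%s\n' "--export-cache type=registry,ref=$cache_ref,mode=max --import-cache type=registry,ref=$cache_ref"
            ;;
        *)
            echo "unknown cache mode: $mode" >&2
            return 1
            ;;
    esac
}

cache_flags max localhost:5000/myrepo:image localhost:5000/myrepo:buildcache
```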

Inline (push image and cache together)

buildctl build ... \
  --output type=image,name=docker.io/username/image,push=true \
  --export-cache type=inline \
  --import-cache type=registry,ref=docker.io/username/image

Note that the inline cache is not imported unless --import-cache type=registry,ref=... is provided.

Inline cache embeds cache metadata into the image config. The layers in the image will be left untouched compared to the image with no cache information.

ℹ️ Docker-integrated BuildKit (DOCKER_BUILDKIT=1 docker build) and docker buildx require --build-arg BUILDKIT_INLINE_CACHE=1 to be specified to enable the inline cache exporter. However, the standalone buildctl does NOT require --opt build-arg:BUILDKIT_INLINE_CACHE=1, and the build-arg is simply ignored.

Registry (push image and cache separately)

buildctl build ... \
  --output type=image,name=localhost:5000/myrepo:image,push=true \
  --export-cache type=registry,ref=localhost:5000/myrepo:buildcache \
  --import-cache type=registry,ref=localhost:5000/myrepo:buildcache

--export-cache options:

type=registry
mode=<min|max>: specify cache layers to export (default: min)
- min: only export layers for the resulting image
- max: export all the layers of all intermediate steps
ref=<ref>: specify repository reference to store cache, e.g. docker.io/user/image:tag
image-manifest=<true|false>: whether to export cache manifest as an OCI-compatible image manifest rather than a manifest list/index (default: false, must be used with oci-mediatypes=true)
oci-mediatypes=<true|false>: whether to use OCI mediatypes in exported manifests (default: true, since BuildKit v0.8)
compression=<uncompressed|gzip|estargz|zstd>: choose compression type for layers newly created and cached, gzip is default value. estargz and zstd should be used with oci-mediatypes=true
compression-level=<value>: choose compression level for gzip, estargz (0-9) and zstd (0-22)
force-compression=true: forcibly apply compression option to all layers
ignore-error=<false|true>: specify if error is ignored in case cache export fails (default: false)

--import-cache options:

type=registry
ref=<ref>: specify repository reference to retrieve cache from, e.g. docker.io/user/image:tag

Local directory

buildctl build ... --export-cache type=local,dest=path/to/output-dir
buildctl build ... --import-cache type=local,src=path/to/input-dir

The directory layout conforms to OCI Image Spec v1.0.

--export-cache options:

type=local
mode=<min|max>: specify cache layers to export (default: min)
- min: only export layers for the resulting image
- max: export all the layers of all intermediate steps
dest=<path>: destination directory for cache exporter
tag=<tag>: specify custom tag of image to write to local index (default: latest)
image-manifest=<true|false>: whether to export cache manifest as an OCI-compatible image manifest rather than a manifest list/index (default: false, must be used with oci-mediatypes=true)
oci-mediatypes=<true|false>: whether to use OCI mediatypes in exported manifests (default true, since BuildKit v0.8)
compression=<uncompressed|gzip|estargz|zstd>: choose compression type for layers newly created and cached, gzip is default value. estargz and zstd should be used with oci-mediatypes=true.
compression-level=<value>: compression level for gzip, estargz (0-9) and zstd (0-22)
force-compression=true: forcibly apply compression option to all layers
ignore-error=<false|true>: specify if error is ignored in case cache export fails (default: false)

--import-cache options:

type=local
src=<path>: source directory for cache importer
tag=<tag>: specify custom tag of image to read from local index (default: latest)
digest=sha256:<sha256digest>: specify explicit digest of the manifest list to import

GitHub Actions cache (experimental)

buildctl build ... \
  --output type=image,name=docker.io/username/image,push=true \
  --export-cache type=gha \
  --import-cache type=gha

GitHub Actions cache saves both cache metadata and layers to GitHub's Cache service. This cache currently has a size limit of 10 GB that is shared across different caches in the repo. If you exceed this limit, GitHub will save your cache but will begin evicting caches until the total size is less than 10 GB. Recycling caches too often can result in slower runtimes overall.

Similarly to using actions/cache, caches are scoped by branch, with the default and target branches being available to every branch.

The following attributes are required to authenticate against the GitHub Actions Cache service API:

url: Cache server URL (default $ACTIONS_CACHE_URL)
token: Access token (default $ACTIONS_RUNTIME_TOKEN)

ℹ️ This type of cache can be used with Docker Build Push Action where url and token will be automatically set. To use this backend in an inline run step, you have to include crazy-max/ghaction-github-runtime in your workflow to expose the runtime.
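In an inline run step it can help to fail early when the runtime variables are not exposed. The sketch below only checks that the two default environment variables named above are set before printing the cache flags; gha_cache_flags is a hypothetical helper and does not contact the cache service.

```shell
# Print the gha cache flags, or fail if the default credentials are missing.
# gha_cache_flags is a hypothetical helper, not part of BuildKit.
gha_cache_flags() {
    if [ -z "${ACTIONS_CACHE_URL:-}" ] || [ -z "${ACTIONS_RUNTIME_TOKEN:-}" ]; then
        echo "ACTIONS_CACHE_URL and ACTIONS_RUNTIME_TOKEN must be set" >&2
        return 1
    fi
    # url and token fall back to the variables checked above.
    printf '%s\n' "--export-cache type=gha --import-cache type=gha"
}
```

A workflow step could then run `flags=$(gha_cache_flags) || exit 1` before invoking buildctl.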

--export-cache options:

type=gha
mode=<min|max>: specify cache layers to export (default: min)
- min: only export layers for the resulting image
- max: export all the layers of all intermediate steps
scope=<scope>: which scope cache object belongs to (default buildkit)
ignore-error=<false|true>: specify if error is ignored in case cache export fails (default: false)
timeout=<duration>: sets the timeout duration for cache export (default: 10m)

--import-cache options:

type=gha
scope=<scope>: which scope cache object belongs to (default buildkit)
timeout=<duration>: sets the timeout duration for cache import (default: 10m)

S3 cache (experimental)

buildctl build ... \
  --output type=image,name=docker.io/username/image,push=true \
  --export-cache type=s3,region=eu-west-1,bucket=my_bucket,name=my_image \
  --import-cache type=s3,region=eu-west-1,bucket=my_bucket,name=my_image

The following attributes are required:

bucket: AWS S3 bucket (default: $AWS_BUCKET)
region: AWS region (default: $AWS_REGION)

Storage locations:

blobs: s3://<bucket>/<prefix><blobs_prefix>/<sha256>, default: s3://<bucket>/blobs/<sha256>
manifests: s3://<bucket>/<prefix><manifests_prefix>/<name>, default: s3://<bucket>/manifests/<name>

S3 configuration:

blobs_prefix: global prefix to store / read blobs on s3 (default: blobs/)
manifests_prefix: global prefix to store / read manifests on s3 (default: manifests/)
endpoint_url: specify a specific S3 endpoint (default: empty)
use_path_style: if set to true, put the bucket name in the URL instead of in the hostname (default: false)
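To make the storage layout concrete, the sketch below composes object locations from the attributes above. It assumes that prefixes carry their own trailing slash (as the defaults blobs/ and manifests/ do), so the parts are simply concatenated; the function names are hypothetical and the digest is treated as an opaque string.

```shell
# Compose the S3 location of a cache blob.
# Usage: s3_blob_url BUCKET DIGEST [PREFIX] [BLOBS_PREFIX]
s3_blob_url() {
    bucket=$1
    digest=$2
    prefix=${3:-}              # default: empty
    blobs_prefix=${4:-blobs/}  # default: blobs/
    printf 's3://%s/%s%s%s\n' "$bucket" "$prefix" "$blobs_prefix" "$digest"
}

# Compose the S3 location of a cache manifest.
# Usage: s3_manifest_url BUCKET [NAME] [PREFIX] [MANIFESTS_PREFIX]
s3_manifest_url() {
    bucket=$1
    name=${2:-buildkit}                # default manifest name
    prefix=${3:-}
    manifests_prefix=${4:-manifests/}  # default: manifests/
    printf 's3://%s/%s%s%s\n' "$bucket" "$prefix" "$manifests_prefix" "$name"
}

s3_manifest_url my_bucket my_image
```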

AWS Authentication:

The simplest way is to use an IAM Instance profile. Other options are:

Any system using environment variables / config files supported by the AWS Go SDK. The configuration must be available for the buildkit daemon, not for the client.
Using the following attributes:
- access_key_id: Access Key ID
- secret_access_key: Secret Access Key
- session_token: Session Token

--export-cache options:

type=s3
mode=<min|max>: specify cache layers to export (default: min)
- min: only export layers for the resulting image
- max: export all the layers of all intermediate steps
prefix=<prefix>: set global prefix to store / read files on s3 (default: empty)
name=<manifest>: specify name of the manifest to use (default buildkit)
- Multiple manifest names can be specified at the same time, separated by ;. The standard use case is to use the git sha1 as name, and the branch name as duplicate, and load both with 2 import-cache commands.
ignore-error=<false|true>: specify if error is ignored in case cache export fails (default: false)

--import-cache options:

type=s3
prefix=<prefix>: set global prefix to store / read files on s3 (default: empty)
blobs_prefix=<prefix>: set global prefix to store / read blobs on s3 (default: blobs/)
manifests_prefix=<prefix>: set global prefix to store / read manifests on s3 (default: manifests/)
name=<manifest>: name of the manifest to use (default buildkit)
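The git-sha-plus-branch pattern mentioned for name can be sketched as follows. The helper prints one export flag carrying both names and two import flags, one per name. s3_cache_flags is a hypothetical name; the region, bucket, commit, and branch are supplied by the caller.

```shell
# Print S3 cache flags that export under two manifest names and import both.
# Usage: s3_cache_flags REGION BUCKET GIT_SHA BRANCH
s3_cache_flags() {
    region=$1
    bucket=$2
    sha=$3
    branch=$4
    base="type=s3,region=$region,bucket=$bucket"
    # The printed single quotes stop a shell from reading ';' as a
    # command separator when the line is pasted into a command.
    printf '%s\n' "--export-cache '$base,name=$sha;$branch'"
    printf '%s\n' "--import-cache $base,name=$sha"
    printf '%s\n' "--import-cache $base,name=$branch"
}

s3_cache_flags eu-west-1 my_bucket abc123 main
```

Importing by commit first and branch second lets a rebuild of the same commit hit its own cache, while a new commit can still start from the branch's most recent export.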

Azure Blob Storage cache (experimental)

buildctl build ... \
  --output type=image,name=docker.io/username/image,push=true \
  --export-cache type=azblob,account_url=https://myaccount.blob.core.windows.net,name=my_image \
  --import-cache type=azblob,account_url=https://myaccount.blob.core.windows.net,name=my_image

The following attributes are required:

account_url: The Azure Blob Storage account URL (default: $BUILDKIT_AZURE_STORAGE_ACCOUNT_URL)

Storage locations:

blobs: <account_url>/<container>/<prefix><blobs_prefix>/<sha256>, default: <account_url>/<container>/blobs/<sha256>
manifests: <account_url>/<container>/<prefix><manifests_prefix>/<name>, default: <account_url>/<container>/manifests/<name>

Azure Blob Storage configuration:

container: The Azure Blob Storage container name (default: buildkit-cache or $BUILDKIT_AZURE_STORAGE_CONTAINER if set)
blobs_prefix: Global prefix to store / read blobs on the Azure Blob Storage container (<container>) (default: blobs/)
manifests_prefix: Global prefix to store / read manifests on the Azure Blob Storage container (<container>) (default: manifests/)

Azure Blob Storage authentication:

There are 2 options supported for Azure Blob Storage authentication:

Any system using environment variables supported by the Azure SDK for Go. The configuration must be available for the buildkit daemon, not for the client.
Secret Access Key, using the secret_access_key attribute to specify the primary or secondary account key for your Azure Blob Storage account. See Azure Blob Storage account keys.

Note

Account name can also be specified with account_name attribute (or $BUILDKIT_AZURE_STORAGE_ACCOUNT_NAME) if it is not part of the account URL host.

--export-cache options:

type=azblob
mode=<min|max>: specify cache layers to export (default: min)
- min: only export layers for the resulting image
- max: export all the layers of all intermediate steps
prefix=<prefix>: set global prefix to store / read files on the Azure Blob Storage container (<container>) (default: empty)
name=<manifest>: specify name of the manifest to use (default: buildkit)
- Multiple manifest names can be specified at the same time, separated by ;. The standard use case is to use the git sha1 as name, and the branch name as duplicate, and load both with 2 import-cache commands.
ignore-error=<false|true>: specify if error is ignored in case cache export fails (default: false)

--import-cache options:

type=azblob
prefix=<prefix>: set global prefix to store / read files on the Azure Blob Storage container (<container>) (default: empty)
blobs_prefix=<prefix>: set global prefix to store / read blobs on the Azure Blob Storage container (<container>) (default: blobs/)
manifests_prefix=<prefix>: set global prefix to store / read manifests on the Azure Blob Storage container (<container>) (default: manifests/)
name=<manifest>: name of the manifest to use (default: buildkit)

Consistent hashing

If you have multiple BuildKit daemon instances but don't want to use a registry for sharing cache across the cluster, consider client-side load balancing using consistent hashing.

See ./examples/kubernetes/consistenthash.
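To illustrate the idea of sending the same build key to the same daemon, here is a minimal sketch using rendezvous (highest-random-weight) hashing, a simple relative of ring-based consistent hashing. It is not taken from the example directory above, which may use a different algorithm. The function name, the key, and the daemon addresses are all hypothetical, and cksum stands in for a real hash function.

```shell
# Pick one daemon address for a build key using rendezvous hashing.
# Usage: pick_daemon KEY ADDR...
pick_daemon() {
    key=$1
    shift
    best_addr=
    best_score=-1
    for addr in "$@"; do
        # cksum prints "<crc> <byte count>"; keep only the CRC as the score.
        score=$(printf '%s|%s' "$key" "$addr" | cksum | cut -d ' ' -f 1)
        if [ "$score" -gt "$best_score" ]; then
            best_score=$score
            best_addr=$addr
        fi
    done
    printf '%s\n' "$best_addr"
}

pick_daemon myrepo tcp://bk0:1234 tcp://bk1:1234 tcp://bk2:1234
```

Because each daemon is scored independently, removing a daemon that was not selected leaves the choice for that key unchanged, which is the property that keeps local caches warm.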

Metadata

To output build metadata such as the image digest, pass the --metadata-file flag. The metadata will be written as a JSON object to the specified file. The directory of the specified file must already exist and be writable.

buildctl build ... --metadata-file metadata.json

jq '.' metadata.json

{
  "containerimage.config.digest": "sha256:2937f66a9722f7f4a2df583de2f8cb97fc9196059a410e7f00072fc918930e66",
  "containerimage.descriptor": {
    "annotations": {
      "config.digest": "sha256:2937f66a9722f7f4a2df583de2f8cb97fc9196059a410e7f00072fc918930e66",
      "org.opencontainers.image.created": "2022-02-08T21:28:03Z"
    },
    "digest": "sha256:19ffeab6f8bc9293ac2c3fdf94ebe28396254c993aea0b5a542cfb02e0883fa3",
    "mediaType": "application/vnd.oci.image.manifest.v1+json",
    "size": 506
  },
  "containerimage.digest": "sha256:19ffeab6f8bc9293ac2c3fdf94ebe28396254c993aea0b5a542cfb02e0883fa3"
}
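A common follow-up is to turn the digest into a pinned reference of the form name@<digest>. The sketch below assumes jq is installed (as in the command above) and that the metadata file contains the containerimage.digest key shown in the sample; image_digest and pinned_ref are hypothetical helper names.

```shell
# Read the image digest from a file written by --metadata-file.
image_digest() {
    # The key contains dots, so it must be quoted inside the jq filter.
    jq -r '.["containerimage.digest"]' "$1"
}

# Print an image reference pinned to the built digest.
# Usage: pinned_ref IMAGE_NAME METADATA_FILE
pinned_ref() {
    printf '%s@%s\n' "$1" "$(image_digest "$2")"
}
```

For example, `pinned_ref docker.io/username/image metadata.json` prints the image name followed by `@` and the digest from the file.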

Systemd socket activation

On Systemd-based systems, you can communicate with the daemon via Systemd socket activation by running buildkitd --addr fd://. Examples of using Systemd socket activation with BuildKit are in ./examples/systemd.

Expose BuildKit as a TCP service

The buildkitd daemon can serve the gRPC API on a TCP socket.

It is highly recommended to create TLS certificates for both the daemon and the client (mTLS). Enabling TCP without mTLS is dangerous because the executor containers (aka Dockerfile RUN containers) can call BuildKit API as well.

buildkitd \
  --addr tcp://0.0.0.0:1234 \
  --tlscacert /path/to/ca.pem \
  --tlscert /path/to/cert.pem \
  --tlskey /path/to/key.pem

buildctl \
  --addr tcp://example.com:1234 \
  --tlscacert /path/to/ca.pem \
  --tlscert /path/to/clientcert.pem \
  --tlskey /path/to/clientkey.pem \
  build ...

Load balancing

buildctl build can be called against randomly load balanced buildkitd daemons.

See also Consistent hashing for client-side load balancing.

Containerizing BuildKit

BuildKit can also be used by running the buildkitd daemon inside a Docker container and accessing it remotely.

We provide the container images as moby/buildkit:

moby/buildkit:latest: built from the latest regular release
moby/buildkit:rootless: same as latest but runs as an unprivileged user, see docs/rootless.md
moby/buildkit:master: built from the master branch
moby/buildkit:master-rootless: same as master but runs as an unprivileged user, see docs/rootless.md

To run daemon in a container:

docker run -d --name buildkitd --privileged moby/buildkit:latest
export BUILDKIT_HOST=docker-container://buildkitd
buildctl build --help

Podman

To connect to a BuildKit daemon running in a Podman container, use podman-container:// instead of docker-container://.

podman run -d --name buildkitd --privileged moby/buildkit:latest
buildctl --addr=podman-container://buildkitd build --frontend dockerfile.v0 --local context=. --local dockerfile=. --output type=oci | podman load foo

sudo is not required.

Nerdctl

To connect to a BuildKit daemon running in a Nerdctl container, use nerdctl-container:// instead of docker-container://.

nerdctl run -d --name buildkitd --privileged moby/buildkit:latest
buildctl --addr=nerdctl-container://buildkitd build --frontend dockerfile.v0 --local context=. --local dockerfile=. --output type=oci | nerdctl load

sudo is not required.

Kubernetes

For Kubernetes deployments, see examples/kubernetes.

Daemonless

To run the client and an ephemeral daemon in a single container ("daemonless mode"):

docker run \
    -it \
    --rm \
    --privileged \
    -v /path/to/dir:/tmp/work \
    --entrypoint buildctl-daemonless.sh \
    moby/buildkit:master \
        build \
        --frontend dockerfile.v0 \
        --local context=/tmp/work \
        --local dockerfile=/tmp/work

docker run \
    -it \
    --rm \
    --security-opt seccomp=unconfined \
    --security-opt apparmor=unconfined \
    -e BUILDKITD_FLAGS=--oci-worker-no-process-sandbox \
    -v /path/to/dir:/tmp/work \
    --entrypoint buildctl-daemonless.sh \
    moby/buildkit:master-rootless \
        build \
        --frontend dockerfile.v0 \
        --local context=/tmp/work \
        --local dockerfile=/tmp/work

OpenTelemetry support

BuildKit supports OpenTelemetry for the buildkitd gRPC API and buildctl commands. To capture the trace to Jaeger, set the JAEGER_TRACE environment variable to the collection address.

docker run -d -p6831:6831/udp -p16686:16686 jaegertracing/all-in-one:latest
export JAEGER_TRACE=0.0.0.0:6831
# restart buildkitd and buildctl so they know JAEGER_TRACE
# any buildctl command should be traced to http://127.0.0.1:16686/

On Windows, if you are running Jaeger outside of a container (jaeger-all-in-one.exe), set the environment variable with setx -m JAEGER_TRACE "0.0.0.0:6831" and restart buildkitd in a new terminal; the traces will then be collected automatically.

Running BuildKit without root privileges

Please refer to docs/rootless.md.

Building multi-platform images

Please refer to docs/multi-platform.md.

Configuring `buildctl`

Color Output Controls

buildctl has support for modifying the colors that are used to output information to the terminal. You can set the environment variable BUILDKIT_COLORS to something like run=green:warning=yellow:error=red:cancel=255,165,0 to set the colors that you would like to use. Setting NO_COLOR to anything will disable any colorized output as recommended by no-color.org.

Parsing errors will be reported but ignored. This will result in default color values being used where needed.

See the list of pre-defined colors.
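The BUILDKIT_COLORS format is a colon-separated list of key=value pairs. The sketch below splits such a string into one pair per line, which can help when validating a value before exporting it. It is not buildctl's own parser: parse_colors is a hypothetical helper, and skipping malformed entries with a warning is this sketch's choice.

```shell
# Split a BUILDKIT_COLORS-style string into "key value" lines.
parse_colors() {
    old_ifs=$IFS
    IFS=:
    for pair in $1; do
        case $pair in
            *=*)
                # Key is everything before the first "=", value the rest.
                printf '%s %s\n' "${pair%%=*}" "${pair#*=}"
                ;;
            *)
                echo "ignoring malformed entry: $pair" >&2
                ;;
        esac
    done
    IFS=$old_ifs
}

parse_colors 'run=green:warning=yellow:error=red:cancel=255,165,0'
```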

Number of log lines (for active steps in tty mode)

You can change how many log lines are visible for active steps in tty mode by setting BUILDKIT_TTY_LOG_LINES to a number (default: 6).

Contributing

Want to contribute to BuildKit? Awesome! You can find information about contributing to this project in CONTRIBUTING.md.

buildkit's People

akihirosuda dmcgowan vdemeester tonistiigi justincormack thajeztah tiborvass imranansari curtisz pchico83 dafoo abailly tomastomecek lalyos kunalkushwaha f0 tklauser zerocry jessfraz trusch mbrukman alexxnica kryndex ijc georgekuruvillak unkaktus yushangbin purplesmoke05 fermayo hmlinux cckuok yastij borjaburgos westonsteimel littlelotta dhiltgen tomwillfixit orisano borbediana seemethere r2d4 antaress quinndiggity brawong yui-knk ii0 danielfallon int-tt adshmh hephaex anusha-ragunathan crosbymichael lumjjb fuweid hharnisc lowenna the-cc-dev hinshun nuzumglobal spaceyii chris-crone pks-os 13768324554 poacher69 alicefr davidswu ondrej-fabry faspl foobargirl macros koolhead17 rowhit lukahartwig alvagante resilientred mattlk13 vbraziel dalavancloud vanstee pslacerda h0axd erichripko ehazlett hpandeycodeit homburg christiankniep bhanditz chendave natelipus0x01 didacog juner417 jordonbiondo steven-zou beleo dschmidt etsangsplk lugeng po3rin vincentmei1734 cur3n4

buildkit's Issues

containerd: WORKDIR fails when the directory does not exist

This should create a directory. #dibs.

RFC: Distributed BuildKit (Swarm/Kubernetes/Mesos..)

Update (Dec 19, 2017): Children issues can be found here: area/distributed
Update (Nov 13, 2017): newer doc would be in docs/misc/design-distributed-mode.md: #160

PTAL
https://docs.google.com/presentation/d/18ZJRm_0h25GP0uvDDEugAeeOkB6x8nOWLVUwBroV0X4/edit?usp=sharing

Agenda:

BuildKit cluster orchestration
Cache placement & Scheduling
Artifact output

Highly related issues:

#58 (instruction cache)
#30 (snapshotter/content store)
#28 (networking)

[FeatureRequest] Respect trailing slashes in .dockerignore

If your .dockerignore looks like this

.*/

you probably want to exclude all hidden folders (but not hidden files). Using Docker 17.03.1-ce, both hidden files and hidden folders are ignored. I'd love to see that feature in a future release.

rename cache pkg to snapshot

The cache package is weirdly named: it contains the manager for snapshot references (which are automatically garbage collected, hence the name cache). On the other hand, the snapshot package is almost unused. I think it would make sense to move that code under snapshot (snapshot.ImmutableRef, snapshot.Manager, etc.) and leave the cache package for instruction/content cache.

@AkihiroSuda wdyt?

windows: propagate env across ops?

moby/moby#29048: GETENV (cc @jhowardmsft)
moby/moby#31525: ENV --lazy-expand (cc @simonferquel)

This proposal enables propagating env vars across ops.
RFC.
Also, we should look into whether we want other things to be propagated as well.

solver/pb/ops.proto:

message Meta {
        repeated string args = 1;
        // If env_input is non-negative, the initial env is captured from the cache metadata of the corresponding op.
        int64 env_input = 2;
        // Traditional env.
        repeated string env = 3;
        // List of "capturable" env keys. e.g. {"PATH"}
        // We require this field so as to prevent secure info from being leaked into the cache.
        repeated string env_capturable_keys = 4;
        string cwd = 5;
}

worker/worker.go:

type Worker interface {
        Exec(ctx context.Context, meta Meta, rootfs cache.Mountable, mounts []Mount, stdout, stderr io.WriteCloser) (capturedEnv []string, err error)
}
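
To illustrate the intent of env_capturable_keys, here is a rough Go sketch of the filtering a worker could apply before returning capturedEnv. The `captureEnv` helper is hypothetical, not part of the proposal, and it compares keys case-sensitively (Windows env keys are case-insensitive, so a real implementation would need to normalize them).

```go
package main

import (
	"fmt"
	"strings"
)

// captureEnv keeps only the KEY=VALUE entries whose key is listed in
// capturable, so that other variables never reach the cache metadata.
func captureEnv(env, capturable []string) []string {
	allowed := make(map[string]bool, len(capturable))
	for _, k := range capturable {
		allowed[k] = true
	}
	var out []string
	for _, kv := range env {
		key, _, found := strings.Cut(kv, "=")
		if found && allowed[key] {
			out = append(out, kv)
		}
	}
	return out
}

func main() {
	env := []string{"PATH=/usr/local/bin:/usr/bin", "TOKEN=secret"}
	fmt.Println(captureEnv(env, []string{"PATH"}))
}
```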

Add persistence layer to cache manager

Currently, all cache references are cleared on daemon start. When there are changes they should be persisted to disk. There are TODO comments in most of these places in cache pkg and boltdb prelude.

Add http source

HTTP source would give access to the archives pulled from the web. It can use headers for detecting when the contents have not changed.
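
One way the header check could look, sketched in Go with standard conditional requests (If-None-Match / If-Modified-Since). The `validators` type and both helpers are hypothetical names; the issue does not say which headers the source would actually use.

```go
package main

import (
	"fmt"
	"net/http"
)

// validators holds the cache validators remembered from a previous fetch.
type validators struct {
	ETag         string
	LastModified string
}

// conditionalRequest builds a GET that lets the server skip the body
// when the remembered copy is still current.
func conditionalRequest(url string, v validators) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	if v.ETag != "" {
		req.Header.Set("If-None-Match", v.ETag)
	}
	if v.LastModified != "" {
		req.Header.Set("If-Modified-Since", v.LastModified)
	}
	return req, nil
}

// unchanged reports whether the response status means the cached
// archive can be reused (304 Not Modified).
func unchanged(statusCode int) bool {
	return statusCode == http.StatusNotModified
}

func main() {
	req, err := conditionalRequest("https://example.com/archive.tar.gz", validators{ETag: `"abc123"`})
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Header.Get("If-None-Match"))
}
```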

Add git source

Git source should be able to pull in data directly based on git repo reference.

It should create one mutable snapshot reference per repository (or maybe even a single static repo). When the source is called repeatedly, it should just fetch the updates and then check out to a new snapshot that is actually used. It could also detect when the commit SHA has not changed and return the same snapshot reference.
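
A toy Go model of the "same commit, same snapshot" idea. Everything here (`gitSource`, `Snapshot`, the snapshot IDs) is hypothetical and stands in for the real fetch and checkout work.

```go
package main

import "fmt"

// gitSource remembers which snapshot was produced for each
// (repository, commit) pair.
type gitSource struct {
	snapshots map[string]string // "repo@sha" -> snapshot ID
	checkouts int               // number of real checkouts performed
}

func newGitSource() *gitSource {
	return &gitSource{snapshots: map[string]string{}}
}

// Snapshot returns the snapshot for repo at sha, performing a checkout
// only when this commit has not been seen before.
func (g *gitSource) Snapshot(repo, sha string) string {
	key := repo + "@" + sha
	if id, ok := g.snapshots[key]; ok {
		return id
	}
	g.checkouts++
	id := fmt.Sprintf("snapshot-%d", g.checkouts)
	g.snapshots[key] = id
	return id
}

func main() {
	g := newGitSource()
	a := g.Snapshot("github.com/moby/buildkit", "537b5e4")
	b := g.Snapshot("github.com/moby/buildkit", "537b5e4")
	fmt.Println(a == b, g.checkouts) // true 1
}
```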

project: add MAINTAINERS, github description, and so on

It might be good to add a MAINTAINERS file and clarify the number of maintainer LGTMs (1 atm, right?) required to merge PRs, for clear governance.

@tonistiigi @thaJeztah

Print nice errors for build failures

Build failures should gracefully shut down running jobs and show the failed action with appropriate logs.

Later it should provide a way for launching a debug container into the same location where the error happened. Either with ctr or buildctl debug run

Parent snapshots should not show as in-use in du

There is a bug where a parent snapshot that is only referenced by a child snapshot is shown in buildctl du the same way as a snapshot that is currently in use.

Add mutable metadata for snapshots

Currently, a blob mapping can be added to a snapshot, but this should be general-purpose metadata that can be used for instruction cache values, cache policy, and secure content checksums.

moby/moby#32677 contains similar features

Multi-stage Build Issues

TL;DR

The current semantics of --from intrinsically induce pathological coupling between build stages. Its intimate binding to build stage implementation opposes the principle of encapsulation necessary to permit reuse, as well as reason, in isolation, about an individual stage's behavior. By defeating encapsulation, --from thwarts applying current Dockerfile reuse features, such as ONBUILD and inhibits the introduction of future reuse mechanisms.

To avoid the harmful traits associated with --from, the existing Build Context abstraction should be adapted so its content can be extended by mounting a stage's image file path into it, instead of introducing the new stage/image reference concept to Dockerfile development. By extending its content and introducing a mapping mechanism to the existing Build Context abstraction, the --from syntax can be eliminated, current reuse features restored, and the introduction of new reuse mechanisms unencumbered.

--from Issues
- Tight, Pathological Coupling
- Precludes ONBUILD Triggers
- Ignores Aggregate Build Context
- Complexity due to added Dockerfile abstractions
- ~~Extra Build Stage & Redundant COPYing~~ Extra layers are OK see comment
Recommendations
Comparison: Current Multistage Design vs. Recommended

Issue: Tight, Pathological Coupling

The design of --from ensures the COPY instruction tightly couples itself to the implementation of other build stages. Tight coupling results from --from’s purposely crafted facility to directly reference artifacts of other build stages, within a given Dockerfile, by stage names/positions and their physical locations (paths) in those other images.

This pathological coupling, encouraging the internals of any build stage to intimately bind themselves to any other stage within a Dockerfile, eliminates the interface boundary between stages. This absence of an interface boundary negates encapsulation, prohibiting human developers and algorithms from considering an individual build stage as a “black box” when defining or analyzing its behavior.

The issue expresses itself by:

Increasing the difficulty of implementing future features that encourage Dockerfile reuse, due to the absence of encapsulation, as well as discouraging the use of existing ones (ONBUILD).
Dramatically increasing the amount of manual code produced by a developer, because existing "boilerplate" code cannot be reused due to its direct, rigid binding to a particular artifact (file) instance.
Allowing simple changes, like renaming a directory containing a set of artifacts or inserting/removing a build stage, to ripple through the entire set of Dockerfile commands that reference this directory or build stage.

Issue: Precludes ONBUILD Trigger Support

ONBUILD trigger support enables a developer to declaratively encode an image’s transform behavior: operations responsible for converting a set of input artifacts to output ones. This declarative code includes a specification of an input interface followed by command(s) that execute a transform. The input interface definition emerges from the union of source file artifact (directory/filename) references specified by the triggered ADD/COPY Dockerfile commands and is statically defined during the construction of the ONBUILD image while the transform consists of one or more RUN commands.

Example

Create a golang compiler image that executes ONBUILD commands to automatically produce a golang executable image but not run it. Define the input interface: the path to copy golang source file(s) for the compiler image's Build Context, as /golang/app. Name the compiler image exgolang. Create the Dockerfile for this image by modifying a copy of the Docker Hub golang:1.7-onbuild image Dockerfile.

Dockerfile Contents:

FROM golang:1.7
RUN mkdir -p /go/src/app
WORKDIR /go/src/app
# Union the source argument of each COPY/ADD to determine the trigger's 'input interface'.
# Only one COPY instruction with single source argument of ‘/golang/app’.  Therefore,
# this trigger's 'input interface' is '/golang/app'.
ONBUILD COPY /golang/app /go/src/app
ONBUILD RUN go-wrapper download
ONBUILD RUN go-wrapper install

To reuse the defined trigger behavior, simply encode a FROM statement that references the image name (FROM exgolang) configured with ONBUILD commands. By promoting the DRY principle, ONBUILD triggers dramatically increase an image’s build time utility, reliability, and adaptability while simultaneously eliminating or greatly decreasing the code required to employ this image in other Dockerfiles by other developers. Given this understanding, an ONBUILD trigger definition is remarkably akin to a function definition.

Example

Using the exgolang image created above, generate a golang server executable from source server.go located in /golang/app/.

Build Context

Dockerfile
golang/app/
  server.go

Dockerfile

FROM exgolang

Docker build command:

> docker build -t server .

The single instruction Dockerfile above when executed by docker build:

Copies golang source from the Build Context directory /golang/app into the image directory /go/src/app.
Downloads any dependent golang packages.
Runs the compiler generating the executable file /go/bin/app from server.go that resides in the resultant image's file system.

As described and demonstrated by example, images incorporating ONBUILD statements are analogous to function definitions. This similarity extends to the equivalence of an ONBUILD image's input interface to a function's parameter list. As in the case of a function parameter list, an ONBUILD image's body: the series of ONBUILD statements, binds (couples) to the file paths referenced by each instruction just like statements within a function body bind to its parameters. For example, the COPY issued by the trigger statement ONBUILD COPY /golang/app /go/src/app binds to the source file path: /golang/app. This file path: /golang/app is equivalent to a parameter defined for a function and performs a similar role, as it represents an interface element. Given this equivalence, why isn't there a mapping mechanism, like the one implemented for functions, that maps arguments specified by an invocation statement to parameters?

When formulating ONBUILD support, the design avoided implementing an argument to parameter mapping mechanism on the trigger invocation statement: FROM. Although this mapping mechanism is intrinsic to function invocation, I speculate, at the time when trigger support was implemented, the multistage build feature was a distant, future consideration. Meanwhile, the limitation of a single stage Dockerfile masked this issue, as the Build Context could be structured to mirror the input interface required by a single stage's ONBUILD triggers. In other words, the Build Context file path (argument) names exactly match the (parameter) names required by the ONBUILD ADD/COPY instructions. However, introducing multistage builds starkly silhouettes the absence of an argument to parameter mapping mechanism.

Multistage support forces the once "elemental" Build Context, whose content and structure was dictated by the needs of a single FROM, to become a composite one that must comply to the dependencies of two or more FROM statements. Since the problems inherent to the transformation from an elemental to composite Build Context diminish not only trigger support but also affect non-trigger statements that follow a FROM, their discussion occurs in the topic: Issue: Ignores Aggregate Build Context below. Besides this issue of composite Build Contexts, pathological coupling introduced by --from impedes applying ONBUILD triggers.

COPY trigger instructions are currently bound at the time of their creation to a Build Context file path. If COPY were to include --from, which stage name/position should it bind to, given that it has to resolve the stage name within the context of all other existing and future Dockerfiles? Unfortunately, without introducing another mechanism to rebind the source file path references specified by ONBUILD COPY instructions within the scope of its invocation, it's very difficult within a multistage Dockerfile to reuse existing trigger-enabled images once, let alone twice.

Issue: Ignores Aggregate Build Context

Since the Dockerfile semantics before incorporating multistage assumed a single FROM statement, the expected Build Context reflected only those source artifacts located in the directory structure required by ADD/COPY commands immediately following FROM. Incorporating many FROM statements within a single Dockerfile requires a means to initially compose/aggregate the Build Context with the more elemental ones needed by each FROM then partition this composite/aggregate to supply the specific (elemental) Build Context expected by an individual FROM (stage).

Example

Using the exgolang image created above, attempt to generate three golang server executables from an Aggregate Build Context. Note, issues related to partitioning the Aggregate Build Context are broadly applicable to any multistage Dockerfile without regard to its use of ONBUILD.

Build Context

Dockerfile
golang/app/
  server.go
golang/app2/
  server.go
golang/app3/
  server.go

Dockerfile

FROM exgolang
# the following stage will simply recompile golang/app/server.go instead of golang/app2/server.go
FROM exgolang
# the following stage will simply recompile golang/app/server.go instead of golang/app3/server.go
FROM exgolang

Docker build command:

/server > docker build -t servers .

Unfortunately, the multistage build design ignores addressing Aggregate Build Context issues by failing to provide a mechanism that both partitions and restructures the Aggregate Build Context to supply the elemental Build Context needed by a specific FROM. Therefore, executing the above docker build command copies the same golang source /server/golang/app/server.go into three distinct images, runs the compiler and generates the same server executable writing it to each image's /go/bin directory.

Additionally, when incorporating stages referencing ONBUILD triggers, current multistage Dockerfile support not only inhibits their use but when "it works" the outcome can be dangerous, especially when the trigger assumes a Build Context interface of "." (everything interface) as in COPY . /go/src. In this situation, the entire Aggregate Build Context would be accessible to any stage, thereby, polluting an individual stage's source artifact set with artifacts from all other stages.

Issue: Complexity due to added Dockerfile abstractions

Any worthwhile program must apply coupling to map its abstractions to an implementation. However, it's important to minimize coupling whenever possible. One method to reduce coupling relies on limiting the abstractions required to only the essential ones applicable to realize the encoded algorithm's objective.

The purpose of a Dockerfile is to provide the scaffolding needed to deliver source artifact(s) to a transform that then produces output artifact(s). Since the transforms, executed by the RUN command, rely on reading and writing to files within a file system, the source artifacts must be eventually mapped as files within a file system. Perhaps due to a desire to align with this necessity, the Build Context abstraction responsible for providing source artifacts was also designed to represent source artifacts as files within a file system. This design choice, matching the representation of the Build Context with the one required by the underlying transforms (files in a file system), resulted in Dockerfile commands, like COPY, whose syntax and behavior nearly mirrors that of a corresponding OS command, such as cp, and facilitated Dockerfile adoption by leveraging a developer's existing understanding of it.

The introduction of COPY --from adds a new abstraction: stage/image reference, to Dockerfile coding. This additional abstraction necessitated changing COPY's interface and weaving the resolution of stage/image references into its implementation so COPY's binding mechanisms could differentiate between Build Context and other stage/image sources. Besides adding some complexity to applying COPY, introducing the stage/image reference abstraction imposes implications for features that rely on COPY's behavior. When assessing these implications one hopes for beneficial or neutral outcomes regarding their effect. However in this situation, the rigid binding of --from to a particular stage/image precludes the use of COPY --from in any current reuse mechanism, such as ONBUILD, or future one. This negative outcome not only prevents reuse mechanisms, like ONBUILD, from referencing other stages/images but also diminishes the utility of --from, as it can't be applied in all valid contexts of the COPY instruction.

An often cited strength of Unix derivative OSes is their insistence on mapping various abstractions, like hard drives, IPC, ... to a file. Therefore, instead of adding complexity by creating a corresponding concrete OS concept for each supported device/abstraction, which in many cases would only offer a slightly different interface, Unix designers mapped new abstractions (especially devices) to a single one - the file. Once mapped, the majority of the code written to manage/manipulate this single abstraction (file) immediately applies to the new one. Since image/stage references are essentially file path references, perhaps, in lieu of explicitly exposing --from's stage/image reference abstraction, it should be mapped to an existing abstraction: the Build Context.

Recasting the stage/image references as file paths in the Build Context confers the following benefits:

Reduces complexity by eliminating the explicit stage/image reference abstraction and the --from option. COPY reverts to its prior, simpler syntax.
Limits artifact coupling to only Build Context file paths which existed before multi-stage support.
Existing or future mechanisms that apply to a Build Context, within a Dockerfile, like partitioning, renaming, and restructuring also immediately apply to artifacts contributed by other stages within a Dockerfile without writing additional code.

Issue: Extra Build Stage & Redundant COPYing

If the objective of a multistage build is the creation of a single layer representing a runtime image, the current semantics of COPY --from requires an extra build stage and redundant COPYing when the resultant build artifacts must be assembled from more than one build stage or image.

~~##### Example~~
~~Applying the current semantics of COPY --from, create a golang webserver whose stdout and stderr is redirected to a remote logging facility as a single layer in the resulting image.~~
```
FROM golang:nanoserver as webserver
COPY /web /code
WORKDIR /code
RUN go build webserver.go

FROM golang:nanoserver as remotelogger
COPY /remotelogger /code
WORKDIR /code
RUN go build remotelogger.go

# extra build stage and physical copying due to semantics of COPY --from in order
# to generate single layer in next build stage
FROM scratch as extra_redundant_copying
COPY --from=webserver /code/webserver.exe /redundant/webserver.exe
COPY --from=remotelogger /code/remotelogger.exe /redundant/remotelogger.exe
COPY /script/pipem.ps1 /redundant

FROM microsoft/nanoserver as extra_redundant_copying
COPY --from=extra_redundant_copying /redundant /
CMD ["\pipem.ps1"]
EXPOSE 8080
```
The above situation generalizes to N extra build stages and X redundant copy operations when there's a desire to create a resultant image of N layers where each layer requires artifacts from more than a single stage.

Recommendations:

Eliminate direct coupling to artifacts within images from other build stages by removing --from as an option to COPY.
Support a mapping mechanism that partitions, restructures, and renames file paths defined in the Aggregate (Global) Build Context so the resulting mapped version matches the (Local) Build Context required by an individual stage. A mapping mechanism satisfying these qualities has already been proposed and explored by #12072. In a nutshell, the mechanism, implemented by the keyword CONTEXT, mounts the desired Aggregate Build Context file paths, similar to docker run -v option, into the Build Context created for an individual stage.
Support a mechanism to allow a build stage to extend the Aggregate Build Context with the output artifacts produced by that stage. Proposal #12415 offers a solution MOUNT that's analogous to CONTEXT. However, MOUNT mounts an image's file path into the Aggregate Build context instead of mounting it into the stage's local Build Context.

Applying the recommendations above, compared to the currently implemented multistage design, would:

Promote encoding Dockerfiles with current and future reusable build mechanisms.
Seamlessly integrate with existing Dockerfile abstractions, such as Build Context and ONBUILD triggers.
Dramatically reduce the Dockerfile code required to reuse an image when building a new one.
Eliminate the necessity of encoding extra build stages and the overhead of redundant copying.
Foster innately declarative mechanisms of CONTEXT and MOUNT proposed by the links referenced above.

Comparison: Current Multistage Design vs. Recommended

The examples below concretely contrast, through the encoding of the same scenario, the benefits offered by the recommended approach when compared to the existing multistage design.

Scenario

Using already available Docker Hub images, construct a container composed of three independent golang executables. One executable implements a webserver, another a logging device that relays messages to a remote server, while the third reports on the webserver's health.

Initial Build Context

The initial Build Context common to both examples.

Build Context (initial aggregate/global context)

  Dockerfile
  script.sh
  go/src/webserver/
    server.go
  go/src/logger/
    server.go
  go/src/health/
    server.go

Example: Current Multistage Design

FROM golang:1.7 AS webserver
COPY /go/src/webserver /go/src/webserver
WORKDIR /go/src/webserver
RUN go-wrapper download              \
 && export GOBIN=/go/bin             \
 && go-wrapper install server.go
FROM golang:1.7 AS logger
COPY /go/src/logger /go/src/logger
WORKDIR /go/src/logger
RUN go-wrapper download              \
 && export GOBIN=/go/bin             \
 && go-wrapper install server.go
FROM golang:1.7 AS health
COPY /go/src/health /go/src/health
WORKDIR /go/src/health
RUN go-wrapper download              \
 && export GOBIN=/go/bin             \
 && go-wrapper install server.go
FROM scratch AS requiredExtra
COPY --from=webserver /go/bin/server /final/bin/webserver
COPY --from=logger    /go/bin/server /final/bin/logger
COPY --from=health    /go/bin/server /final/bin/health
COPY /script.sh /start.sh
FROM alpine
COPY --from=requiredExtra /final /start.sh  /
ENTRYPOINT /start.sh
EXPOSE 8080

Example: Recommended Multistage Design

FROM golang:1.7-onbuild CONTEXT /go/src/webserver/:/  MOUNT /go/bin/app:/final/bin/webserver  #1
FROM golang:1.7-onbuild CONTEXT /go/src/logger/:/     MOUNT /go/bin/app:/final/bin/logger     #2
FROM golang:1.7-onbuild CONTEXT /go/src/health/:/     MOUNT /go/bin/app:/final/bin/health     #3
FROM alpine CONTEXT /final/bin:/bin  /script.sh:/start.sh   #4
COPY . /   #5
ENTRYPOINT /start.sh
EXPOSE 8080

Differences

Recommended Multistage Design when compared to Current Multistage Design:

Encourages more declarative solutions by:
- leveraging reuse features, such as ONBUILD, that minimize developer produced code and
- declaring external data dependencies via CONTEXT & MOUNT separately from the Dockerfile operations like COPY.
Seamlessly leverages current ONBUILD images.
Eliminates harmful coupling by replacing direct, rigid physical stage/image references with Build Context file paths that can be rebound, through a standard mapping mechanism, when running the Dockerfile.
Addresses issue of partitioning, structuring, and renaming Aggregate Build Context artifacts using a syntax and behavior similar to docker run -v.
Eliminates complexity of --from and stage/image reference support by replacing both with a mapping mechanism that encourages encapsulation.
Eliminates encoding extra build stage(s) and redundant copying.
Clearly delineates the input and output artifacts aiding developer comprehension.
Simplifies DAG analysis, as only FROM instructions need be parsed to reveal the data dependencies between stages.

Example: Recommended Multistage Design: Explained

CONTEXT partitions the initial Aggregate Build Context to present the Local Build Context required by the FROM. For this stage, the webserver's golang source named server.go is the only file that appears in the "root" dir of the Local Build Context. Once this stage finishes, MOUNT associates the file /go/bin/app located in the last container created by this stage to the Aggregate Build Context as /final/bin/webserver.

Local Build Context

 server.go

Aggregate Build Context

Dockerfile
script.sh
go/src/webserver/
  server.go
go/src/logger/
  server.go
go/src/health/
  server.go
final/bin/
  webserver

CONTEXT partitions the initial Aggregate Build Context to present the Local Build Context required by the FROM image. For this stage, the logger's golang source named server.go is the only file that appears in the "root" dir of the Local Build Context. Once this stage finishes, MOUNT associates the file /go/bin/app located in the last container created by this stage to the Aggregate Build Context as /final/bin/logger.

Local Build Context

 server.go

Aggregate Build Context

Dockerfile
script.sh
go/src/webserver/
  server.go
go/src/logger/
  server.go
go/src/health/
  server.go
final/bin/
  webserver
  logger

CONTEXT partitions the initial Aggregate Build Context to present the Local Build Context required by the FROM image. For this stage, the health's golang source named server.go is the only file that appears in the "root" dir of the Local Build Context. Once this stage finishes, MOUNT associates the file /go/bin/app located in the last container created by this stage to the Aggregate Build Context as /final/bin/health.

Local Build Context

 server.go

Aggregate Build Context

Dockerfile
script.sh
go/src/webserver/
  server.go
go/src/logger/
  server.go
go/src/health/
  server.go
final/bin/
  webserver
  logger
  health

CONTEXT partitions the Aggregate Build Context extended by stages 1-3 by isolating the contents of /final/bin/ directory and projecting (renaming) it as /bin/. Additionally the shell script script.sh is renamed to start.sh.

Local Build Context

  start.sh
  bin/
    webserver
    logger
    health

Create a single layer by COPYing the Local Build Context into the root directory of alpine.
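
The projection performed by the proposed CONTEXT keyword can be sketched in Go as a prefix rewrite over the Aggregate Build Context. This is only an illustration of the walkthrough above: CONTEXT is a proposal rather than an existing Dockerfile instruction, `localContext` is a hypothetical helper, and plain prefix matching is a simplification (it does not check path-element boundaries).

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// localContext projects aggregate Build Context paths into a local
// Build Context. mappings is source prefix -> destination prefix,
// mirroring the hypothetical "CONTEXT src:dst" syntax. Paths that
// match no mapping are left out of the local context.
func localContext(files []string, mappings map[string]string) []string {
	var out []string
	for _, f := range files {
		for src, dst := range mappings {
			if strings.HasPrefix(f, src) {
				out = append(out, dst+strings.TrimPrefix(f, src))
			}
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	aggregate := []string{
		"/Dockerfile", "/script.sh",
		"/go/src/webserver/server.go",
		"/go/src/logger/server.go",
		"/final/bin/webserver",
	}
	// First stage: only the webserver source is visible, at the root.
	fmt.Println(localContext(aggregate, map[string]string{"/go/src/webserver/": "/"}))
	// Final stage: /final/bin is projected as /bin, script.sh as start.sh.
	fmt.Println(localContext(aggregate, map[string]string{"/final/bin": "/bin", "/script.sh": "/start.sh"}))
}
```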

LICENSE?

I just found this repo 👍

I understand this is still a POC, but it might be better to add the LICENSE file?

Add non-interactive build progress output

For non-tty callers, buildctl build should show the event stream and logs. The data should already be available; it just needs to be formatted properly.

Add proper UI view to buildctl du

Currently it only shows a dump of internal objects.

Set up Docker Hub

#139 (comment)

tonistiigi commented 8 minutes ago

@AkihiroSuda @abailly Should we push a version to the hub as well to simplify testing?

👍, but both https://hub.docker.com/r/moby/ and https://hub.docker.com/r/buildkit/ are taken by somebody else...

add DEFATTR for COPY/ADD

originally proposed here: moby/moby#28499 (comment)

after moby/moby#28499 gets merged, the next step would be to support file/dir permissions.

i'm proposing DEFATTR so that if you use multiple COPY or ADD you don't have to repeat the options:

DEFATTR chown=USER
DEFATTR chown=USER:GROUP
DEFATTR chown=UID
DEFATTR chown=UID:GID

chmod is a bit tricky because you definitely want different permissions for files and dirs:

DEFATTR chmod=a+rX
DEFATTR chmod=a+X

these would raise the x bit only if the x bit is already present for any of the users; from chmod(1):

execute/search only if the file is a directory or already has execute permission for some user (X)

or clearer directives for the different kinds:

DEFATTR dir_chmod=0755 file_chmod=0644
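
For reference, the capital-X rule quoted from chmod(1) can be written out in a few lines of Go. `aPlusRX` is a hypothetical helper that only models the symbolic mode a+rX; it says nothing about how DEFATTR itself would be parsed or applied.

```go
package main

import "fmt"

// aPlusRX applies the chmod(1) symbolic mode "a+rX" to permission bits:
// read is added for everyone, and execute/search is added for everyone
// only if the entry is a directory or already has some execute bit set.
func aPlusRX(perm uint32, isDir bool) uint32 {
	perm |= 0444
	if isDir || perm&0111 != 0 {
		perm |= 0111
	}
	return perm
}

func main() {
	fmt.Printf("%04o\n", aPlusRX(0600, false)) // regular file: 0644
	fmt.Printf("%04o\n", aPlusRX(0700, false)) // executable file: 0755
	fmt.Printf("%04o\n", aPlusRX(0700, true))  // directory: 0755
}
```

A single DEFATTR chmod=a+rX would therefore yield the same result as dir_chmod=0755 file_chmod=0644 for sources with owner-only permissions.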

Builder - Add zip support to ADD

Many Windows components are available as .zip files (for example .NET Core, PowerShell, etc.). Because ADD does not support unpacking those files, code execution in the container is required, creating additional unneeded layers.

Linux Example w/tar
FROM alpine
ADD foo.tar.gz /bar/

Result
foo is unpacked under the new bar directory

Windows Example w/tar
FROM microsoft/nanoserver
ADD foo.tar.gz /bar/

Result
foo is unpacked under the new bar directory

Windows Example w/zip
FROM microsoft/nanoserver
ADD foo.zip /bar/

Result
foo.zip is copied to the /bar directory i.e. bar/foo.zip :(
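
Since the builder is written in Go, unpacking could build on the standard library's archive/zip. The sketch below only reads entries into memory; `sampleZip` and `unzip` are hypothetical helpers, and a real ADD implementation would write under the destination directory and reject entry names that escape it.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"io"
)

// sampleZip builds a tiny in-memory archive so the example is self-contained.
func sampleZip() ([]byte, error) {
	var buf bytes.Buffer
	w := zip.NewWriter(&buf)
	f, err := w.Create("foo/hello.txt")
	if err != nil {
		return nil, err
	}
	if _, err := f.Write([]byte("hi")); err != nil {
		return nil, err
	}
	if err := w.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// unzip returns the archive's files keyed by their path inside the archive.
func unzip(data []byte) (map[string]string, error) {
	r, err := zip.NewReader(bytes.NewReader(data), int64(len(data)))
	if err != nil {
		return nil, err
	}
	files := make(map[string]string)
	for _, zf := range r.File {
		rc, err := zf.Open()
		if err != nil {
			return nil, err
		}
		content, err := io.ReadAll(rc)
		rc.Close()
		if err != nil {
			return nil, err
		}
		files[zf.Name] = string(content)
	}
	return files, nil
}

func main() {
	data, err := sampleZip()
	if err != nil {
		panic(err)
	}
	files, err := unzip(data)
	if err != nil {
		panic(err)
	}
	fmt.Println(files["foo/hello.txt"]) // hi
}
```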

Allow comments in && \ # ...

When writing long, multiline RUN commands, it is helpful to be able to document them with && \ # .... However, the Dockerfile parser mistakenly merges the lines together before stripping out line comments, making it difficult to document specific lines this way.
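
A rough Go sketch of the ordering this asks for: strip a `\ # comment` suffix from each line first, and only then merge continued lines. `joinContinuations` is a hypothetical helper, not the Dockerfile parser's code, and it deliberately ignores harder cases such as `#` inside quoted strings.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// trailingComment matches a line continuation followed by whitespace
// and a comment, i.e. `\ # ...` at the end of a line.
var trailingComment = regexp.MustCompile(`\\\s+#.*$`)

// joinContinuations removes `\ # comment` suffixes line by line and
// then merges each continued line with the line that follows it.
func joinContinuations(lines []string) []string {
	var out []string
	var cur strings.Builder
	for _, line := range lines {
		line = trailingComment.ReplaceAllString(line, `\`)
		if strings.HasSuffix(line, `\`) {
			cur.WriteString(strings.TrimSuffix(line, `\`))
			continue
		}
		cur.WriteString(line)
		out = append(out, cur.String())
		cur.Reset()
	}
	if cur.Len() > 0 {
		out = append(out, cur.String())
	}
	return out
}

func main() {
	lines := []string{
		`RUN apt-get update && \ # refresh the package index`,
		`    apt-get install -y curl`,
	}
	for _, l := range joinContinuations(lines) {
		fmt.Printf("%q\n", l)
	}
}
```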

Containerd worker does not support cancelling build

@dmcgowan

Wrong directory permissions in final image

Using Dockerfile

FROM node:6.11.2-alpine

RUN addgroup -S app && adduser -S -g app app

# Alternatively use ADD https:// (which will not be cached by Docker builder)
RUN apk --no-cache add curl \
    && echo "Pulling watchdog binary from Github." \
    && curl -sSL https://github.com/openfaas/faas/releases/download/0.6.6b/fwatchdog > /usr/bin/fwatchdog \
    && chmod +x /usr/bin/fwatchdog \
    && apk del curl --no-cache

WORKDIR /root/

# Turn down the verbosity to default level.
ENV NPM_CONFIG_LOGLEVEL warn

RUN mkdir -p /home/app

# Wrapper/boot-strapper
COPY package.json       /home/app

WORKDIR /home/app
RUN npm i

# Function
COPY index.js           /home/app

COPY function/*.json    /home/app/function/
WORKDIR /home/app/function
RUN npm i || :
WORKDIR /home/app/
COPY function           ./function
RUN chown app:app -R /home/app
#RUN chmod 777 /tmp

USER app

ENV cgi_headers="true"

ENV fprocess="node index.js"

HEALTHCHECK --interval=1s CMD [ -e /tmp/.lock ] || exit 1

CMD ["fwatchdog"]

The /tmp dir in the final image has 0755 permissions, while when the same image is built with docker the perms are 777. Locally in buildkit the permissions seem to be correct, but it looks like they are not properly recognized by the differ when creating layer tars.

via @alexellis

Export final snapshot chain to blobs

This should happen in the background while the solver is running.

how can i change the value of /net/core/* when build image

I want to change the values of net.core.somaxconn and net.core.netdev_max_backlog. Currently I change them by executing sysctl after setting --privileged=true on docker run, but this is fussy. Can anyone help me save these values in the docker image? I hope the values are right when the container starts.

Dockerfile frontend missing image metadata fields(ONBUILD etc)

For the fields that are not in oci-spec (onbuild, healthcheck, etc.) we should copy the definition from moby (or create a new spec pkg in moby).

RFC: control data structure for caching / source credential / worker spec

This is an idea to define control data ("LLC") that are "composed" to LLB on build time on the daemon-side.

message LLC {
	// key = LLB op digest string or an empty string (treated as the default entry)
	map<string, LLCEntry> entries = 1; 
}

message LLCEntry {
	bool invalidate_cache = 1;
	oneof opctl {
		// Field names "exec" and "source" are placeholders; the original omitted them.
		ExecOpCtl exec = 2;
		SourceOpCtl source = 3;
	}
}

message ExecOpCtl {
// worker spec? (e.g. "runc", "containerd")
}


message SourceOpCtl {
	string auth = 1; // `docker-credential-foobar` JSON string for docker-image source. Some JSON with ssh pubkey for git source.
}

Should we split "LLC" from LLB?

Pro: better security
Con: negative impact on reproducibility?

Proposal: mixed-platform (mixed-worker?) LLB

For LCOW
moby/moby#33854
moby/moby#33850

add bridge networking

The worker currently uses host networking. Move this to use a bridge. Example in https://gist.github.com/42a6ca6b8f21af1bead05095aa97681c

buildd can reuse docker0 if it exists, or one can be passed in with a flag.

Add content-based instruction cache

I'll tackle this next. Writing down some thoughts.

One of the problems with docker's instruction cache is that it only defines a cache between two steps. You have to solve the definition to a certain point to see if there is a next step that may be possibly cached.

Buildkit should attempt to find all the cache keys as soon as possible. You don't have to solve the whole graph or all branches to find out that a vertex data has been cached. For example, when two branches are merged together you don't have to have the data for the original branches to verify that you have the cache for the merged part as long as you can verify that the sources and the graph definition have not been updated.

Another difference is that vertexes should have multiple cache keys. For example, COPY should be cached by both definition and source content. In docker build, COPY is only fixed to content while other commands only use the meta definition. This is because in docker build there is no unique cache key for the root of the context source. Also, cache keys by content should never need to be recalculated, even with --no-cache options.

Some definitions for the cache keys:
Image source: ChainID
Git source: commit-sha
Local file source: session-id
Exec: meta+cachekey of inputs, possibly meta+cachekey of input contents
Copy: meta+cachekey of inputs, meta + cachekey of input contents

A complication is that keys based on contents can't be found until the input has been solved. In the case of sources, the cache key can usually be found without fully downloading the source data. The source interface would need to be updated to add an extra method for that.
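To make the two kinds of keys concrete, here is a minimal sketch in Go. It is not buildkit's solver code: `Vertex`, `DefinitionKey`, `ContentKey` and `digestOf` are hypothetical names, and the source key is a placeholder string standing in for a ChainID, commit SHA or session ID. The definition key can be computed from the graph alone, while the content key needs checksums of the solved inputs.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// digestOf hashes its parts with length prefixes so that part
// boundaries cannot be confused.
func digestOf(parts ...string) string {
	h := sha256.New()
	for _, p := range parts {
		fmt.Fprintf(h, "%d:%s;", len(p), p)
	}
	return hex.EncodeToString(h.Sum(nil))
}

// Vertex is a hypothetical, simplified graph node.
type Vertex struct {
	Meta      string    // serialized op definition
	Inputs    []*Vertex // empty for sources
	SourceKey string    // for sources: ChainID, commit SHA, or session ID
}

// DefinitionKey is derived from the op definition and the definition
// keys of its inputs, so it is known before anything is solved.
func DefinitionKey(v *Vertex) string {
	if len(v.Inputs) == 0 {
		return digestOf("source", v.Meta, v.SourceKey)
	}
	parts := []string{"def", v.Meta}
	for _, in := range v.Inputs {
		parts = append(parts, DefinitionKey(in))
	}
	return digestOf(parts...)
}

// ContentKey is derived from the op definition and checksums of the
// input contents, so it is only known once the inputs are solved.
func ContentKey(v *Vertex, inputChecksums []string) string {
	parts := append([]string{"content", v.Meta}, inputChecksums...)
	return digestOf(parts...)
}

func main() {
	base := &Vertex{Meta: "docker-image://docker.io/library/alpine:latest", SourceKey: "chainid-placeholder"}
	run := &Vertex{Meta: "exec: apk add --no-cache g++", Inputs: []*Vertex{base}}
	fmt.Println("definition key:", DefinitionKey(run))
	fmt.Println("content key:   ", ContentKey(run, []string{"checksum-of-rootfs"}))
}
```

With this split, two sources that resolve to different keys but identical contents still share the content key, which is the case the definition-only cache misses.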

@AkihiroSuda

terminology: Op->Vertex, and decouple LLB from the solver pkg

@tonistiigi Can we consider renaming s/Op/Vertex/g for readability, if it doesn't affect the design?
(Is there a non-op vertex?)

containerd worker hanging

containerd/containerd@a8426ed
537b5e4

$ ./buildkit0 | sudo ./bin/buildctl --debug build
INFO[0000] tracing logs to /tmp/buildctl610568466
DEBU[0000] serving grpc connection
[+] Building 32.1s (2/19)
 => CACHED docker-image://docker.io/library/alpine:latest                                                                                                0.0s
 => => resolve docker.io/library/alpine:latest                                                                                                           2.1s
 => CACHED docker-image://docker.io/library/golang:1.8-alpine                                                                                            0.0s
 => => resolve docker.io/library/golang:1.8-alpine                                                                                                       2.1s
 => apk add --no-cache g++ linux-headers                                                                                                                32.1s

daemon log:

fetch http://dl-cdn.alpinelinux.org/alpine/v3.5/main/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.5/community/x86_64/APKINDEX.tar.gz
(1/16) Installing libgcc (6.2.1-r1)
(2/16) Installing libstdc++ (6.2.1-r1)
(3/16) Installing binutils-libs (2.27-r1)
(4/16) Installing binutils (2.27-r1)
(5/16) Installing gmp (6.1.1-r0)
(6/16) Installing isl (0.17.1-r0)
(7/16) Installing libgomp (6.2.1-r1)
(8/16) Installing libatomic (6.2.1-r1)
(9/16) Installing pkgconf (1.0.2-r0)
(10/16) Installing mpfr3 (3.1.5-r0)
(11/16) Installing mpc1 (1.0.3-r0)
(12/16) Installing gcc (6.2.1-r1)
(13/16) Installing musl-dev (1.1.15-r7)
(14/16) Installing libc-dev (0.7-r1)
(15/16) Installing g++ (6.2.1-r1)
(16/16) Installing linux-headers (4.4.6-r1)
Executing busybox-1.25.1-r0.trigger
OK: 164 MiB in 28 packages

apk itself seems to have completed successfully

design: what checksum to use for snapshot content

docker build uses a hash that contains some selection of tar headers together with the file data for detecting data changes. For directories, the hashes of subfiles are used.

In some cases this is not correct. For example, the hash would change if a filename changes, even though this has no effect on the destination where it would be copied.

A bigger problem is that it does not support hardlinks. As the hash is over a tar header, the second link would only hash the parent path and be completely wrong. An example of this is in https://gist.github.com/tonistiigi/775cb15d3918958020bdd0165f776005

One other solution would be to use https://github.com/containerd/continuity . A problem with that is that it is not a tree hash, so getting a hash of a directory means recursively walking it. mtree has similar problems.

We could make our own hash version if we pick the right headers and find a way to normalize hardlinks. As these checksums are local to the builder, we could also use a faster crypto-hash than sha256.

At first, we should probably go with the current hash, as everything needed for that is mostly already implemented in https://github.com/moby/moby/blob/master/builder/remotecontext/tarsum.go . But if we want to fix the issues above, now would be the time to do it.
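For discussion, a rough sketch of what such a tree hash could look like, in Go. It is an assumption-laden toy, not tarsum or continuity: `Node` and `TreeHash` are hypothetical, the tree is in memory, only the mode is taken from the headers, and sha256 is used only because it is in the standard library. A node's own name is not part of its hash (only the parent directory records names), and a hardlink is hashed by its content, so a second link hashes the same as a copy.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
	"sort"
)

// Node is a hypothetical in-memory file or directory.
type Node struct {
	Mode     uint32
	Data     []byte           // file content; hardlinks share the same content
	Children map[string]*Node // non-nil for directories
}

// TreeHash hashes a file by mode and content, and a directory by mode
// plus the sorted (name, child hash) pairs. The node's own name is not
// included, so renaming the root of a copy does not change its hash.
func TreeHash(n *Node) [32]byte {
	h := sha256.New()
	var hdr [4]byte
	binary.BigEndian.PutUint32(hdr[:], n.Mode)
	h.Write(hdr[:])
	if n.Children == nil {
		h.Write([]byte{'f'})
		h.Write(n.Data)
	} else {
		h.Write([]byte{'d'})
		names := make([]string, 0, len(n.Children))
		for name := range n.Children {
			names = append(names, name)
		}
		sort.Strings(names)
		for _, name := range names {
			sum := TreeHash(n.Children[name])
			fmt.Fprintf(h, "%d:%s", len(name), name)
			h.Write(sum[:])
		}
	}
	var out [32]byte
	copy(out[:], h.Sum(nil))
	return out
}

func main() {
	file := &Node{Mode: 0644, Data: []byte("hello")}
	root := &Node{Mode: 0755, Children: map[string]*Node{"a": file, "b": file}}
	fmt.Printf("%x\n", TreeHash(root))
}
```

Because each directory hash is built from its children's hashes, a real implementation could cache per-directory results instead of walking the whole tree again.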

@dmcgowan

Port over interactive builder session

Port over moby/moby#32677 and create a local file source implementation based on the incremental send implementation.

This should possibly remain the only way for accessing local user files. Meaning there will not be any way to send a tarball or access files directly from the daemon's filesystem.

Dockerfile frontend missing build-arg support

build args in FROM do not work with content trust

Using the new ability to have build args work with the FROM line, I discovered that this breaks builds with content trust, as something tries to parse the FROM line and fails:

whale:arg justin$ cat Dockerfile 
ARG BASE=alpine:3.6
FROM $BASE
RUN cat /etc/alpine-release
RUN uname -a

whale:arg justin$ docker build --no-cache --build-arg BASE=alpine:3.5 .
Sending build context to Docker daemon  2.048kB
Step 1/4 : ARG BASE=alpine:3.6
 ---> 
Step 2/4 : FROM $BASE
 ---> 074d602a59d7
Step 3/4 : RUN cat /etc/alpine-release
 ---> Running in cd34a02789c8
3.5.2
 ---> 919100690231
Removing intermediate container cd34a02789c8
Step 4/4 : RUN uname -a
 ---> Running in 32e20ce11a09
Linux 3009dd4da609 4.9.36-moby moby/moby#1 SMP Wed Jul 12 17:33:58 UTC 2017 x86_64 Linux
 ---> cd5da4524cea
Removing intermediate container 32e20ce11a09
Successfully built cd5da4524cea

whale:arg justin$ DOCKER_CONTENT_TRUST=1 docker build --no-cache --build-arg BASE=alpine:3.5 .
Sending build context to Docker daemon 

error during connect: Post http://%2Fvar%2Frun%2Fdocker.sock/v1.30/build?buildargs=%7B%22BASE%22%3A%22alpine%3A3.5%22%7D&cachefrom=%5B%5D&cgroupparent=&cpuperiod=0&cpuquota=0&cpusetcpus=&cpusetmems=&cpushares=0&dockerfile=Dockerfile&labels=%7B%7D&memory=0&memswap=0&networkmode=default&nocache=1&rm=1&shmsize=0&target=&ulimits=null: invalid reference format: repository name must be lowercase

Client:
 Version:      17.06.0-ce
 API version:  1.30
 Go version:   go1.8.3
 Git commit:   02c1d87
 Built:        Fri Jun 23 21:31:53 2017
 OS/Arch:      darwin/amd64

Server:
 Version:      17.06.0-ce
 API version:  1.30 (minimum version 1.12)
 Go version:   go1.8.3
 Git commit:   02c1d87
 Built:        Fri Jun 23 21:51:55 2017
 OS/Arch:      linux/amd64
 Experimental: true

Containers: 108
 Running: 0
 Paused: 0
 Stopped: 108
Images: 637
Server Version: 17.06.0-ce
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins: 
 Volume: local
 Network: bridge host ipvlan macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: cfb82a876ecc11b5ca0977d1733adbe58599088a
runc version: 2d41c047c83e09a6d61d464906feb2a2f3c52aa4
init version: 949e6fa
Security Options:
 seccomp
  Profile: default
Kernel Version: 4.9.36-moby
Operating System: Alpine Linux v3.5
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 1.952GiB
Name: moby
ID: 62QR:HC6P:CL3A:X2OU:VT2R:D4OH:K7PC:W2TG:X6JP:OFZE:PJKA:F2D3
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): true
 File Descriptors: 18
 Goroutines: 30
 System Time: 2017-07-20T07:29:32.514762657Z
 EventsListeners: 1
No Proxy: *.local, 169.254/16
Registry: https://index.docker.io/v1/
Experimental: true
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

design: relationship with content store

Containerd separates the concepts of a content store that holds the image data in compressed form for distribution and snapshots that hold image data that can be used by containers. Docker, for example, doesn't duplicate this.

The question is whether buildkit should also introduce this concept or hide it behind the containerd snapshot implementation. Currently, it is doing the latter. A buildkit snapshot internally has a reference to the contentstore blob, and buildctl du shows the sum of their sizes.

This makes it possible to define a snapshot implementation that wouldn't need to duplicate data if the implementation is smart enough to be 100% stable.

Things start to get more complicated when preparing for importing and exporting cache. These features should work independently from the snapshotter implementation or exporter type, and it is likely that implementing them would need something very similar to the contentstore. This is also how workers should share data in a distributed workflow.

Thoughts?

design: loading frontends from images

Currently, frontends (like Dockerfile support) need to be built into the buildd binary. This is a temporary solution - users should be able to maintain their own frontends and load them from images instead of requiring them to be merged in the main repo. Later, the same concepts should be applied to exporters as well.

The minimal set of functionality a frontend currently needs is:

executing generated LLB
resolving an image config (shared auth and performance)
reading content from a snapshot (from LLB build result)
returning a result from LLB execution as build result

I'd suggest launching the frontend image with the current execution worker and using stdin/stdout streams for communication. This makes sure no special functionality (like mounting a socket) is needed from the worker implementation that may be harder to implement, for example for Windows, VMs, remote workers, etc. gRPC could be used for the protocol as it is already heavily used by other components.

A tricky part is accessing the file data. Currently, only read-only access is needed, but this may change in the future.

There are three main options:

Expose an API to the frontend for file operations. Initially a single function like ReadFile(ref, path) ([]byte, error) would probably be enough, but it is uncertain whether this would be future proof. For example, it could be useful for a frontend to do a ReadDir/Stat to find files.
Add a shared mount to the frontend container. A frontend could mount the references and then access the data directly. The problems with this approach are that it may be harder to do cleanup, as these mounts will leak to the host (at least to the buildkit mount namespace). Also, we may find out that this is very hard to implement on other platforms that don't use linux containers. Another issue is that if a frontend is misbehaving it is hard for buildkit to function properly (for example, the frontend doesn't unmount properly and buildkit can't remove data). A benefit of this solution is that it is very powerful: a frontend could reimplement all LLB functionality if it needs to.
The third option is to use reexec and relaunch the frontend process, this time exposing a mount to the process and a communication channel between the previous and new process. This would likely put more burden on the frontend author, but it may be possible to work around that with good helper libraries. It may also be a bit slower than the other options as we need to launch more containers.

I'm currently leaning toward the third option, maybe using file api option in the first iteration.
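As a starting point for the file API option, something like the following Go sketch could work. Only the `ReadFile(ref, path) ([]byte, error)` shape comes from the text above; the `FileAPI` interface name, the `ReadDir`/`Stat` signatures, and the in-memory `snapshotFiles` backing (using `testing/fstest.MapFS` in place of real snapshots) are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"io/fs"
	"testing/fstest"
)

// FileAPI is a hypothetical read-only API exposed to a frontend.
type FileAPI interface {
	ReadFile(ref, path string) ([]byte, error)
	ReadDir(ref, path string) ([]string, error)
	Stat(ref, path string) (fs.FileInfo, error)
}

// snapshotFiles maps reference IDs to read-only filesystems. A real
// implementation would resolve refs to mounted snapshots instead.
type snapshotFiles struct {
	refs map[string]fs.FS
}

func (s *snapshotFiles) lookup(ref string) (fs.FS, error) {
	fsys, ok := s.refs[ref]
	if !ok {
		return nil, fmt.Errorf("unknown reference %q", ref)
	}
	return fsys, nil
}

func (s *snapshotFiles) ReadFile(ref, path string) ([]byte, error) {
	fsys, err := s.lookup(ref)
	if err != nil {
		return nil, err
	}
	return fs.ReadFile(fsys, path)
}

func (s *snapshotFiles) ReadDir(ref, path string) ([]string, error) {
	fsys, err := s.lookup(ref)
	if err != nil {
		return nil, err
	}
	entries, err := fs.ReadDir(fsys, path)
	if err != nil {
		return nil, err
	}
	names := make([]string, 0, len(entries))
	for _, e := range entries {
		names = append(names, e.Name())
	}
	return names, nil
}

func (s *snapshotFiles) Stat(ref, path string) (fs.FileInfo, error) {
	fsys, err := s.lookup(ref)
	if err != nil {
		return nil, err
	}
	return fs.Stat(fsys, path)
}

// newDemoAPI returns a FileAPI with one made-up reference.
func newDemoAPI() FileAPI {
	return &snapshotFiles{refs: map[string]fs.FS{
		"result-0": fstest.MapFS{
			"Dockerfile":  &fstest.MapFile{Data: []byte("FROM alpine\n")},
			"src/main.go": &fstest.MapFile{Data: []byte("package main\n")},
		},
	}}
}

func main() {
	api := newDemoAPI()
	data, err := api.ReadFile("result-0", "Dockerfile")
	fmt.Printf("%q %v\n", data, err)
}
```

Keeping the API keyed by reference means the frontend never sees a mount, which is what makes this option portable to workers that are not linux containers.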

@AkihiroSuda @vdemeester @tiborvass @dmcgowan

Dockerfile: unexposed variable declarations

Docker has ARG and ENV. The issue I have with these is that they are exposed to users. Sometimes you just want an internal variable that can be passed around to multiple commands but is not exposed to the user. As with ARG, I don't see this needing to be baked into the image. The suggested keyword would be VAR.

Improve docker build with messages when firing rules

Like pylint for Python, we could improve docker build by integrating hadolint from Lukas Martinelli
https://github.com/lukasmartinelli/hadolint
and his online checker
http://hadolint.lukasmartinelli.ch/

docker build would still build an image, but the author could see the mistakes and immediately correct the Dockerfile

progressui flickers on some terminals (e.g. iTerm2)

It hurts my eyes 🙄

@tonistiigi
WDYT about using https://github.com/gizak/termui (backend: https://github.com/nsf/termbox-go) or something else instead of ANSI?

Only commit snapshots that are actually needed by next steps

The solver is not smart enough to detect this atm and commits/caches everything.

containerd buildd should refuse to start w/o snapshotter loaded

I just tried buildkit for the first time and it didn't work in the end. I suspect the issue is that I am using xfs without d_type support and therefore overlayfs snapshotter cannot be loaded.

The error message:

buildd: failed to solve: rpc error: code = Unknown desc = failed to stat snapshot: snapshotter not loaded: overlayfs: invalid argument

Since the standalone buildd refuses to start in such conditions, maybe containerd buildd should fail as well.

Logs:

containerd:

INFO[0000] loading plugin "io.containerd.snapshotter.v1.overlayfs"...  module=containerd type=io.containerd.snapshotter.v1
WARN[0000] failed to load plugin io.containerd.snapshotter.v1.overlayfs  error="/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs does not support d_type. If the backing filesystem is xfs, please reformat with ftype=1 to enable d_type support." module=containerd
WARN[0000] could not use snapshotter overlayfs in metadata plugin  error="/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs does not support d_type. If the backing filesystem is xfs, please reformat with ftype=1 to enable d_type support." module="containerd/io.containerd.metadata.v1.bolt"

...

ERRO[0340] (*Service).Write failed                       error="rpc error: code = Canceled desc = context canceled" expected=sha256:14d77d279c36d2e0b651b83cd95a8639ed1cf40c63778342c99920166c7d303b ref="layer-sha256:14d77d279c36d2e0b651b83cd95a8639ed1cf40c63778342c99920166c7d303b" total=75629087
ERRO[0340] (*Service).Write failed                       error="rpc error: code = Canceled desc = context canceled" expected=sha256:019300c8a437a2d60248f27c206795930626dfe7ddc0323d734143bd5eb131a6 ref="layer-sha256:019300c8a437a2d60248f27c206795930626dfe7ddc0323d734143bd5eb131a6" total=1970271

buildd:

ERRO[0294] /moby.buildkit.v1.Control/Solve returned error: invalid argument
github.com/moby/buildkit/vendor/github.com/containerd/containerd/errdefs.init
        /home/tt/dev/go/src/github.com/moby/buildkit/vendor/github.com/containerd/containerd/errdefs/errors.go:24
github.com/moby/buildkit/vendor/github.com/containerd/containerd/content.init
        <autogenerated>:1
github.com/moby/buildkit/client/llb.init
        <autogenerated>:1
github.com/moby/buildkit/client.init
        <autogenerated>:1
github.com/moby/buildkit/cache.init
        <autogenerated>:1
github.com/moby/buildkit/control.init
        <autogenerated>:1
main.init
        <autogenerated>:1
runtime.main
        /usr/lib/golang/src/runtime/proc.go:173
runtime.goexit
        /usr/lib/golang/src/runtime/asm_amd64.s:2337
snapshotter not loaded: overlayfs
github.com/moby/buildkit/vendor/github.com/containerd/containerd/errdefs.FromGRPC
        /home/tt/dev/go/src/github.com/moby/buildkit/vendor/github.com/containerd/containerd/errdefs/grpc.go:82
github.com/moby/buildkit/vendor/github.com/containerd/containerd/services/snapshot.(*remoteSnapshotter).Stat
        /home/tt/dev/go/src/github.com/moby/buildkit/vendor/github.com/containerd/containerd/services/snapshot/client.go:36
github.com/moby/buildkit/snapshot/blobmapping.(*Snapshotter).Stat
        <autogenerated>:1
github.com/moby/buildkit/vendor/github.com/containerd/containerd/rootfs.ApplyLayer
        /home/tt/dev/go/src/github.com/moby/buildkit/vendor/github.com/containerd/containerd/rootfs/apply.go:61
github.com/moby/buildkit/vendor/github.com/containerd/containerd/rootfs.ApplyLayers
        /home/tt/dev/go/src/github.com/moby/buildkit/vendor/github.com/containerd/containerd/rootfs/apply.go:41
github.com/moby/buildkit/source/containerimage.(*imageSource).unpack
        /home/tt/dev/go/src/github.com/moby/buildkit/source/containerimage/pull.go:197
github.com/moby/buildkit/source/containerimage.(*puller).Snapshot
        /home/tt/dev/go/src/github.com/moby/buildkit/source/containerimage/pull.go:182
github.com/moby/buildkit/solver.(*sourceOp).Run
        /home/tt/dev/go/src/github.com/moby/buildkit/solver/source.go:93
github.com/moby/buildkit/solver.(*vertexSolver).run
        /home/tt/dev/go/src/github.com/moby/buildkit/solver/solver.go:632
github.com/moby/buildkit/solver.(*vertexSolver).(github.com/moby/buildkit/solver.run)-fm
        /home/tt/dev/go/src/github.com/moby/buildkit/solver/solver.go:452
github.com/moby/buildkit/util/bgfunc.(*F).run.func1
        /home/tt/dev/go/src/github.com/moby/buildkit/util/bgfunc/bgfunc.go:66
runtime.goexit
        /usr/lib/golang/src/runtime/asm_amd64.s:2337
failed to stat snapshot
github.com/moby/buildkit/vendor/github.com/containerd/containerd/rootfs.ApplyLayer
        /home/tt/dev/go/src/github.com/moby/buildkit/vendor/github.com/containerd/containerd/rootfs/apply.go:66
github.com/moby/buildkit/vendor/github.com/containerd/containerd/rootfs.ApplyLayers
        /home/tt/dev/go/src/github.com/moby/buildkit/vendor/github.com/containerd/containerd/rootfs/apply.go:41
github.com/moby/buildkit/source/containerimage.(*imageSource).unpack
        /home/tt/dev/go/src/github.com/moby/buildkit/source/containerimage/pull.go:197
github.com/moby/buildkit/source/containerimage.(*puller).Snapshot
        /home/tt/dev/go/src/github.com/moby/buildkit/source/containerimage/pull.go:182
github.com/moby/buildkit/solver.(*sourceOp).Run
        /home/tt/dev/go/src/github.com/moby/buildkit/solver/source.go:93
github.com/moby/buildkit/solver.(*vertexSolver).run
        /home/tt/dev/go/src/github.com/moby/buildkit/solver/solver.go:632
github.com/moby/buildkit/solver.(*vertexSolver).(github.com/moby/buildkit/solver.run)-fm
        /home/tt/dev/go/src/github.com/moby/buildkit/solver/solver.go:452
github.com/moby/buildkit/util/bgfunc.(*F).run.func1
        /home/tt/dev/go/src/github.com/moby/buildkit/util/bgfunc/bgfunc.go:66
runtime.goexit
        /usr/lib/golang/src/runtime/asm_amd64.s:2337

documentation for containerizing buildkit (WAS: Running buildd from Mac OS)

I would like to run buildd on a Mac OS host. I suspect that if it's not possible to run it natively, it should be possible to run it inside a docker container and access it through TCP/gRPC. If that requires some work to be done, I would be willing to contribute if someone can point me in the right direction.

Add prune command to cache manager

There is no way to manually clean up the cache. Could be implemented before gc. moby/moby#32677 contains similar code

containerd: use lease instead of gc.root label

Discussed in #155

solver: top vertex not properly cached on multiple outputs

If the top vertex has multiple outputs it is not properly cached. The cache would apply from the parent of that vertex instead. This happens because getRefs can atm only return all outputs and unused outputs from the previous run were correctly deleted. For a fix, the selection of the single build result output needs to be combined with the individual snapshot cache query inside getRefs().

Parallel build jobs invocations should share graph vertexes

If two graphs contain the same sections, the builder should be smart enough to use the digest ID to only execute once and notify both jobs waiting for the result. This is more efficient than regular caching because regular caching needs to be rechecked after every step.

There is a util/flightcontrol utility that could be useful for this. At least for the operations that only return a single snapshot.
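To illustrate the idea, here is a minimal sketch in Go of deduplicating concurrent work by digest. It is not the util/flightcontrol API: `Group`, `Do`, `Waiting` and `runShared` are hypothetical names, and the result is a plain string standing in for a snapshot reference.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
)

// call tracks one in-flight execution and the callers waiting on it.
type call struct {
	done    chan struct{}
	val     string
	err     error
	waiters int
}

// Group runs fn once per key while a call for that key is in flight.
type Group struct {
	mu    sync.Mutex
	calls map[string]*call
}

// Do executes fn for key, or waits for an in-flight execution of the
// same key and returns its result.
func (g *Group) Do(key string, fn func() (string, error)) (string, error) {
	g.mu.Lock()
	if g.calls == nil {
		g.calls = map[string]*call{}
	}
	if c, ok := g.calls[key]; ok {
		c.waiters++
		g.mu.Unlock()
		<-c.done
		return c.val, c.err
	}
	c := &call{done: make(chan struct{})}
	g.calls[key] = c
	g.mu.Unlock()

	c.val, c.err = fn()
	close(c.done)

	g.mu.Lock()
	delete(g.calls, key)
	g.mu.Unlock()
	return c.val, c.err
}

// Waiting reports how many callers are blocked on key.
func (g *Group) Waiting(key string) int {
	g.mu.Lock()
	defer g.mu.Unlock()
	if c, ok := g.calls[key]; ok {
		return c.waiters
	}
	return 0
}

// runShared starts n jobs that need the same vertex and returns how
// many times the vertex executed plus each job's result.
func runShared(n int) (int32, []string) {
	var g Group
	var execs int32
	release := make(chan struct{})
	results := make([]string, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			results[i], _ = g.Do("sha256:abc", func() (string, error) {
				atomic.AddInt32(&execs, 1)
				<-release // hold the vertex open until all jobs have joined
				return "snapshot-1", nil
			})
		}(i)
	}
	for g.Waiting("sha256:abc") < n-1 {
		runtime.Gosched()
	}
	close(release)
	wg.Wait()
	return atomic.LoadInt32(&execs), results
}

func main() {
	execs, results := runShared(4)
	fmt.Println(execs, results)
}
```

Unlike a cache lookup, the waiting jobs here never re-check anything: they are handed the result of the one execution when it finishes.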

Proposal: document Dockerfile building security promises

Currently it seems very hard to find any documentation on what kind of security is promised by Dockerfile execution, especially if building Dockerfiles from untrusted sources.

As far as I can gather, the security is as follows:

context directory contents are available, but there is no access to files outside it
full access to network is available as if running an untrusted docker image
escalating access to host is prevented with the same strength as normal untrusted docker images
resource constraints are given in docker build command line

This means that if these limitations are acceptable, it is possible to allow building Dockerfiles from untrusted sources without creating a separate VM or similar to contain the build.

I assume something like this is already done by Docker Hub when builds are submitted to it, as it probably does not use a separate VM for each build, so the privilege separation probably has been tested quite rigorously.

Is my assessment correct? Should this be explicitly documented somewhere?

Dockerfile integration

moby/moby#33492 is making the Dockerfile parser much more reusable. After it has been merged we should try to integrate that package as the first frontend to buildkit.

@simonferquel @dnephin

performance issues with large graphs

I tested the solver with some large graphs (200+ vertexes and a lot of sharing) and a considerable amount of CPU was used in the buildd process itself.

Determined the following areas that need improving:

inputs resolution is not cached (only reference resolution per node is)
shared sections of the same job don't dedupe progress logs, especially bad for pull logs of commonly used images
content based instruction cache tries too many possibilities for finding the best match. Should be separated into Get() and Probe() so branches can be invalidated early.
content based instruction cache should be able to determine invalid combinations. If there is no match for a single input there is no need to attempt matching the others. Moreover, the content hash logic should move from the operation to the solver so that input data does not need to be in the same node to check for a match. This is important for the distributed cases.
on long chains, every vertex captures the whole progress references snapshot. Don't have a good solution for that yet.
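The Get()/Probe() split and the early invalidation could look roughly like the Go sketch below. Only the names Get and Probe come from the list above; the `Cache` type, its index layout, and the `Add` helper are assumptions. Probe only consults an index of input checksums and gives up as soon as one input has no match, while Get is called only for surviving candidates. A real version would also need to compare the number of inputs, which this toy ignores.

```go
package main

import (
	"fmt"
	"sort"
)

// Cache is a hypothetical content-based instruction cache.
type Cache struct {
	byInput map[string]map[string]bool // "index:checksum" -> record IDs
	records map[string]string          // record ID -> snapshot reference
}

func NewCache() *Cache {
	return &Cache{byInput: map[string]map[string]bool{}, records: map[string]string{}}
}

func inputKey(i int, sum string) string { return fmt.Sprintf("%d:%s", i, sum) }

// Add stores a record and indexes it under each positional input checksum.
func (c *Cache) Add(id, snapshot string, inputs []string) {
	c.records[id] = snapshot
	for i, sum := range inputs {
		k := inputKey(i, sum)
		if c.byInput[k] == nil {
			c.byInput[k] = map[string]bool{}
		}
		c.byInput[k][id] = true
	}
}

// Probe returns the record IDs that match every input checksum. It
// stops at the first input that leaves no candidates, so the other
// inputs are never examined for an invalid combination.
func (c *Cache) Probe(inputs []string) []string {
	var candidates map[string]bool
	for i, sum := range inputs {
		next := map[string]bool{}
		for id := range c.byInput[inputKey(i, sum)] {
			if candidates == nil || candidates[id] {
				next[id] = true
			}
		}
		if len(next) == 0 {
			return nil
		}
		candidates = next
	}
	out := make([]string, 0, len(candidates))
	for id := range candidates {
		out = append(out, id)
	}
	sort.Strings(out)
	return out
}

// Get loads the snapshot reference for a record found by Probe.
func (c *Cache) Get(id string) (string, bool) {
	s, ok := c.records[id]
	return s, ok
}

func main() {
	c := NewCache()
	c.Add("r1", "snap-1", []string{"a", "b"})
	c.Add("r2", "snap-2", []string{"a", "c"})
	fmt.Println(c.Probe([]string{"a", "b"}), c.Probe([]string{"x", "b"}))
}
```

Since Probe works on checksums alone, it does not need the input data to be on the same node, which fits the distributed case mentioned above.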
