buildpacks / spec Goto Github PK

View Code? Open in Web Editor NEW

250.0 18.0 69.0 1.31 MB

Specification for Cloud Native Buildpacks

Home Page: https://buildpacks.io

License: Apache License 2.0

Shell 100.00%

cncf

spec's Introduction

Cloud Native Buildpacks Specification v3

This specification defines interactions between a platform, a lifecycle, a number of buildpacks, and an application

For the purpose of transforming that application into an OCI image and
For the purpose of developing or executing automated tests on that application.

A buildpack is software that partially or completely transforms application source code into runnable artifacts.

A lifecycle is software that orchestrates buildpacks and transforms the resulting artifacts into an OCI image.

A platform is software that orchestrates a lifecycle to make buildpack functionality available to end-users such as application developers.

Notational Conventions

Key Words

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" are to be interpreted as described in RFC 2119.

The key words "unspecified", "undefined", and "implementation-defined" are to be interpreted as described in the rationale for the C99 standard.

An implementation is not compliant if it fails to satisfy one or more of the MUST, MUST NOT, REQUIRED, SHALL, or SHALL NOT requirements for the protocols it implements. An implementation is compliant if it satisfies all the MUST, MUST NOT, REQUIRED, SHALL, and SHALL NOT requirements for the protocols it implements.

Operating System Conventions

When a word or bullet point is prefixed with a †, it SHALL be assumed to apply only to Linux stacks.

When a word or bullet point is prefixed with a ‡, it SHALL be assumed to apply only to Windows stacks.

When the specification denotes a "shell", Linux stacks MUST use the Bourne Again Shell (bash) version 3 or greater and Windows stacks MUST use Command Prompt (cmd.exe) unless otherwise specified.

Interpreting Paths for Windows

When the specification denotes a filesystem path using POSIX path notation (e.g. /cnb/lifecycle), this notation SHALL be interpreted to represent a path where all POSIX file path separators are replaced with the Windows filepath separator (\) and absolute paths are assumed to be rooted in the default drive (e.g. C:\cnb\lifecycle).

When the specification refers to an executable file with POSIX path notation (e.g. /cnb/buildpacks/bp-a/1.2.3/bin/detect), this notation SHALL be interpreted to represent one of two possible files: one with the suffix .exe (e.g. C:\cnb\buildpacks\bp-a\1.2.3\bin\detect.exe) or with the suffix .bat (e.g. C:\cnb\buildpacks\bp-a\1.2.3\bin\detect.bat).

When the specification refers to a path in the context of an OCI layer tar (e.g. /cnb/buildpacks/bp-a/1.2.3/), this path SHALL be interpreted to be prefixed with Files (e.g. Files/cnb/buildpacks/bp-a/1.2.3/). Note: path separators in OCI layer tar headers MUST be / regardless of operating system.

Sections

API Versions

These documents currently specify:

Buildpack API: 0.10
Distribution API: 0.3
Platform API: 0.14

spec's People

Contributors

Stargazers

Watchers

spec's Issues

Clarification on Buildpack Name

This is most a reminder for me to include this in a future v3 discussion

We should clarify if the buildpack.name in buildpack.toml is meant for humans or machines. Today, the buildpack registry uses it as a human readable name (like Java or Ruby).

[RFC 0065] Add an extension spec for builders

RFC 0065
buildpacks/rfcs#116

specify to use semver with buildpack versions in buildpack.toml

The specs don't specify the versioning conventions that should be used with buildpacks, even though https://github.com/buildpacks/libbuildpack/blob/b2e4a9ef07aff26c430dfac2fbc928ee4164af35/buildpack/info.go#L27 expects a semver.

We could also include a definition of what types of changes would update the major/minor/patches (or this could go in the docs).

Consider adding `launch` and `build` booleans as top-level keys in the build plan

This convention seems to be used used universally.

Specify process specific env vars

Buildpack API should be updated with results of buildpacks/rfcs#83

RFC text

Consider setting build-time environment variables during detection

Example use case: BUNDLE_GEMFILE is set to a Gemfile with a different name.

Application Name Exposed at Runtime

In many cases running applications need to identify themselves when connecting to other systems (e.g. APM's registering with servers). Currently there is no standardized way for applications to do this with the Cloud Native Buildpack specification.

I propose that a CNB_APPLICATION_NAME environment variable be exposed, at runtime only, with a logical name that identifies this application. This MUST always be populated, but platforms are free to populate it with whatever makes the most sense for their users.

Publicize Goals, Milestones, and Progress of Spec

In order to help the community have visibility into the project, it's important to publicize the goals, milestones (bonus for a non-binding schedule), and progress of the spec as it moves towards official release.

Buildpack ID in TOML and Registry

This is most a reminder for me to include this in a future v3 discussion

The buildpack.id in buildpack.toml as defined in the spec is somewhat not inline with the potential ID requirements of the Buildpack Registry.

Windows Executables

The spec should codify the names of the windows executables that make up the entry point to a buildpack (i.e. detect, develop, and build). The goal should be to have a single buildpack that can handle Windows as just another stack rather than requiring a Linux variant and a Windows variant. Luckily Windows will help us here since it will require an extension on any executables. So I believe that we can specify a detect.[exe|cmd|bat], develop.[exe|cmd|bat] and build.[exe|cmd|bat].

[RFC 0051] Override Env Vars by Default

RFC 0051
RFC PR

When env file has no suffix, override.

[RFC 0052] Opt-in Layer Caching

RFC 0052
RFC PR

Require buildpacks to explicitly indicate intent to reuse layers as describe in the RFC.

Remove `version` key from build plan

Builpdack API should be updated as described in buildpacks/rfcs#82

RFC text

Monorepos and subdirs

It's common to want to run a buildpack against a sub-directory within an app repo. This is particularly necessary when the repo is a monorepo composed of multiple applications. For example, it might have the structure

my-monorepo
├── backend
│   └── pom.xml
├── frontend
│   └── package.json
└── worker
    └── requirements.txt

We should consider supporting this use case, or explicitly rule it out.

api key in buildpack.toml can configure lifecycle's buildpack API

Document that api in buildpack.toml will configure the lifecycle's buildpack API.

Result of buildpacks/rfcs#79
RFC text

When api is set in buildpack.toml the lifecycle must conform to the given buildpack API spec or fail.

Persistent build metadata

The buildpack cache is ephemeral, and a build should work even if the cache is not provided (or deleted to run a fresh build).

However, there are certain values that need to be persisted across builds, even if the cache is cleared. One example is the Rails secret_key_base.

Specify report.toml

As per buildpacks/rfcs#70 the export phase should produce a report.toml file.

RFC text

CNB_PLATFORM_API should configure lifecycle platform API

Platform API should be updated to describe new CNB_PLATFORM_API behavior.

Changes from buildpacks/rfcs#79.
RFC text

We will no longer assume a given lifecycle implements exactly one platform API version. Going forward CNB_PLATFORM_API can be used to configure the lifecycle's platform API rather than being a strictly pass/fail check.

Theses changes are being incorporated as part of #87

Add CNB_BUILDPACK_DIR environment variable

The lifecycle should set CNB_BUILDPACK_DIR in the buildpack execution environment

Result of buildpacks/rfcs#71
RFC text

Reproducible image builds

The current lifecycle implementation enables buildpacks to perform reproducible image builds by:

Clearing the FS timestamps of all the copied app files before the build starts
Clearing the tar archive timestamps in each exported layer

Reproducible image builds is a key feature of CNB that other OCI image building solutions (like Dockerfiles) lack. We should add this to the spec!

Note: I suggest we set all times to one second after the epoch (like jib does) to avoid unusual behavior when timestamps are zero.

Document Platform API 0.3

This should entail, but is not limited to:

Execution of lifecycle binaries
- ... including flags
- ... including environment variables
File schemas

Compatibility version in buildpack.toml

This is most a reminder for me to include this in a future v3 discussion

Should the buildpack.toml have a key to indicate compatibility with a specific API version?

Similarly (but different) should it have a key for schema version, which would allow us to change the buildpack.toml schema in the future without breaking anything.

Format of a distributable buildpack

We should maybe define the format of a buildpack distributable in the platform.md (i.e. is it a tarball, zip, either/or).

Align Buildpack ID between buildpack spec, platform spec, and lifecycle

This issue is a placeholder for some work I'm doing to align the Buildpack ID format in the buildpack spec with the Buildpacks Directory Layout in the platform spec and with the lifecycle, which today does not correctly honor / chars in the Buildpack ID.

The gist of the problem is that Buildpack ID's can contain / characters, which do not work well with the "Each top-level directory is a buildpack ID" in the platform spec.

I'm working toward one of the following solutions:

The lifecycle includes a command(s) to setup and potentially download artifacts in the /buildpacks directory (there are some implications here around coupling the builder image to the lifecycle it contains).
Add a path element to the order.toml (basically no one, including me, wants this).
Specify the substitution pattern for / in the spec.

Instance Index Exposed at Runtime

I propose that a CNB_INSTANCE_INDEX environment variable be exposed, at runtime only, with a logical index that identifies this instance. This MUST always be populated, but platforms are free to populate it with whatever makes the most sense for their users. Note that as a logical index, it is expected that over time, the index is re-used to represent different processes instances.

[Request] Provide mechanism for buildpacks to provide more information about failures

Currently the spec provides little support around what is likely the most interesting
case for both the buildpack developer and the application developer, when things go wrong:

https://github.com/buildpack/spec/blob/master/buildpack.md#build

At Heroku, we consistently see a small percentage of builds fail. There are a host of reasons a build might fail:

The developer hits a bug in a dependent library
The application developer has introduced a bug
The buildpack itself has a bug
They have misconfigured the build somehow
They exceed the memory provided by the build machine
Network partitions
Package manager is down
Typos in manifest files
etc

As an example, the number one reason that Node builds fail on Heroku is the app's package.json being incorrectly formatted JSON. At times this can account for 10%+ of Node failures. Most of the other common failures fall into only a handful of categories.

The software most likely to understand what went wrong is the buildpack itself, but it is limited in how it can communicate that information to other systems.

The ability to emit data on failure would allow a buildpack to surface information to the rest of the system which could provide helpful guidance to the user for common errors, or keep track of the relative rate of specific error types. If there is a sudden spike in UNKNOWN errors, this can alert the buildpack authors who can address the issue.

Multicall Launcher

Update platform API to match new launcher flow describe in Launcher Arguments RF (RFC-0045)

RFC PR: buildpacks/rfcs#84
RFC Text

Application Layers and Metadata

Currently, an application's files exist on a single layer regardless of what transformations are made to it. In the future, we'd like to be able to slice and dice that application layer while maintaining transparent overlay semantics. One example use-case is to separate often changed configuration files from rarely changed application code; when pushing deltas to edge-nodes, transferring the smaller layer containing only configuration can lead to performance benefits.

A proposal for how to accomplish this is to add an optional <layers>/application.toml file. The contents of this file would contain both generic metadata as well a collection of file paths for each layer:

[metadata]
build-time = "5 December 2018, 13:55 EST"
rails-secret = "76456AE8"

layers = [
  [
    "/app/config/application.rb",
    "/app/config/environments/*"
  ],
  [
    "/app/app/assets/**"
  ]
]

The file contains a metadata map that is straight key-value storage. In addition multiple layers can be defined containing glob-compatible patterns of files to include in layers. Any files in the <app> directory that are not explicitly covered will be included on a default layer to ensure that they are not lost. This behavior, including all unlisted files in a default layer, allows the <layers>/application.toml file to be optional and ignored by buildpacks that do not care about slicing and dicing the application.

Add required env vars to stack build image spec

The follow env vars need to be set on the stack build image.

PACK_BUILDPACKS_DIR
PACK_LAUNCH_DIR
PACK_ORDER_PATH
PACK_GROUP_PATH
PACK_PLAN_PATH
PACK_USER_ID
PACK_GROUP_ID

We should call this out in the spec

Consider using stdout in bin/build instead of launch.toml

As an alternative to writing a launch.toml, the bin/build script could be considered a proper program in the Unix sense: it writes status info (such as all build progress info) to stderr, and its program output, which more or less is the launch info, to stdout.

It's trivial to simply send everything that's written by programs or echo to stderr at the beginning of a script:

exec 1>&2

That way, buildpack authors must not pay special attention to e.g. programs they're calling misbehaving by printing informational output to stdout instead of stderr.

Same then for bin/develop, which would no longer have to write develop.toml.

/cc @jkutner

Run image ENTRYPOINT should be specified in the stack's run image, not by the exporter

We should remove this: https://github.com/buildpack/spec/blob/master/buildpack.md#phase-4-export
And add a section to the platform spec describing how the stack sets ENTRYPOINT.

(We should change this back if we ever make the export phase responsible for copying the launch-responsible lifecycle component to the final image.)

Move windows related buildpack API changes to buildpack/0.4

Windows related changes to the buildpack.md, currently on the platform/0.4 branch should be removed from platform/04 and added to the buildpack/0.4 branch.

Application descriptor

Buildpack consumers often need to customize the commands used to run their images, run the same image with multiple different commands, or define the buildpacks they want to run on their app. It's also common to want to keep this information under source control with the application code (which helps when forking an app).

To solve this, we may need to introduce an application descriptor file to the v3 spec. In v2, this exists in the form of the manifest.yml, Procfile, .buildpacks, and app.json files.

The possible elements included in a v3 application descriptor file might be:

List or groups of buildpacks (used by lifecycle to override default groups)
Process types, and commands (used to override launch.toml)
Environment variables used during the build (for example MAVEN_OPTS, which is honored by mvn).
Arbitrary key-value pairs for use by a buildpack

The application descriptor might be named something like:

app.toml
cnb.toml
manifest.toml
launch.toml (we would need to reconcile this with the existin launch.toml)
cnb.xml

Alternatives

Instead of making the application descriptor a part of the spec (and thus the lifecycle), it could be something the pack CLI uses/parses and passes to the lifecycle as values.
- This has the drawback of requiring that all lifecycle clients/consumers be aware of this file format if we want consistency.
We could leave the application descriptor up to the platform(s). For example, Heroku could support a heroku.yml file that contains a list of buildpacks or env vars that are interpreted by Heroku.
- But this means that an app with a heroku.yml would not work with pack or another platform.
Do nothing.
- The image can be run with the --entrypoint option.
- It is possible to customize image commands with a Dockerfile that has only FROM and ENTRYPOINT lines.
- Also, many platforms, including GCP, have platform specific mechanisms for overriding the entrypoint of an image.
- However, the above alternatives don't solve the need for a serialized list of buildpacks.

Lifecycle should be responsible for handling .profile.d on Windows in a Windows-idiomatic way

The spec should define a policy for handing the contents of .profile.d (or an equivalent that is Windows-specific) in a Windows-idiomatic way. Note this does not mean that it should do some conversion from a linux script to a Windows script, but rather, given a buildpack writing a Windows environment script, the lifecycle should handle it properly.

Change PACK_* names to CNB_*

The naming of specified variables as PACK_* is an artifact of the long ago naming of the project. Now that we're widely referred to as Cloud Native Buildpacks, we should update the naming to use CNB_* as it more close relates to the community's understanding.

Add MAINTAINERS/OWNERS

Can you list the maintainers/owners of the project?

[Request] Provide mechanism for buildpack instrumentation

Potentially related to #45

As a buildpack author, I often want a bird's eye view of what is happening during builds in production. Some of the questions include:

How long are they taking?
Which ones are slow and what do they have in common?
What versions of this binary are being installed?
How much memory are builds using?
Which part of the build is taking the most time?
What are the relative failure rates between different versions of this package manager?

In our system at Heroku, we inject a $BUILDPACK_LOG_FILE ENV var for official buildpacks that the buildpack can log out to that gets read back in by the build system. This works well, but consideration for this use-case as a first-class citizen in the spec would be an excellent addition.

Add buildpack homepage to io.buildpacks.build.metadata label

Lifecycle feature request: buildpacks/lifecycle#349

App image metadata should contain the homepage of each participating buildpack, so that users can easily lookup information about the buildpacks used to generate any given image.

Versioned spec docs

For a better presentation and traceability, now that we have the concept of Buildpack API and Platform API as part of buildpacks/rfcs#19 could we start versioning documentation based on that same scheme in this repository?

For example:

/buildpack-api-0.2.md
/buildpack-api-next.md
/distribution-api-next.md (not currently versioned)
/platform-api-0.2.md
/platform-api-next.md

This creates a hard separation between versions across multiple components of the overall spec. It additionally allow us to backfill documentation for versioned documentation where we see it necessary while also being able to plan for api changes that we want to lump together in something like platform-api-next.md.

Notes

We could keep old api versions as references but could eventually prune them once we deem it necessary.

Related issues

Prior Art

Shell process args

Update Buildpack API to allow for intuitive use of arguments by direct=false processes, as described in RFC-0045

RFC PR: buildpacks/rfcs#84
RFC Text

Buildpack dependencies

This is a proposal for a mechanism that would allow one buildpack to require the execution of another buildpack.

Overview

A Buildpack dependency mechanism would introduce an optional dependencies key to the buildpack.toml with the following structure:

[[dependencies]]
id = "<buildpack ID (required)>"
version = "<buildpack version (optional default=latest)>"
optional = "<bool (optional default=false)>"
uri = "<url or path to the buildpack (optional default=urn:buildpack:<id>)"

The lifecycle will read the dependencies key and construct an order.toml entry with those buildpacks included with the primary buildpack. The dependencies of the dependent buildpacks will also be resolved into the order.toml.

id (string, required): the ID (as defined in the spec of another buildpack
version (string, optional, default=latest): the version (as defined in the spec of the dependent buildpack. The default is to use the latest available version of the buildpack (resolution of this value may be platform-dependent).
optional (bool, optional, default=false): Defines whether this buildpack will be optional in order.toml.
uri (string, optional, default=urn:buildpack:<id>): The exact location of the dependent buildpack. If not specified the platform will resolve the urn:buildpack:<id> (making the resolution platform dependent).

Use Case: Maven & JVM

The Maven buildpack has a dependency on the JVM buildpack. In v2, the Heroku Maven buildpack manually downloads and runs the JVM buildpack before doing anything else. It installs the JVM, and sets up the PATH and other environment vars.

In v3, we want to keep this behavior. I thought that a libjvmbuildpack would do the trick, but that won't allow us to leverage the buildpack spec's layer behavior with dirs like <layer>/env and <layer>/bin.

The advantages of the Maven buildpack having a dependency on the JVM buildpack are the following:

They remain discrete buildpacks. The JVM buildpack can be used on it's own, and the Maven buildpack can be used with a different JVM buildpacks (for using an alternate vendor) or without a JVM buildpack (leveraging the stack image if possible).
The JVM buildpack can be updated and release independently of the Maven buildpack. For example, adding support for new JVM versions or behaviors (like changing default JAVA_OPTS) can be done on it's own.

Example

The Maven buildpack has the following in it's buildpack.toml:

[[dependencies]]
id = "heroku/jvm"

The lifecycle will download the heroku/jvm buildpack from the platform's default buildpack-registry. Then it will construct the following order.toml:

[[groups]]
labels = ["maven-with-dependencies"]
buildpacks = [
  { id = "heroku/jvm", version = "v42" },
  { id = "heroku/maven", version = "v123" },
]

Use Case: Ruby & Node.js

The Ruby buildpack requires much of the same logic that lives in a Node.js buildpack (installing Node, NPM, Yarn, etc and determining which commands to run). Rather than duplicate this logic, a dependency would allow for the buildpack itself to be the unit of execution.

The advantages of using a dependency mechanism over other options include:

Consumers of the heroku/ruby buildpack do not need to be aware of the heroku/nodejs buildpack and it's relationship to the Ruby buildpack.
The Node.js buildpack can be updated and release independently of the Ruby buildpack. For example, new support for Node.js versions or behaviors can be implemented and released on it's own.

Example

The Ruby buildpack will have the following in it's buildpack.toml:

[[dependencies]]
id = "heroku/nodejs"
version = "v1"
optional = true

The lifecycle will download the heroku/nodejs buildpack from the platform's default buildpack-registry. Then it will construct the following order.toml:

[[groups]]
labels = ["ruby-with-dependencies"]
buildpacks = [
  { id = "heroku/nodejs", version = "v1", optional = true},
  { id = "heroku/ruby", version = "v321" },
]

Open Questions

What is the contract between the lifecycle and the platform when resolving buildpack URIs in a platform specific way?

Proposal 1: The lifecycle will need to include a mechanism for installing a buildpack (even without this question, because a fixed URL needs to be resolved). This mechanism could resolve urn:buildpack: based on some pattern or base URL defined as an env var like BUILDPACK_REGISTRY_URL.
Proposal 2: use environment variables to point to an executable that installs a buildpack from a given URI. The lifecycle then executes NewBuildpackMap as normal.

What happens if the lifecycle can't find one of the dependencies (i.e. the platform cannot resolve the buildpack URI)?

Proposal: The build fails

Allow layers to be created as a non runtime user - Security

For security hardening, I would like buildpacks to be able to create layers as a user other than the runtime user (PACK_USER_ID) so that at runtime the directory is readable but cannot be written to.

CNB_SERVICES is not well documented

There's an RFC which mentions the environment variable CNB_SERVICES and how the contents of the variable are not well defined.

https://github.com/buildpacks/rfcs/blob/2c6ac8da87d3f6daddbbb859f258925047f15179/text/0012-service-binding.md#L17

But also, I haven't seen any references to this environment variable in the spec either.

I see that the PR to spec was approved and then closed, with no reference to any replacement PR.
#57

So I'm unsure of the state of this variable

Add cache layers to layer types in buildpack spec

I agree with feedback from @ryanmoran re: https://github.com/buildpacks/spec/blob/main/buildpack.md#layer-types
Originally posted in the buildpacks slack #spec channel:

It reads like cache is purely a modifier of either build or launch and that it doesn’t really have a use on its own. I think you can intuit that this might not be the case, but I don’t feel like it says that explicitly.

I think we should add Cache Layers to the list of layer types along with a description of the behavior of cache-only layers.

The build plan should lose entries as they are provided by each /bin/build

Note: This proposal has been edited. Comments below it may be outdated. See the edit history for previous versions.

If a buildpack’s /bin/build provides a dependency specified in the build plan, then the buildpack should be able to remove that build plan entry so that it is not provided to the next buildpack’s /bin/build. Otherwise, the same dependency may unintentionally be provided multiple times.

Advantages

This makes it possible for buildpacks to provide basic/generic versions of dependencies that more specialized buildpacks could provide if added to the group.
This makes the provider key unnecessary in almost all cases.
This makes it easy for operators to override buildpack dependencies by placing custom buildpacks in front of the group. (This is a current CF buildpack feature called "manifest overrides.")
The interface changes below make the /bin/build and /bin/detect interfaces much more consistent.
The interface changes below allow for a new type of post-build metadata collection that could be used for security scanning/auditing.
The interface changes below make build-time environment variables available during detection.
New functionality (removing entries from the build plan) is entirely optional

Interface Changes

Modify the input and output interface for detection:

Executable: /bin/detect <platform[A]>, Working Dir: <app[AR]>

Input	Description
`/dev/stdin`	Merged plan from previous detections (TOML)
`<platform>/env/`	User-provided environment variables for build
`<platform>/plan/`	Contributions to the build plan
`<platform>/#`	Platform-specific extensions

Output	Description
[exit status]	Pass (0), fail (100), or error (1-99, 101+)
`/dev/stdout`	Logs (info)
`/dev/stderr`	Logs (warnings, errors)

To contribute plan entries, buildpacks write entries to <platform>/plan/<name>. Entries do not need to have a top-level <name> key.

Modify the input interface for build:

Executable: /bin/build <platform[A]> <cache[EC]> <launch[EI]>, Working Dir: <app[AI]>

Input	Description
`/dev/stdin`	Build plan from detection (TOML)
`<platform>/env/`	User-provided environment variables for build
`<platform>/plan/`	Contributions to the build plan
`<platform>/#`	Platform-specific extensions

To remove plan entries, buildpacks write empty files to <platform>/plan/<name>. Entries do not need to have a top-level <name> key. This file could be metadata about the exact dependency provided, e.g., checksums not available during build.

PDF of specs

OCI releases a PDF of their spec when they do releases. Can we do something similair? https://github.com/opencontainers/image-spec/releases/download/v1.0.1/oci-image-spec.-v1.0.1.pdf

Platform API version is not defined in the spec

The RFCs 10 and 11 describe a concept of Platform API version but we couldn't find any details within the spec. Particularly around compatibility. This can be found for Buildpack API version here but not Platform API version. Are we to assume that the same rules of compatibility apply?

[RFC 0076] Require `CNB_{USER,GROUP}_SID` instead of `CNB_{USER,GROUP}_ID` on windows stack images

RFC 0076
buildpacks/rfcs#133

Replace CNB_USER_ID and CNB_GROUP_ID with CNB_USER_SID and CNB_GROUP_SID on windows stack images
Replace -uid and -gid flags with -usid and -gsid flags for builds in windows environments

Motivating context: buildpacks/lifecycle#343

02/17/2021 - Updated to reflect the result of the RFC above

Idea for major (breaking) change: merge cache and launch directories

Note: This proposal has been edited. Comments below it may be outdated. See the edit history for previous versions.

After reviewing the various initial buildpack implementations, I've observed these patterns, anti-patterns, and limitations:

Layers with similar contents tend to be created in both the cache and launch directories.
Buildpacks tend to build abstractions that persist these layers to either or both of the directories depending on whether they are needed by buildpack code or by app code.
Buildpacks use conventions in the build plan to indicate whether a layer should be provided to other buildpacks via the cache or made available at launch-time (see #22).
When a layer is present in both cache and launch directories, buildpack code is often complex because of the different possible starting states of the cache, launch, and app directories (metadata vs. no-metadata, cached vs. not cached, vendored vs. not vendored).
In some cases, ensuring that entire layers from the last build are restored may be desirable (see #8).
Occasionally, buildpacks need to cache dependencies that shouldn't be accessible to other buildpacks. It seems reasonable to decouple these concepts.
Cached layers often have the same metadata as corresponding launch layers.
It is currently impossible for a /bin/build script to make a symlink from the app directory to a layer that remains unbroken during both build and launch. Separately, relative symlinks from layers to the app directory are easy to break when copying layer directories between cache and launch dirs.

Proposal

Instead of providing separate launch and cache directories to each buildpack, we could provide a single <layers> directory with the following extra, top-level fields in each <layer>.toml:

launch = false # layer is available at launch time (default: false)
build = false # layer is available to subsequent buildpacks (default: false)
cache = false # restore layer contents on the next build (default: false)
persist-cache = false # guarantee that layer contents are recovered (default: false, ignored if !cache)

[metadata] # all user-provided metadata now nested here

Rules:

Metadata written to <layer>.toml is always restored.
If launch && cache && !persist-cache, but the local layer contents do not match the remote layer contents, then the cached local layer is deleted before the build and the remote metadata is provided. This ensures that recovered local layers are never out of sync with remote layers.
If !launch && cache && persist-cache, the build fails unless/until we decide to support a persistent cache that isn't used in the remote image.
If a layer changes from launch to !launch, then the remote layer is deleted.
A platform may choose to cache layers locally when cache && persist-cache as long as the cached layers are only restored when they are identical to the remote launch layers.
To guarantee consistent behavior between builds, !build layers should always be moved such that they are inaccessible to subsequent buildpacks.

The combined <layers> directory would continue to look like this:

my.buildpack.id/my-layer/              # directory contents
my.buildpack.id/my-layer.toml          # metadata for my-layer
...

The new interface would be:
Executable: /bin/build <platform[AR]> <layers[EI]>, Working Dir: <app[AI]>

Input	Description
`/dev/stdin`	Build plan from detection (TOML)
`<platform>/env/`	User-provided environment variables for build
`<platform>/#`	Platform-specific extensions

Output	Description
[exit status]	Success (0) or failure (1+)
`/dev/stdout`	Logs (info)
`/dev/stderr`	Logs (warnings, errors)
`<layers>/launch.toml`	Launch metadata (see File: launch.toml)
`<layers>/<layer>.toml`	Layer content metadata
`<layers>/<layer>/bin/`	Binaries for subsequent buildpacks and/or launch
`<layers>/<layer>/lib/`	Shared libraries for subsequent buildpacks and/or launch
`<layers>/<layer>/profile.d/`	Scripts sourced by bash before launch
`<layers>/<layer>/include/`	C/C++ headers for subsequent buildpacks
`<layers>/<layer>/pkgconfig/`	Search path for pkg-config for subsequent buildpacks
`<layers>/<layer>/env/`	Env vars for launch/build, set before env.build or env.run
`<layers>/<layer>/env.build/`	Env vars for set for subsequent buildpacks
`<layers>/<layer>/env.run/`	Env vars for set before profile.d scripts are sourced
`<layers>/<layer>/*`	Other content for subsequent buildpacks and/or launch

READ THIS: New Behavior Introduced:

Clearing the cache when the local layer contents don't match the remote layer contents means that "stale" launch layers are never re-used. For example, this means that a Node.js build that jumps back and forth between two VMs would sometimes need to rebuild the node modules from scratch, even if they are cached on both VMs. This means that buildpacks need to perform objectively fewer metadata comparisons for the same effect (not just less copying). However, it is less efficient. I'm okay with this behavior change, because the previous behavior is easy to replicate with two layers if necessary (and it requires the exact same logic). In addition, this new behavior is safer and more similar to the current v2a/b globally-persistent buildpack cache behavior. Another way to understand this change is "locally-recovered launch layers must always match the previous build."
persist-cache requires downloading layers from the registry before the build, but provides a guaranteed, globally-persistent cache.
The <layers>/<layer>/env/ directory is split into <layers>/<layer>/env/build/ and <layers>/<layer>/env/run/ to make the behavior of the directory more clear and to provide a safer, more convenient way to set runtime environment variables.

Advantages:

Buildpack code would be less complex
Paths to the same dependency would always be the same
Symlinks from the app dir to the layer would remain valid for build and launch
Relative symlinks from the layers to the app dir would remain valid for build and launch
Local disk usage would be reduced by as much as 50%
Possibly safer due to guarantee that data intended for launch always matches the previous build

Disadvantages:

Slightly more TOML writing needed (when providing dependencies to subsequent buildpacks)
More complicated for the lifecycle to implement
Less efficient when a "cached-launch" layer is used instead of a separate cache and launch layers because "stale" launch layers are never re-used
Less efficient if buildpacks use persist-cache due to registry downloading

Possible Extensions:

We could allow !launch && cache && persist-cache in the future using a dedicated image repository (or tag in the same repository). This would be entirely backwards-compatible when introduced.

Thoughts? If desired, we should make this change before ratifying v3.0.0 of the specification.

@nebhale @jkutner @hone @ekcasey @dgodd @jchesterpivotal @ameyer-pivotal

Remove top-level version from `bom` entries in `io.buildpacks.build.metadata`

Consistent with the removal of version from the build plan in #97, we should remove .bom[].version from https://github.com/buildpacks/spec/blob/platform-spec-rewrite/platform.md#iobuildpacksbuildmetadata-json in platform API 0.4.

Blocked on #87

buildpacks / spec Goto Github PK

spec's Introduction

Cloud Native Buildpacks Specification v3

Notational Conventions

Key Words

Operating System Conventions

Interpreting Paths for Windows

Sections

API Versions

spec's People

Contributors

Stargazers

Watchers

Forkers

spec's Issues

Alternatives

Notes

Related issues

Prior Art

Overview

Use Case: Maven & JVM

Example

Use Case: Ruby & Node.js

Example

Open Questions

Interface Changes

Proposal

Recommend Projects

Recommend Topics

Recommend Org