Goal
We (especially @jameswhite and I) need an efficient, (dependency-)lightweight, single-purpose tool to facilitate "branch deploys" and "branch diffs" as we continue to develop "on-system" software.
This differs from the various ways and tools to deploy (micro-)services and web apps to cloud platforms, kubernetes, etc., for which there are many CI/CD tools and processes.
"On-system?"
By "on-system" software I mean that we are often managing the operating-system level files on a host machine. For instance, files in /etc, configurations of things like LDAP servers, Nagios monitoring, TFTP/PXE configurations, etc.
"Host?"
By "host" we mean a computer, which could be bare metal, a virtual machine (for instance in a VMware instantiation), a container, or a remote cloud-hosted instance. We expect the host to have a running OS and kernel, most often Linux, but its configuration may be minimal.
"source host?", "target host?"
In the timeline below we use these terms to refer to how this deployment tool operates. The "source host" is either a CI instance or some other host which is initiating a deployment, which triggers a branch deployment and/or branch diff onto a "target host". The "target host" is the host where our changes are being deployed (or proposed).
"bd?"
The working name for the command-line tool which is invoked to do branch deployments is bd. This could change depending on whims.
"branch deploys?", "branch diffs?"
[Old man voice:] Back when we worked at GitHub, we deployed software (web applications, supporting tooling, host configurations, eventually network configurations, etc.) using a specific process:
- Develop changes via Pull Request on a new git branch
- As changes are pushed to GitHub, CI runs to validate the changes
- When changes need testing, "branch deploy" them (deploy that git branch of code either to a "branch lab" staging environment or, eventually, to the production host(s) in question)
- If this deployment proves workable (via various testing and telemetry means) then the Pull Request is merged, and the deployment becomes the new production "mainline" deployment
This is often colloquially referred to as the "GitHub Flow", especially if these operations are undertaken via ChatOps.
Over time, for changes which were less "web application software" and more likely to be systems level software (OS configurations, authorization configurations, network changes), the "branch deploy" phase was often preceded by:
- Report on the proposed changes, on real hosts, for this branch. This would take the form of a diff, hence "branch diff".
This process, taken together with the technique of separating the deployment and setup of a systems level tool (LDAP, Puppet, Nagios, Vault, or systems such as Entitlements) from the data contents of that tool -- each of which can have its own branch deployment lifecycle -- is sometimes colloquially called the "KP Flow", after Kevin Paulisse, who refined and spread this pattern during his time at GitHub.
Development constraints
- The source host needs to support the language environment (ruby in this case) for the bd command-line tool.
- The target host for a deployment needs a modern-ish shell (bash, zsh, dash, etc.) to run the deployment commands; git to clone a repository; and rsync to do tree diffing and deployment. It does not need to support the (ruby) language environment.
- We can use whatever tools and libraries to test and develop, but the software that actually runs on the source host to trigger deployments should use the ruby standard library and nothing else.
- Testing should be full integration testing if at all possible. That is, no unit testing; definitely no mocking/stubbing tests. We should be able to use containerization to fully sandbox real deployments during testing.
- It is not uncommon for configurations for multiple hosts to be managed from a single repository, so support specifying a path into the repository to act as a base path for the files deployed for a specific host.
- One of the prime targets for system management is the /etc path, and typically only specific subsets of files there are under management, while other files are not. This points to being able to specify inclusion paths for managed files, probably with a configuration file available in the repository.
- I did something like this before, over a decade ago, but experience, platforms, tools, taste, etc., have all evolved. One of the same constraints applies: deployments should be absurdly fast. This probably means that we build a single shell command to handle the entire deployment and send it in one shot.
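To make the last few constraints concrete, here is a minimal sketch (not the actual bd implementation) of how a single shell command might be assembled on the source host using only the ruby standard library. Everything named here is an assumption for illustration: the helper names `rsync_filters` and `build_remote_script`, their parameters, the example repository URL and paths, the use of `mktemp -d` on the target, and the convention that a trailing "/" marks a managed directory. The inclusion paths are passed in as a plain array; reading them from a configuration file in the repository is left out.

```ruby
require "shellwords"

# Hypothetical helper: turn managed paths (relative to the base path) into
# rsync filter arguments. A trailing "/" marks a directory whose whole
# subtree is managed (an assumed convention, not from the original text).
def rsync_filters(includes)
  rules = includes.flat_map do |path|
    parts = path.split("/")
    # Parent directories must be included, or rsync never descends into them.
    parents = (1...parts.size).map { |n| "/#{parts.first(n).join("/")}/" }
    leaf = path.end_with?("/") ? "/#{path}***" : "/#{path}"
    parents + [leaf]
  end
  # Anything not explicitly included is left alone on the target.
  rules.uniq.map { |rule| "--include=#{rule}" } + ["--exclude=*"]
end

# Hypothetical helper: build the one-shot script for the target host.
# With diff: true, rsync only reports what would change ("branch diff");
# otherwise it applies the files ("branch deploy").
def build_remote_script(repo_url:, branch:, base_path:, dest:, includes:, diff: false)
  rsync = ["rsync", "--archive"]
  rsync += ["--dry-run", "--itemize-changes"] if diff
  rsync += rsync_filters(includes)

  # "$work" is expanded by the target's shell, so it is quoted by hand;
  # every caller-supplied value goes through Shellwords.
  base = Shellwords.escape(base_path.chomp("/"))
  source = "\"$work\"/#{base}/"
  clone = "git clone --quiet --depth 1 --branch #{Shellwords.escape(branch)} " +
          "#{Shellwords.escape(repo_url)} \"$work\""

  [
    "set -eu",
    "work=$(mktemp -d)",
    "trap 'rm -rf \"$work\"' EXIT",
    clone,
    "#{Shellwords.join(rsync)} #{source} #{Shellwords.escape(dest)}",
  ].join(" && ")
end

# Example values only; none of these come from a real repository.
script = build_remote_script(
  repo_url: "https://example.com/ops/host-configs.git",
  branch: "my-feature-branch",
  base_path: "hosts/ldap01/etc",
  dest: "/etc/",
  includes: ["ssh/sshd_config", "nagios/"],
  diff: true
)
puts script
```

One way to send this in one shot, assuming SSH access to the target, would be `Open3.capture2e("ssh", target_host, script)`, where `target_host` is a placeholder: with the argument-array form no local shell is involved, and ssh hands the string to the remote shell as a single command. Using the same builder for both modes keeps the branch diff and the branch deploy identical apart from the two rsync reporting flags.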
Timeline