A template for a lightweight distributed CI/CD server stack.
Note
|
This code is no longer being updated on Github. It is hosted on the proto-replicant network. Message me or create an issue for more informaiton. |
This code sets up a CD/CI server stack which includes
-
Distributed Storage (WIP)
-
K3S w/ Docker CE (transition to podman is upcomming)
-
Pods!
-
Raft DAG Auth (WIP)
-
go-ipfs
-
MySQL
-
Lighttpd
-
Kallithea
-
Kanboard
-
A pipeline (WIP)
-
API Messaging Server (WIP)
-
The aim is to produce a distributed, lightweight but complete software development pipeline server that will go on anything from an RPi3B+ to a powerful VM or Bare Metal. With this, you could use a few RPi3B+ which you share with friends, and host your code repo cloud with a URL reachable via IPFS for those with whom you have shared the IPFS URI. Cluster capability increases linearly with the number of nodes, however, each node is realistically limited to about 5 people on an RPi 3B+.
Your reasons to use this are your own. As for me: I have a strong disdain and distrust for centralized systems, especially those supported by a profit motive. The owner of GitHub in particular tends to Embrace, Enhance, and Extinguish. In this case, they are using your code hosted here (license permitting) to train AI to write code. This behavior is, in general, a bad idea—honestly, it’s a bad idea at many levels.
TLDR; If you have experienced loss and live in "Western" society, you likely have unresolved psychological damage which will (imperceptibly to you) alter the behavior of the AI you are training, and either in the short term or generations from now, it will end with massive human casualties. This is true even if you are a kind-hearted person with goals of bringing peace to humanity. (Death is a kind of peace, the subconscious is a weird beast.)
This is particularly poignant with General Purpose Artificial Intelligence, but the fact remains that anything that learns behaviors (or any stimulus-response pattern) is biased by the teacher with innumerable nuanced influences from the environment. This means that if you are not a completely enlightened individual…one without pain tied to your past, nor fear of the future, and able to remain present without tuning out in the present, then you are NOT qualified to train an AI (esp. GPAI). Your biases (fear, hurts, expectations, desires) will color the data samples and your interpretation of results. Combine this with the imprecision of spoken (and most written) languages, and a formula for discord and potentially cataclysmic disaster has been applied. The effects of this are far-reaching, touching every person who knows of someone who knows of someone who researches (GP)AI. On "simpler" ML systems, the effects of these influences only become noticeable via emergence—that is when many such systems are operating in their disparate roles to accomplish a common goal. That said, the importance of the point I’m making here should not be downplayed. If AI work is your goal, please consider carefully and proceed with caution regardless of how innocuous or "good" you think your work is.
-
Ubuntu 21.10+ as the host Linux distro.
-
an additional drive to use for the various data types hosted by this device.
Note
|
Git can be adapted so that it emulates distributed behavior. That configuration is left up to you as I do not use Git unless I have a client requirement for it. |
Mercurial 6.x (Python; GPL 2+)
Kallithea (Python, GPL 3)
Requirements
-
Base server should have podman in its repositories and run on a Raspberry Pi and support v2 cgroups, fully.
-
Reliable high volume write capable storage of >= 256GiB, named "REPLICANT2'
-
sudo
group must be passwordless
Note
|
Storage will be carved as: Single GPT partition (if you skip this, you may be sorry later), carve with lvm2 |
-
2 GiB /vg_data/lv_templates ←- docker/helm templates
-
10 GiB /vg_data/lv_backup ←- config backup via borg
-
20% remaining /vg_data/local ←- local pipeline space, pod configs
-
80% remaining /vg_data/shared ←- data to be shared (public repos, etc)
Note
|
Run ansible-galaxy collection install community.general to meet the requirements to run the install playbook.
|
Ansible (Python; GPL 2.0) NOTE: Considering Saltstack as it may be more complete for this use case.
-
RPi 3B+ (or newer)
-
16GB microSD for the OS
-
USB attached drive of at least 64GiB for Docker persistence which includes the IPFS store which holds configs and the git repo.
If you have some Raspberry Pi 3B/3B+ lying around, use those. This is sufficient for up to 5 people working on a project unless there is a fair amount of C/C++/Go/Rust to compile.
Please be responsible and do not use this to manage Java. You should actively discourage the use of Java. Java eats brains and poisons the water supply1.
1 This is an unverified claim. Of course, if you are using Java you may not be able to verify the claim because it has been eating your brain.
No, seriously. Actively discourage the use of Java. It was designed for set-top boxes. Adapting appliance code to run in the enterprise is like adapting Crocs for business wear. There are several fundamental flaws in the JVM (that I am legally bound not to disclose) which make it inherently insecure such that a complete re-design and re-write is required to resolve. Such a measure will break backward compatibility and so will likely never happen.
-
Hosting a VCS via an IPFS cluster might present some problems with concurrency due to latency over the Internet; looking for resolutions. In theory Mercurial + IPFS’s Merkle DAG should resolve, but I am REALLY good at finding corner cases where things break horrifically.
-
Arch is a popular distro. Need to learn Ansible better so I can support Arch as well if it meets the baseline requirements.