justintime50 / harvey

12.0 5.0 3.0 13.38 MB

The lightweight Docker Compose deployment runner.

Home Page: https://github.com/Justintime50/harvey-ui

License: MIT License

Python 97.23% Dockerfile 0.11% Just 2.66%

harvey ci cd docker deployment compose lightweight runner

harvey's Introduction

Hey, I'm Justin Hammond 👋

Senior Software Engineer @EasyPost, IT Pro, Tech Enthusiast

I love all things tech. I've been programming for 18+ years, tinkering with electronics for 15+ years, and founding or building tech companies for 10+ years. I'm an open source fanatic, Apple fanboy, and love to explore new tech. I spend my time coding open source projects, tinkering with electronics and new tech products, and consulting teams on how to get things done.

Noteworthy Projects

The following are items that may not be represented on my GitHub profile but are noteworthy in the software space:

Crazy Color Clash: My iOS game is available on the App Store! Download Crazy Color Clash for free today. Visit https://crazycolorclash.com for more details.
Magic Mirror: I built a Magic Mirror which combines the best of software and hardware. Read all about it.
I've been designing 3D printable models lately and am slowly uploading my creations to Thingiverse. Check them out here.

harvey's People

Forkers

gurneesh peterkeen

harvey's Issues

Add Healthcheck After Deploys

Harvey never actually checks that your container is running; instead, it simply tries to run a container.

There should be functionality to allow Harvey to do a healthcheck on the container prior to marking a deploy/full pipeline successful. This will ensure it didn't start up and immediately exit.
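A minimal sketch of what such a check could look like, assuming Harvey shells out to the Docker CLI. The function names, the number of attempts, and the delay are hypothetical; `docker inspect --format '{{.State.Running}}'` is the only real Docker call used. Polling a few times is what catches a container that starts and then immediately exits.

```python
import subprocess
import time


def container_is_running(name):
    """Ask `docker inspect` whether the named container is currently running."""
    result = subprocess.run(
        ["docker", "inspect", "--format", "{{.State.Running}}", name],
        capture_output=True,
        text=True,
        timeout=10,
    )
    return result.returncode == 0 and result.stdout.strip() == "true"


def wait_until_healthy(name, attempts=5, delay=1.0, check=container_is_running):
    """Poll several times; fail as soon as the container is seen not running."""
    for _ in range(attempts):
        if not check(name):
            return False
        time.sleep(delay)
    return True
```

A deploy stage would only be marked successful when `wait_until_healthy(container_name)` returns True.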

Cache vs No Cache

Ensure we have the right mix of cache vs no-cache. It's imperative that builds are unique each time and pull in data properly, but for speed it's also important that we cache what we can without leaving residue from previous runs.

Volume Support on Container Create

Currently you cannot specify volumes when creating a container. Add that functionality.

Add Linting

Harvey needs to be linted. This should happen after bug fixes and stability improvements, though.

Add Pipeline IDs and allow their logs to be retrieved

Pipelines come and go currently without a way to reference them or tie them to the logs that were output. Store a pipeline ID->log object that can be retrieved.
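One possible shape for the ID->log object, as an in-memory sketch. The class name and its methods are hypothetical and not part of Harvey; a real version would need to persist the data somewhere.

```python
import uuid


class PipelineLogStore:
    """Maps a generated pipeline ID to the log lines that pipeline produced."""

    def __init__(self):
        self._logs = {}

    def start(self):
        """Create a new pipeline ID and an empty log for it."""
        pipeline_id = uuid.uuid4().hex
        self._logs[pipeline_id] = []
        return pipeline_id

    def append(self, pipeline_id, line):
        """Add one line of output to the pipeline's log."""
        self._logs[pipeline_id].append(line)

    def retrieve(self, pipeline_id):
        """Return the full log for a pipeline as a single string."""
        return "\n".join(self._logs[pipeline_id])
```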

Switch to Using Docker-Py Instead of Shelling out for Docker Commands

Summary

Initially, Harvey was built as a client library around the Docker API, and the pipelines were then built on top of that. A much more robust client library for Docker, docker-py, already exists, and we should use it instead.

Acceptance Criteria

Switch containers and images to use the Docker client library
Ensure that we still get timeouts on actions taken like we do right now with the subprocess module
Ensure the docker package is in the setup.py file

Related Issues

#23
#20
#19
#15
TODO item found in images.py: Use the Docker API for building instead of a shell command (PR #43)

Add Gunicorn as WSGI Server

We need a production ready WSGI server sitting in front of the Flask app to properly serve the API, let's use Gunicorn to do so: https://github.com/benoitc/gunicorn

Look Into Better Multiprocessing

Replace Thread with multiprocessing in app.py for better performance?

Speed Improvements to Stage Execution Times

The Build stage is INCREDIBLY fast - wahoo!
The Test stage needs help. Because we are building unique images every single time with unique tags and unique containers, nothing can get cached. This is great for security and for containing each person's tests to a unique container, but it is terrible for performance. Some scripts that only take 1-3 seconds to run are having their test stage take 15-20 seconds, as all the overhead is on Docker.
The Deploy stage is fine. Spinning up the container isn't the problem; the cost comes from tearing down the old one. Many containers will wait the default 10 seconds before stopping, pending a "graceful shutdown", but many of the configured projects don't have a graceful shutdown in place and therefore simply wait out the default 10 seconds. Killing containers immediately may not be a great idea because some projects will have a graceful shutdown. We need to find a happy medium where we can shut down containers ASAP. This may require a per-project config.
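A rough sketch of the per-project config idea for the Deploy stage, assuming containers are stopped with the Docker CLI. The "stop_timeout" key is a hypothetical config option, not something Harvey has today; `-t` is the real `docker stop` flag for the grace period in seconds.

```python
DEFAULT_STOP_TIMEOUT = 10  # Docker's default grace period, in seconds


def stop_command(container, project_config):
    """Build a `docker stop` command that honors a per-project grace period."""
    # "stop_timeout" is a hypothetical per-project setting.
    timeout = int(project_config.get("stop_timeout", DEFAULT_STOP_TIMEOUT))
    return ["docker", "stop", "-t", str(timeout), container]
```

Projects without a graceful shutdown could set the value to 0, while projects that need one keep the default or raise it.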

Add Authentication

Anyone can hit an endpoint and, with the right data, royally mess up Docker on the host machine. Protect Harvey with per-user authentication. Ensure that only the user who created a project in Harvey can touch it in any way, and ensure that those with no authentication cannot use Harvey at all.

Pipeline timers do not account for the "startup" time

Pipeline timers start when the pipeline itself starts, but that only happens after Harvey boots up and clones/pulls the project, so some time goes unaccounted for. Pass the start time from the very beginning to the pipelines to get an accurate reading.
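A small sketch of passing the start time down, under the assumption that a single entry point handles the webhook. The `prepare` and `run_pipeline` callables are hypothetical stand-ins for the clone/pull step and the pipeline itself.

```python
import time


def elapsed_since(start_time, clock=time.monotonic):
    """Return the seconds elapsed since start_time."""
    return clock() - start_time


def handle_webhook(webhook, prepare, run_pipeline, clock=time.monotonic):
    """Start the timer before clone/pull so setup time is counted."""
    start_time = clock()
    prepare(webhook)  # hypothetical clone/pull step
    return run_pipeline(webhook, start_time)  # pipeline reports against start_time
```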

Add logging to Harvey

Summary

Harvey should log errors and stack traces for troubleshooting. Pair this with #7

Simply adopt the Logging package from Python to do the trick.

https://realpython.com/the-most-diabolical-python-antipattern/

Acceptance Criteria

Add logging throughout the app experience allowing us insight to what's happening with Harvey (keep this separate from request logging) - use standard Python logging
Add logging on the endpoints (requests) so we know what's coming in (keep this separate from app logging) - use Flask logging
Allow logging to be configurable by the end user
Keep logging of pipelines separate from the two items above
Ensure all types of logs (requests, application, pipeline) roll over once the file or files become too big or too many
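A minimal sketch of the rollover and separation criteria using only the standard library. The function name, logger names, file paths, and size limits are assumptions; `RotatingFileHandler` is the standard `logging.handlers` class.

```python
import logging
from logging.handlers import RotatingFileHandler


def build_logger(name, path, level=logging.INFO, max_bytes=1_000_000, backups=5):
    """Return a logger that writes to its own rotating file."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    if not logger.handlers:  # avoid stacking handlers on repeated calls
        handler = RotatingFileHandler(
            path,
            maxBytes=max_bytes,   # roll over once the file grows past this size
            backupCount=backups,  # keep at most this many rolled-over files
            encoding="utf-8",
            delay=True,           # open the file only when the first record is written
        )
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
        )
        logger.addHandler(handler)
    logger.propagate = False  # keep each log type out of the others' files
    return logger
```

Calling it once each for, say, "harvey.app", "harvey.requests", and "harvey.pipelines" with different paths would keep the three log types separate, and the level and limits could come from user configuration.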

Add Support for Parallel Tests

Users may want to test their code across various versions of a programming language. Add support for parallel tests to be run (eg: PHP 7.2, 7.3, 7.4).

Criteria:

We'll need to ensure there is a limit on how many parallel tests any single user can run at a time.
We'll need to string all the outputs together or associate them with that user
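A sketch of both criteria, assuming each test run can be wrapped in a callable. The limit of 3 and the `run_test` callable are hypothetical. The worker count caps how many tests run at once, and the returned dict ties every output back to its version.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_PARALLEL_TESTS = 3  # hypothetical per-user limit


def run_test_matrix(versions, run_test, max_parallel=MAX_PARALLEL_TESTS):
    """Run run_test once per version, at most max_parallel at a time."""
    versions = list(versions)
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        outputs = list(pool.map(run_test, versions))  # map preserves input order
    return dict(zip(versions, outputs))
```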

Network Support on Docker Containers

Currently you cannot specify any networks on docker containers. Add this functionality.

Fix All TODO Items

Fix ALL the TODO items found throughout the project

Lock Does Not Exist for Project That is Brand New

If you try to deploy a brand new project to Harvey, the pipeline will blow up stating there is no lock and will exit.

Exception in thread Thread-25:
Traceback (most recent call last):
  File "/opt/homebrew/Cellar/python@3.9/3.9.10/Frameworks/Python.framework/Versions/3.9/lib/python3.9/threading.py", line 973, in _bootstrap_inner
    self.run()
  File "/Users/admin/git/personal/harvey/venv/lib/python3.9/site-packages/sentry_sdk/integrations/threading.py", line 69, in run
    reraise(*_capture_exception())
  File "/Users/admin/git/personal/harvey/venv/lib/python3.9/site-packages/sentry_sdk/_compat.py", line 54, in reraise
    raise value
  File "/Users/admin/git/personal/harvey/venv/lib/python3.9/site-packages/sentry_sdk/integrations/threading.py", line 67, in run
    return old_run_func(self, *a, **kw)
  File "/opt/homebrew/Cellar/python@3.9/3.9.10/Frameworks/Python.framework/Versions/3.9/lib/python3.9/threading.py", line 910, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/admin/git/personal/harvey/harvey/pipelines.py", line 94, in run_pipeline
    webhook_config, webhook_output, start_time = Pipeline.initialize_pipeline(webhook)
  File "/Users/admin/git/personal/harvey/harvey/pipelines.py", line 28, in initialize_pipeline
    if Lock.lookup_project_lock(Webhook.repo_full_name(webhook)) is True:
  File "/Users/admin/git/personal/harvey/harvey/locks.py", line 42, in lookup_project_lock
    raise ValueError('Lock does not exist!')
ValueError: Lock does not exist!

Let's fix this so the pipeline gracefully handles the case where no lock can be found.
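One way the lookup could behave, sketched with a plain dict standing in for Harvey's real lock store (the actual storage and signature in locks.py are not shown here, so both are assumptions). A project with no lock yet is treated as unlocked and gets an entry created for it instead of raising.

```python
def lookup_project_lock(locks, project_name):
    """Return True if the project is locked; a brand new project is unlocked."""
    # `locks` is a hypothetical dict of project name -> bool.
    return bool(locks.setdefault(project_name, False))
```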

Run Pipeline only on Master branch commits

Add logic to webhooks to only build on master branch commits (possibly based on the ref attribute in the GitHub JSON?)
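A sketch of the check, assuming the webhook body is GitHub's push event payload, where `ref` holds a value such as "refs/heads/master". The function name is hypothetical.

```python
def should_run_pipeline(payload, branch="master"):
    """Return True only for pushes to the given branch."""
    return payload.get("ref") == f"refs/heads/{branch}"
```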

Use ssh_url for "git clone"

If we don't use ssh_url for git clone then private repos can't be cloned into Harvey. We'll need to:

Change the reference
Document using ssh keys
Create a helper script to try automating that?

Add Subprocess Timeouts

Currently, subprocesses can run forever and not exit. Add a timeout to each subprocess to ensure they don't hang or spin forever.
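A minimal sketch using the `timeout` argument of `subprocess.run`, which kills the child process and raises `subprocess.TimeoutExpired` once the limit passes. The wrapper name and return shape are assumptions.

```python
import subprocess
import sys


def run_with_timeout(command, timeout_seconds):
    """Run a command; return (returncode, output), or (None, reason) on timeout."""
    try:
        completed = subprocess.run(
            command,
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        return None, f"timed out after {timeout_seconds}s"
    return completed.returncode, completed.stdout


# Example: a child process that sleeps too long is stopped and reported.
print(run_with_timeout([sys.executable, "-c", "import time; time.sleep(5)"], 0.5))
```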

Security Audit & Authentication

Security Audit

Add authentication to all API endpoints and ensure only those with a valid API key can perform actions
Ensure users can only interact with their own projects

See #1
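A sketch of the two checks above, assuming API keys map to users and projects map to owners. Both lookup tables and the function names are hypothetical; `hmac.compare_digest` is the standard-library constant-time comparison.

```python
import hmac


def user_for_key(api_key, keys_to_users):
    """Return the user that owns api_key, or None if the key is unknown."""
    if not api_key:
        return None
    candidate = api_key.encode("utf-8")
    for known_key, user in keys_to_users.items():
        # Constant-time comparison avoids leaking key contents through timing.
        if hmac.compare_digest(candidate, known_key.encode("utf-8")):
            return user
    return None


def can_touch_project(api_key, project, keys_to_users, project_owners):
    """Allow access only to an authenticated user who owns the project."""
    user = user_for_key(api_key, keys_to_users)
    return user is not None and project_owners.get(project) == user
```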

Change "cd" to "git -C"

Harvey currently navigates to the project directory then pulls. Instead, Harvey should simply pull changes relatively using git -C <dir> pull instead of cd <dir> && git pull which is safer and simpler.
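As a sketch, the command can be built as an argument list so that no shell and no `cd` are involved. The function name is hypothetical.

```python
def git_pull_command(project_dir):
    """Build `git -C <dir> pull` as an argument list (no shell, no cd)."""
    return ["git", "-C", str(project_dir), "pull"]
```

The list can be handed directly to `subprocess.run`, which also sidesteps quoting problems with directory names that contain spaces.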

Add "Pull" Pipeline

Currently if you only wanted to pull changes when a webhook fires, you'd need to test as well. Add a new pipeline that literally just runs git pull when a webhook is received. Great if people wanted to use custom logic to test/deploy their repo.

Add Database

Currently logs are stored to actual log files with the name of the pipeline ID which is just a randomly generated ID.

Structure:

logs
    project_1
        1234567890.log
        0987654321.log
    project_2
        ...
    ...

This is great for now, but we should instead be saving log data to a database.

Brainstorming some initial columns and tables:

logs

id [int]
pipeline_id [int]
log_content [text]
created_at [datetime]

users

id [int]
user_id [int]
email [varchar]
password [varchar]
created_at [datetime]
updated_at [datetime]
deleted_at [datetime]

pipelines

id [int]
pipeline_id [int]
success [bool]
configuration [text]? (store the JSON configuration)
user_id [int] (user who created the pipeline... how is this derived?)
pipeline_time [varchar] (time the total pipeline took to build)
created_at [datetime]
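A sketch of the logs and pipelines tables above, assuming SQLite as the engine (the brainstorm does not name one). Column types are mapped to SQLite's, and the users table is left out because password storage needs more thought than a sketch can give it.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS pipelines (
    id INTEGER PRIMARY KEY,
    pipeline_id INTEGER NOT NULL UNIQUE,
    success INTEGER NOT NULL,          -- bool stored as 0 or 1
    configuration TEXT,                -- JSON configuration
    user_id INTEGER,
    pipeline_time TEXT,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE IF NOT EXISTS logs (
    id INTEGER PRIMARY KEY,
    pipeline_id INTEGER NOT NULL REFERENCES pipelines(pipeline_id),
    log_content TEXT,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
"""


def open_database(path=":memory:"):
    """Open the database and make sure both tables exist."""
    connection = sqlite3.connect(path)
    connection.executescript(SCHEMA)
    return connection
```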

Add Email Support

Add support to email when pipelines are finished in addition to the Slack logic that already exists.

Container Healthchecks Are Randomly Failing

Container health checks are randomly failing and I'm unsure why (see recent Slack output for more info). Let's investigate and correct this. Logging will be helpful to determine why.

Do Not Allow Multiple Concurrent Pipelines for the Same Project

Docker Compose does weird things when you start two concurrent docker compose up -d commands for the same project (only possible because they are running in separate threads via Harvey) ultimately leading to Docker crashing completely.

Let's add a check prior to running a pipeline that will lock deployments for that project and only release it once the pipeline is finished (success or fail) so that we don't inadvertently break Docker.
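A sketch of the check, assuming pipelines keep running as threads inside a single Harvey process (a lock shared across processes would need a different mechanism). All names are hypothetical. The `finally` block releases the lock whether the pipeline succeeds or fails.

```python
import threading

_registry_guard = threading.Lock()  # protects the dict of per-project locks
_project_locks = {}


def try_acquire(project):
    """Take the project's lock without waiting; return False if already held."""
    with _registry_guard:
        lock = _project_locks.setdefault(project, threading.Lock())
    return lock.acquire(blocking=False)


def release(project):
    """Release the project's lock."""
    _project_locks[project].release()


def run_exclusively(project, pipeline):
    """Run pipeline() unless another pipeline for the same project is active."""
    if not try_acquire(project):
        return "skipped: pipeline already running"
    try:
        return pipeline()
    finally:
        release(project)
```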

Ingest environment variables into the container

Currently there is no way to ingest environment variables into a container when created which is a huge missing piece of the puzzle.
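A sketch of one way to do it, assuming containers are created through the Docker CLI, where `--env KEY=VALUE` sets a variable. The function name and the shape of the config dict are hypothetical.

```python
def env_flags(environment):
    """Turn a dict of variables into `--env KEY=VALUE` arguments."""
    flags = []
    for key, value in sorted(environment.items()):
        flags += ["--env", f"{key}={value}"]
    return flags
```

The result would be spliced into the argument list of the command that creates the container.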

Logger not Working When Running via Gunicorn

When you run the app with make prod, it appears that items are not being logged at all, even to file (they are hidden from console as well).

We need to ensure that logging takes place when the app is run via gunicorn. If you run make run, the logger works properly.

Read into this for a solution: https://trstringer.com/logging-flask-gunicorn-the-manageable-way/
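The usual approach is to reuse Gunicorn's handlers on the application logger. A sketch of that idea with the standard library only; "gunicorn.error" is the logger name Gunicorn uses for its error log, while the helper name is hypothetical and this has not been verified against Harvey's setup.

```python
import logging


def adopt_gunicorn_handlers(app_logger):
    """Point an application logger at Gunicorn's handlers and log level."""
    gunicorn_logger = logging.getLogger("gunicorn.error")
    app_logger.handlers = gunicorn_logger.handlers
    app_logger.setLevel(gunicorn_logger.level)
    return app_logger
```

With Flask, this would be called with `app.logger` when the module is imported by Gunicorn rather than run directly.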

Dangling Images/Containers

Ensure that we aren't leaving dangling images or containers lying around eating up resources and disk space

Failed Tests Should Exit with Code 1

Ensure that failed tests exit with code 1 so that the build/deploy stages aren't triggered

Whitelist GitHub IPs?

Self-hosting Harvey, I'm getting bombarded by random requests that aren't coming from me or GitHub. One potential option would be to whitelist GitHub's IPs. There is an endpoint that can be used to keep the list of IPs up to date; however, GitHub advises against whitelisting IPs.

https://docs.github.com/en/free-pro-team@latest/github/authenticating-to-github/about-githubs-ip-addresses
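If we did go down this road, the check itself is small. A sketch using the standard `ipaddress` module; the function name is hypothetical, and the CIDR ranges a caller passes in would have to come from GitHub's published list (the ranges used to exercise it should be placeholders, not GitHub's).

```python
import ipaddress


def is_allowed(remote_addr, allowed_cidrs):
    """Return True if remote_addr falls inside any of the allowed networks."""
    address = ipaddress.ip_address(remote_addr)
    return any(address in ipaddress.ip_network(cidr) for cidr in allowed_cidrs)
```

Behind a reverse proxy, the address to check would be the forwarded client address rather than the proxy's own.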

Add dotenv to requirements

Print and Save Partial Logs on Stage/Pipeline Failure

Currently, if a pipeline fails, it just fails. There is no log or console output. CI/CD should show logs regardless of whether it failed or not.

Label Support on Container Create

Currently you cannot specify labels when creating a container. Add that functionality.

Docker-in-Docker Support (For Tests)

Figure out a good way to implement a docker-in-docker concept (which yes, is tricky and bad - but people may want to test their docker containers)

Better User Logging

Better logging (include output from everything in the log, not just container logs, meaning the build output and each step's output)

Also add error handling across the board.

Replace Flask with WSGI Server

Currently a Flask router wraps all the routes to the API/webhooks, which is fine for development and small personal use but will not scale in production environments. Replace the Flask app.py with a production-ready WSGI server.

This will depend on #10

Update Documentation

Summary

The documentation is in a sad state and has been for a while. The project has changed so frequently that it's been difficult to document how to get started with Harvey. It's stabilizing now, though, and the docs need a serious refresh to make getting started dead simple, especially because the project is so large and requires a lot of info.

Acceptance Criteria

Update the documentation to describe better what Harvey is
Update the documentation to show how to use Harvey

Add the Ability to be Pinged on Slack when a Deployment Starts

Currently Harvey pings in Slack when a deployment ends but does not ever say one starts. This would be a great feature for those who may have multiple in-flight and want to keep track of them.

Fix Container Healthchecks for "Compose" Workflows

Currently, the container healthcheck functionality only works for non-compose workflows. As I solely use compose workflows in Harvey, it'd be great to get this working again.

Notes I had from a previous commit:

* Healthchecks currently fail for docker-compose deploys as the container name is specified in the compose files vs the webhook
* Fix healthchecks for compose and create a way we can line up the name in code with the name in files

Basically the health check is trying to run a healthcheck against a container whose name doesn't exist and it therefore fails.

Harvey Stopped Saving to SQLite DB

Harvey appears to have stopped saving to the SQLite DB in prod. I'm not yet sure why; it started about a week ago and affects every project. This could simply be due to the deployed version of Harvey being a "nightly build" with a bug. This will need some investigation.

The biggest offender is the pipeline logs. Harvey still builds and fires off the Slack notification though so it's only the saving (or retrieving?) of the logs.

Try/Catch Logic

Introduce try/catch logic to ensure each step of the process works correctly. Some of this has already happened which mostly stops bad things from happening but some of the errors aren't caught and there are still other places where there is no try/catch logic.

Add pylint-exit as a callable function

Add pylint-exit as a shell function that can be called upon in harvey.sh files (see Python test example for reference)

Fix Encoding

Fix latin encoding for logs which messes with output. Find something universal that is friendlier to all terminals and text editors.

Pipelines Need a Timeout

It’s possible that users can add a rogue script in the testing stage that runs forever tying up resources or that certain build stages could run too long. Add a timeout for each stage that will exit the pipeline if exceeded.

Flush Logs After Time Period

Harvey will quickly build up hundreds or thousands of log files. Need to add some logic to flush logs after a certain date or time period? Allow the user to configure this?
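A sketch of age-based flushing, assuming logs live as .log files under per-project directories. The function name and the `max_age_days` setting are hypothetical; the value could come from user configuration.

```python
import time
from pathlib import Path


def flush_old_logs(log_dir, max_age_days, now=None):
    """Delete .log files older than max_age_days; return the removed paths."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    removed = []
    for log_file in Path(log_dir).rglob("*.log"):
        if log_file.is_file() and log_file.stat().st_mtime < cutoff:
            log_file.unlink()
            removed.append(log_file)
    return removed
```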

Change Default Git Pull Behavior

When pulling repos, you'll receive the following warning:

warning: Pulling without specifying how to reconcile divergent branches is
discouraged. You can squelch this message by running one of the following
commands sometime before your next pull:

  git config pull.rebase false  # merge (the default strategy)
  git config pull.rebase true   # rebase
  git config pull.ff only       # fast-forward only

You can replace "git config" with "git config --global" to set a default
preference for all repositories. You can also pass --rebase, --no-rebase,
or --ff-only on the command line to override the configured default per
invocation.

To fix this, let's make all pulls fast forward.
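A sketch of setting this per repository after cloning, using the `git config pull.ff only` command from the message above. The function name is hypothetical, and the `-C` form assumes Harvey addresses repos by directory.

```python
def fast_forward_only_command(project_dir):
    """Build the command that makes pulls in this repo fast-forward only."""
    return ["git", "-C", str(project_dir), "config", "pull.ff", "only"]
```

Alternatively, `--ff-only` could be passed on each pull, as the message notes.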
