polyvariant / pitgull


Automatic housekeeping for your gitlab repositories.

License: Other

Scala 91.46% Dhall 8.22% Java 0.31%

cats-effect dhall fs2 gitlab scala scala-steward sttp tapir

pitgull's People

Contributors

Stargazers

Watchers

Forkers

scala-steward majk-p pkowalcze

pitgull's Issues

Define config file schema

I think we could use dhall (https://github.com/travisbrown/dhallj in Scala) as a configuration format - we should give users the ability to verify their config on their machines, ideally without defining a web UI for experiments.

Some research needs to be done on the features of dhall (or alternatives), including unions, enums etc.

Example config file (pseudocode):

//squash | merge | rebase
merge_strategy = squash // This might be possible to get from the repo settings
linear = true //requires rebasing if we're behind main branch

//allOf could be an infix `&&` operator instead
rules = [
  "Scala Steward is merged on green" -> allOf(
    author -> equalTo("[email protected]"),
    jobs -> exists(name -> 
      allOf(equalTo("test"), status(success))
    )
  )
]

Ideally we could have these stored in a single place and included per project, or they could be inherited by projects from the groups (or organizations in github).

The simpler approach seems to be including an external file, so something like:

include(
  uri = "[email protected]:pitgull/pitgull",
  ref = "main" //or specific git ref
)

or passing direct URLs to the files would be okay. This will also involve authentication, we'll probably be able to reuse the token given to the application during login.

Update: dhall allows including files by URL: let JSON = https://prelude.dhall-lang.org/v11.1.0/JSON/package.dhall

dhallj requires an http4s client to resolve external imports, so I believe we can pass the authentication header via a middleware and still be able to access gitlab resources (over http(s), so we'll probably need to link the raw files).

Cats Effect 3 migration checklist

Pending:

odin: WIP valskalla/odin#273, pending release (snapshot: 0.11.0+3-19c38e47+20210402-1813-SNAPSHOT)
sttp:
~~tapir: WIP softwaremill/tapir#1154~~
prox: WIP vigoo/prox#217, released
http4s

Add method to Gitlab that allows forcing MR approval

The current way to do this seems to be https://docs.gitlab.com/ee/api/merge_request_approvals.html#delete-merge-request-level-rule - it requires having the ID of the approval rule, and we can get these for a MR by using https://docs.gitlab.com/ee/api/merge_request_approvals.html#get-configuration-1.

If there are any project-wide rules, I think we can't really do anything to remove them from an individual MR. <- needs verification

As a temporary solution, we can use https://docs.gitlab.com/ee/api/merge_request_approvals.html#change-configuration and just set the required approvals to 0.

MVP - bi-directional communication with gitlab

To start any work on running this as a service, we need the following:

A webhook endpoint that will consume events from gitlab
A call to the gitlab API, e.g. getting the list of open merge requests for a known project

We should keep compatibility with other git hosts in mind, but for a start this should be enough.

Scala 3 migration checklist

Usages of kind-projector's *[_] syntax need to be removed, as dotty doesn't support it (plain * works though)
better-monadic-for needs investigating, possibly we can live without it

Releases for dependencies (that I see):

Dhall decoding tests

It would be nice to verify that the Scala code reading the dhall-generated JSON is actually able to do that.

Bootstrap: don't create webhook if one already exists

Read project config from within the project

One of the original goals of the project was to allow every repository that uses pitgull to have a customizable config for it. Dhall was picked as the format, one of the reasons being that it supports remote imports.

With that in mind, the approximate setup could look like this: one would have a single repository with a shared config, and multiple other repositories could import from that, like in example.dhall:

let pg =
      https://raw.githubusercontent.com/pitgull/pitgull/v0.0.2/dhall/pitgull.dhall sha256:65a46e78c2d4aac7cd3afeb1fa209ed244dc60644634a9cfc61800ea3417ea9b

let wms =
      https://gitlab.com/kubukoz/demo/-/raw/db4686f29bab1bc056ec96307a39aa3dd6337173/wms.dhall sha256:4b9218b9a1a83262550b9bdfa7d7250f4aa365b8d8c2131f65517ef5f3eeb68c

in pg.projectToJson { rules = [ wms.scalaSteward ] }

Here, we have an import of the pitgull "standard library" and a shared config.

One obstacle here is authentication - to resolve the shared import, we might need to authenticate to the GitLab instance being used.

While we do have a token that should allow doing this, it is not clear whether we can make the dhall resolver forward it in the appropriate header. The resolver is dhall-json in our case, until #116 is possible. With dhallj, in the worst case, we could have a custom http4s client that attaches the header.

Note that we will need to fetch the base config file using the GitLab API - so we don't need to do anything more complex than the existing implementations of API calls do.

Hopefully, once we have that fetched (a byte stream should be enough), we can pass it as input to dhall-json, executed with the appropriate environment variable set. The variable's value can then be forwarded with the using clause of an import: https://docs.dhall-lang.org/references/Built-in-types.html?highlight=headers#keyword-using

Create endpoint to preview actions that would run for a project

This should probably call out to StateResolver and ProjectActions.compile (the latter might benefit from becoming a method in the ProjectActions algebra itself).

Ideally this will return a list of MRs for a project, along with either the actions that would be taken for each MR, or the errors that make an action ineligible.
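A minimal sketch of what such a response could look like, assuming a simplified model. MergeRequest, Action, PreviewEntry, preview and mergeOnGreen are hypothetical names for illustration; in pitgull the decision would come from StateResolver and ProjectActions.compile rather than from the plain function used here.

```scala
object PreviewSketch {
  // Hypothetical, trimmed-down view of a merge request.
  final case class MergeRequest(iid: Long, pipelineStatus: String)

  sealed trait Action
  object Action { case object Merge extends Action }

  // One entry per MR: either the reasons no action applies, or the action that would run.
  final case class PreviewEntry(mergeRequestIid: Long, outcome: Either[List[String], Action])

  // `decide` stands in for whatever resolves state and compiles actions.
  def preview(
      mergeRequests: List[MergeRequest],
      decide: MergeRequest => Either[List[String], Action]
  ): List[PreviewEntry] =
    mergeRequests.map(mr => PreviewEntry(mr.iid, decide(mr)))

  // Example decision: merge only when the pipeline is green.
  val mergeOnGreen: MergeRequest => Either[List[String], Action] = mr =>
    if (mr.pipelineStatus == "success") Right(Action.Merge)
    else Left(List(s"pipeline status is ${mr.pipelineStatus}, expected success"))
}
```

Keeping the outcome as an Either means the same structure can later carry the mismatches for MRs that no rule applies to.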

Bonus points if we can make this endpoint accept a Dhall string body that will be decoded into the project configuration, so that users will be able to test their config without deploying it - but this can be made into another issue for the future.

Change synchronous backend in bootstrap to async one

#264 introduces a synchronous backend for communicating with gitlab. It should either be changed to an async one, or wrapped in F to eliminate Identity from the signature of Gitlab.sttpInstance.

Consider using decline for command line arguments

When bkirwi/decline#293 is fixed, consider using it in bootstrap cli

Show mismatches in preview endpoint

Drop sttp 2

After the changes from ghostdogpr/caliban#748 are released, we should upgrade and drop the remaining dependencies on sttp2.

Add healthcheck endpoint

It should return a 200 OK with literally anything.

Having an endpoint like that will help us ensure we don't miss any events between deployments. Another thing we can do further down the line is store received events in an external queue, so that we don't only rely on ephemeral state in memory.

Implement `MatcherFunction.compileMatcher`

Current state:

@autoContravariant
trait MatcherFunction[-In] {
  def matches(in: In): Matched[Unit]
}

//...

val compileMatcher: Matcher => MatcherFunction[MergeRequestState] = _ => isSuccessful

The compileMatcher function ignores every part of the MergeRequestState, except for the pipeline's status.

It should support all kinds of the Matcher ADT, including the Many node which requires all its children to match in order to match itself. An empty list in Many should mean "always match" (in the future we could consider making it a NEL instead of a list).

The rule currently used is the one defined in ProjectConfigReader.test: Rule.mergeAnything:

val mergeAnything = Rule("anything", Matcher.Many(Nil), Action.Merge)

it should probably be renamed to something like mergeSuccessful and contain a predicate based on the pipeline's status.

The compileMatcher function should be relatively easy to test. One set of test cases should definitely use the Scala Steward config defined in ProjectConfigReader.test (it can be copy-pasted verbatim). To get sample data, I recommend looking for Scala Steward MRs in the public Gitlab instance (via the graphql API, whose graphiql UI is linked in the readme) or at work ;)
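One possible shape for the compiler, as a sketch under stated assumptions: Matched is simplified to an Either that collects mismatch reasons, MergeRequestState is trimmed down to two fields, and the Author and PipelineStatus nodes are hypothetical leaves (only Many is named in this issue). It shows Many requiring all children to match, with an empty list always matching.

```scala
object CompileMatcherSketch {
  // Simplified stand-in for pitgull's Matched: Left collects mismatch reasons.
  type Matched[A] = Either[List[String], A]

  // Hypothetical, trimmed-down state; the real MergeRequestState has more fields.
  final case class MergeRequestState(author: String, pipelineStatus: String)

  trait MatcherFunction[-In] {
    def matches(in: In): Matched[Unit]
  }

  // Hypothetical ADT: only the Many node is named in the issue.
  sealed trait Matcher
  object Matcher {
    final case class Author(email: String) extends Matcher
    final case class PipelineStatus(status: String) extends Matcher
    final case class Many(values: List[Matcher]) extends Matcher
  }

  private def fromPredicate(reason: String)(
      p: MergeRequestState => Boolean
  ): MatcherFunction[MergeRequestState] =
    new MatcherFunction[MergeRequestState] {
      def matches(in: MergeRequestState): Matched[Unit] =
        if (p(in)) Right(()) else Left(List(reason))
    }

  def compileMatcher(matcher: Matcher): MatcherFunction[MergeRequestState] =
    matcher match {
      case Matcher.Author(email) =>
        fromPredicate(s"author is not $email")(_.author == email)
      case Matcher.PipelineStatus(status) =>
        fromPredicate(s"pipeline status is not $status")(_.pipelineStatus == status)
      case Matcher.Many(values) =>
        // Compile children once; an empty list yields no reasons, so it always matches.
        val compiled = values.map(compileMatcher)
        new MatcherFunction[MergeRequestState] {
          def matches(in: MergeRequestState): Matched[Unit] = {
            val reasons = compiled.flatMap(_.matches(in).left.toOption.toList.flatten)
            if (reasons.isEmpty) Right(()) else Left(reasons)
          }
        }
    }
}
```

Collecting every reason instead of stopping at the first failure makes test failures easier to read, at the cost of evaluating all children.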

New architecture

The current approach to handling webhooks is fundamentally flawed. We're handling every kind of event differently (e.g. we could find the pipeline by MR or the MR by pipeline), but the only thing we care about is whether something of significance happened: a MR was opened, a commit was made, a pipeline changed status. The handling of significant events should be identical at all times.

A very important reason why that has to be the case is that events in one MR may trigger actions in other MRs - for example, merging MR #1 may result in rebasing MR #2, which will trigger a pipeline, whose change of status to successful should trigger a merge of MR #2.

I think webhook events should only provide information about which project was impacted. Currently the hooks that come to mind are Pipeline, MR and Push hooks (Build is redundant as we need to wait for a pipeline anyway). All webhook events are supported now, as long as the project is specified. We fetch all open MRs and try to find actions for each of them.

For each such project, we will:

Ensure it's supported by pitgull (for now this isn't strictly necessary, but later we'll see if it has a configuration file and optionally if it's registered if we're launching to the public)
Get all of its merge requests
If a MR matches a rule, put that MR on a per-project global queue. In another fiber, changes to that queue (or MR-related triggers) will be watched and an attempt to merge exactly one MR will be made. If the project is configured to require rebasing other MRs, we can then rebase the next suitable MR in sequence. If we don't need to rebase before merging others - we should probably wait for the master pipeline somehow, or re-trigger the next MR's pipeline. TODO.

Prepare an example project with many MRs from scala steward

It can be a Scala project with just the basic sbt setup and some dependencies that need to be updated (e.g. cats 2.0.0, cats-effect 2.0.0, zio 1.0.0, http4s 0.21.0)

After it's set up on the public gitlab, we need to run Scala Steward with it to generate some merge requests (and keep the command used to do so for future testing with a running pitgull instance).

Decouple MR State calculation from kind of event

The logic that handles building MR State objects should not be aware of what kind of event has happened. This means we won't be able to use the pipeline ID, and will have to start from the list of all merge requests in the given project instead.

As part of this issue, we can start handling all events identically and remove the distinction in the webhook event model.

Add `shouldBeRebased`, `conflicts` flags to MergeRequestState

shouldBeRebased will be true if semi-linear history is enabled and there've been changes on master since the fork.

conflicts will be true if there are conflicts and the MR can't automatically be rebased - we can't do anything with such a MR.

Example query to see this flag:

query {
  projects(ids:"gid://gitlab/Project/20190338"){
    nodes{
      id
      mergeRequests(state:opened) {
        edges {
          node {
            id
            shouldBeRebased
            conflicts
          }
        }
      }
    }
  }
}
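A minimal sketch of how the two flags could drive a decision, assuming the MR has already matched a rule. MergeRequestState is trimmed down here, and NextStep and nextStep are hypothetical names, not part of pitgull.

```scala
object RebaseFlagsSketch {
  // Trimmed-down state carrying only the two new flags.
  final case class MergeRequestState(iid: Long, shouldBeRebased: Boolean, conflicts: Boolean)

  sealed trait NextStep
  object NextStep {
    case object Skip extends NextStep   // conflicts: nothing can be done automatically
    case object Rebase extends NextStep // behind the target branch, rebase first
    case object Merge extends NextStep  // up to date and conflict-free
  }

  // Conflicts take priority: a conflicting MR can't be rebased automatically.
  def nextStep(mr: MergeRequestState): NextStep =
    if (mr.conflicts) NextStep.Skip
    else if (mr.shouldBeRebased) NextStep.Rebase
    else NextStep.Merge
}
```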

Add swagger

Hide graphQL from API of `Gitlab[F]`

Currently, Gitlab#mergeRequests accepts a selection: SelectionBuilder[MergeRequest, A]. It should be refactored to build the actual query inside the instance, and to expose an F[List[Foo]] instead of a parametric, client-driven A.

Accept MRs via graphQL API

There's a relatively new mutation in the GraphQL API for gitlab.

We should check if it's already available at the self-hosted GitLab instance and implement Gitlab[F] to use that instead of the REST API.

Add method in Gitlab to rebase a MR onto its target branch

This isn't included in the graphQL API either, so we'll need to use the HTTP API:

https://docs.gitlab.com/ee/api/merge_requests.html#rebase-a-merge-request

The method should mirror the path (just projectId and mergeRequestIid should be required):

PUT /projects/:id/merge_requests/:merge_request_iid/rebase

Rebasing may fail, in which case we should extract the merge_error field and raise it as an error - in the future, this might be caught by the caller and used to determine whether we should try to rebase the next MR in the queue.
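A rough sketch of both halves, using the JDK's java.net.http request builder rather than the sttp-based client pitgull uses. rebaseRequest, RebaseResult, RebaseFailed and toEither are hypothetical names, the /api/v4 prefix and Private-Token header follow GitLab's REST conventions, and decoding the JSON response into RebaseResult is left out.

```scala
import java.net.URI
import java.net.http.HttpRequest

object RebaseSketch {
  // Builds PUT /projects/:id/merge_requests/:merge_request_iid/rebase
  def rebaseRequest(
      baseUri: String,
      token: String,
      projectId: Long,
      mergeRequestIid: Long
  ): HttpRequest =
    HttpRequest
      .newBuilder(
        URI.create(s"$baseUri/api/v4/projects/$projectId/merge_requests/$mergeRequestIid/rebase")
      )
      .header("Private-Token", token)
      .PUT(HttpRequest.BodyPublishers.noBody())
      .build()

  // Hypothetical decoded response: only the merge_error field is modelled.
  final case class RebaseResult(mergeError: Option[String])

  final case class RebaseFailed(message: String) extends Exception(message)

  // A present, non-empty merge_error becomes a failure the caller can handle.
  def toEither(result: RebaseResult): Either[RebaseFailed, Unit] =
    result.mergeError.filter(_.nonEmpty) match {
      case Some(error) => Left(RebaseFailed(error))
      case None        => Right(())
    }
}
```

Returning the error as a value leaves it to the caller to decide whether to move on to the next MR in the queue.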

Use dhallj

~~In #6, we found an issue with dhallj's handling of unions: travisbrown/dhallj#185~~ not an issue

~~A workaround is being used (dhall-to-json CLI tool), but ideally we could switch to dhallj at some point.~~

~~Here's the issue to track that effort.~~

We should move to dhallj+circe for handling our Dhall integration. That should make it easier to handle #189, too.

Add AnyOf, Not nodes in dhall config

Currently, we allow combining matchers with AllOf, which requires all of the matchers to pass in order for a rule to be applied. Similarly, we should have an AnyOf matcher that only requires one valid match. This should be added in the dhall files, the JSON config classes in pitgull, and the matcher compiler.

Also, we should have a node that inverts any conditions passed.
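The semantics of the new nodes could be sketched as follows, assuming a simplified model in which matching yields a plain Boolean. MergeRequest, AuthorIs and PipelineIs are hypothetical stand-ins, and the dhall and JSON sides are not shown.

```scala
object MatcherNodesSketch {
  // Hypothetical, trimmed-down merge request.
  final case class MergeRequest(author: String, pipelineStatus: String)

  sealed trait Matcher
  object Matcher {
    final case class AuthorIs(email: String) extends Matcher     // hypothetical leaf
    final case class PipelineIs(status: String) extends Matcher  // hypothetical leaf
    final case class AllOf(values: List[Matcher]) extends Matcher
    final case class AnyOf(values: List[Matcher]) extends Matcher
    final case class Not(underlying: Matcher) extends Matcher
  }

  import Matcher._

  def matches(matcher: Matcher, mr: MergeRequest): Boolean =
    matcher match {
      case AuthorIs(email)    => mr.author == email
      case PipelineIs(status) => mr.pipelineStatus == status
      case AllOf(values)      => values.forall(matches(_, mr)) // empty list: always matches
      case AnyOf(values)      => values.exists(matches(_, mr)) // empty list: never matches
      case Not(underlying)    => !matches(underlying, mr)
    }
}
```

Note the asymmetry for empty lists: an empty AllOf matches everything while an empty AnyOf matches nothing, which is worth deciding on explicitly when the nodes are specified.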

Hide token in http4s loggers

Private-Token isn't one of the headers that's automatically hidden - the Logger middleware accepts a function that determines which headers to redact, so that one should also be hidden that way.
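A library-free sketch of the kind of predicate that could be passed to the logger. It uses plain strings, whereas http4s header names are case-insensitive values, so shouldRedact, redact and the exact set of names below are illustrative assumptions rather than the http4s API.

```scala
import java.util.Locale

object RedactSketch {
  // Commonly sensitive header names, plus GitLab's Private-Token (lower-cased).
  private val sensitive: Set[String] =
    Set("authorization", "cookie", "set-cookie", "private-token")

  // Header names are case-insensitive, so compare in lower case.
  def shouldRedact(headerName: String): Boolean =
    sensitive.contains(headerName.toLowerCase(Locale.ROOT))

  // Replaces the values of sensitive headers before they reach the log.
  def redact(headers: List[(String, String)]): List[(String, String)] =
    headers.map { case (name, value) =>
      if (shouldRedact(name)) (name, "<REDACTED>") else (name, value)
    }
}
```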