
Elastisys Compliant Kubernetes is an open source, Certified Kubernetes distribution designed according to the ISO27001 controls: providing you with security tooling and observability from day one.

Home Page: https://elastisys.io/compliantkubernetes/

License: Apache License 2.0


compliantkubernetes-apps's Introduction

Elastisys Compliant Kubernetes Apps


Overview

This repository is part of the Compliant Kubernetes (compliantkubernetes) platform, alongside compliantkubernetes-kubespray, which sets up the underlying Kubernetes clusters.

The Elastisys Compliant Kubernetes (compliantkubernetes) platform runs two Kubernetes clusters: one called "service" and one called "workload".

The service cluster provides observability, log aggregation, a private container registry with vulnerability scanning, and authentication using the following services:

  • Prometheus and Grafana
  • OpenSearch and OpenSearch Dashboards
  • Harbor
  • Dex

The workload cluster manages the user applications and provides intrusion detection, security policies, log forwarding and monitoring using the following services:

  • Falco
  • Open Policy Agent
  • Fluentd
  • Prometheus

This repository installs all the applications of ck8s on top of already created clusters. To set up the clusters, see compliantkubernetes-kubespray. A service cluster (sc) or workload cluster (wc) can be created separately, but not all of the applications will work correctly unless both are running.

All config files will be located under CK8S_CONFIG_PATH. There will be four config files: common-config.yaml, wc-config.yaml, sc-config.yaml and secrets.yaml. See the Quickstart for instructions on how to initialize the repository.

☁️ Cloud providers ☁️

Currently we support the following cloud providers:

  • AWS
  • Azure
  • Citycloud/Cleura
  • Elastx
  • Exoscale
  • Openstack
  • Safespring
  • UpCloud
  • In addition to this we support running Compliant Kubernetes on bare metal (beta).

Setup

The apps are installed using a combination of helm charts and manifests with the help of helmfile and some bash scripts.

🔧 Requirements 🔧

To operate compliantkubernetes-apps some tools need to be installed. They are declared in the file REQUIREMENTS as PURLs.

Install the requirements to use compliantkubernetes-apps:

./bin/ck8s install-requirements

Note that you will need a service and workload cluster.

Developer requirements and guidelines

See DEVELOPMENT.md.

🔐 PGP 🔐

Configuration secrets in ck8s are encrypted using SOPS. We currently only support using PGP when encrypting secrets. Because of this, before you can start using ck8s, you need to generate your own PGP key:

gpg --full-generate-key

Note that it's generally preferable that you generate and store your primary key and revocation certificate offline. That way you can make sure you're able to revoke keys in the case of them getting lost, or worse yet, accessed by someone that's not you.

Instead, create subkeys for specific devices, such as the laptop you use for encryption and/or signing.

If this is all new to you, here's a link worth reading!
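
When you later initialize the configuration you will need the fingerprint of the key (CK8S_PGP_FP in the Quickstart below). A minimal way of listing it:

# List your secret keys with full fingerprints and copy the 40-character
# fingerprint of the key you intend to use for ck8s.
gpg --list-secret-keys --fingerprint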

Usage

Quickstart

You probably want to check the compliantkubernetes-kubespray repository first, since compliantkubernetes-apps depends on having two clusters already set up. In addition, you will need to set up the following DNS entries (replace example.com with your domain).

There are two options for managing DNS records: manually or using ExternalDNS.

  • Manually point these domains to the workload cluster ingress controller:

    • *.example.com
  • Manually point these domains to the service cluster ingress controller:

    • *.ops.example.com
    • dex.example.com
    • grafana.example.com
    • harbor.example.com
    • opensearch.example.com

The other option is to let ExternalDNS manage your DNS records; currently only AWS Route 53 is supported. You configure ExternalDNS later in the process.

Assuming you already have everything needed to install the apps, this is what you need to do.

  1. Decide on a name for this environment, the cloud provider to use, and the flavor, and set them as environment variables. Note that these will later be kept as global values in the common defaults config to prevent them from being inadvertently changed, as they affect the default options of the configuration when it is generated or updated. To change them, remove the common defaults config, set the new environment variables, and then generate a new configuration.

    export CK8S_ENVIRONMENT_NAME=my-ck8s-cluster
    export CK8S_FLAVOR=[dev|prod|air-gapped] # defaults to dev
    
    #
    # If 'none', no infra provider tailored configuration will be performed!
    #
    export CK8S_CLOUD_PROVIDER=[exoscale|safespring|citycloud|elastx|upcloud|azure|aws|baremetal|openstack|none]
    export CK8S_K8S_INSTALLER=[kubespray|capi] # set this to whichever installer was used for the kubernetes layer

    [!NOTE] The air-gapped flavor has a lot of the same settings as the prod flavor but with some additional variables that you need to configure yourself (these are set to set-me).

  2. Then set the path to where the ck8s configuration should be stored and the PGP fingerprint of the key(s) to use for encryption:

    export CK8S_CONFIG_PATH=${HOME}/.ck8s/my-ck8s-cluster
    export CK8S_PGP_FP=<PGP-fingerprint1,PGP-fingerprint2,...>
  3. Initialize your environment and configuration. Note that the configuration is split between read-only default configs found in the defaults/ directory and the override configs common-config.yaml, sc-config.yaml and wc-config.yaml, which are editable and will override any default value. common-config.yaml is applied to both the service and workload cluster, although it is overridden by any value set in sc-config.yaml or wc-config.yaml respectively. When new configs are created, this will generate new random passwords for all services. When configs are updated, this will not overwrite existing values in the override configs: it will create a backup of the old override configs in backups/, generate new default configs in defaults/, merge common values into common-config.yaml, and clear out redundant values in the override configs that match the default values. See compliantkubernetes.io if you are uncertain about what order you should do things in.

    ./bin/ck8s init both

    [!NOTE] It is possible to initialize wc and sc clusters separately by replacing both when running the init command:

    ./bin/ck8s init wc
    ./bin/ck8s init sc
  4. Edit the configuration files that have been initialized in the configuration path. Make sure that the objectStorage values are set in common-config.yaml, or in sc-config.yaml and wc-config.yaml, as well as the required credentials in secrets.yaml according to your objectStorage.type. The type may already be set in the default configuration found in the defaults/ directory depending on your selected cloud provider. Set objectStorage.s3.* if you are using S3, or objectStorage.gcs.* if you are using GCS (a sketch of the S3 overrides follows after this list). If you want ExternalDNS to manage your records from inside your cluster, enable it with externalDns.enabled and set the required variables: credentials for Route 53, txtOwnerId, and endpoints if externalDns.sources.crd is enabled.

  5. Create S3 buckets - optional. If you have set objectStorage.type: s3, then you need to create the buckets specified under objectStorage.buckets in your configuration files. You can run the script scripts/S3/entry.sh create to create the required buckets. The script uses s3cmd in the background, and it uses the ${HOME}/.s3cfg file for configuration and authentication for your S3 provider. There is also a helper script, scripts/S3/generate-s3cfg.sh, that lets you generate an appropriate s3cfg config file for a few providers.

    # Use your s3cmd config file.
    scripts/S3/entry.sh create
    
    # Use custom config file for s3cmd.
    scripts/S3/generate-s3cfg.sh aws ${AWS_ACCESS_KEY} ${AWS_ACCESS_SECRET_KEY} s3.eu-north-1.amazonaws.com eu-north-1 > s3cfg-aws
    scripts/S3/entry.sh --s3cfg s3cfg-aws create
  6. Test S3 configuration - optional. If you enable object storage you also need to make sure that the buckets specified in objectStorage.buckets exist. You can run the following snippet to ensure that you've configured S3 correctly:

    (
      access_key=$(sops exec-file ${CK8S_CONFIG_PATH}/secrets.yaml 'yq r {} "objectStorage.s3.accessKey"')
      secret_key=$(sops exec-file ${CK8S_CONFIG_PATH}/secrets.yaml 'yq r {} "objectStorage.s3.secretKey"')
      sc_config=$(yq m ${CK8S_CONFIG_PATH}/defaults/common-config.yaml ${CK8S_CONFIG_PATH}/defaults/sc-config.yaml ${CK8S_CONFIG_PATH}/common-config.yaml ${CK8S_CONFIG_PATH}/sc-config.yaml -a overwrite -x)
      region=$(echo ${sc_config} | yq r - 'objectStorage.s3.region')
      host=$(echo ${sc_config} | yq r -  'objectStorage.s3.regionEndpoint')
    
      for bucket in $(echo ${sc_config} | yq r -  'objectStorage.buckets.*'); do
          s3cmd --access_key=${access_key} --secret_key=${secret_key} \
              --region=${region} --host=${host} \
              ls s3://${bucket} > /dev/null
          [ ${?} = 0 ] && echo "Bucket ${bucket} exists!"
      done
    )
  7. Deploy the apps. Note that for this step each cluster already needs to be up and running:

    ./bin/ck8s apply sc
    ./bin/ck8s apply wc
  8. Test that the cluster is running correctly with:

    ./bin/ck8s test sc
    ./bin/ck8s test wc
  9. You should now have a fully working environment. Check the next section for some additional steps to finalize it and set up user access.
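
As referenced in step 4, here is a minimal sketch of what the S3 object storage overrides could look like. The keys mirror the options named above; bucket names and further fields depend on your provider and the generated defaults, so treat this as an illustration rather than a complete configuration:

# common-config.yaml (or sc-config.yaml / wc-config.yaml)
objectStorage:
  type: s3
  s3:
    region: eu-north-1
    regionEndpoint: s3.eu-north-1.amazonaws.com

# secrets.yaml (encrypted with SOPS)
objectStorage:
  s3:
    accessKey: <access-key>
    secretKey: <secret-key>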

On-boarding and final touches

If you followed the steps in the quickstart above, you should now have deployed the applications and have a fully functioning environment. However, there are a few steps remaining to make all applications ready for the user.

User access

After the cluster setup has completed, RBAC resources and namespaces will have been created for the user. You can configure which namespaces should be created and which users should get access using the following configuration options in wc-config.yaml:

user:
  namespaces:
    - demo1
    - demo2
  adminUsers:
    - [email protected]
    - [email protected]

A kubeconfig file for the user (${CK8S_CONFIG_PATH}/user/kubeconfig.yaml) can be created by running the script bin/ck8s kubeconfig user. The user kubeconfig will be configured to use the first namespace by default.
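
For example (this is the command named above; the resulting kubeconfig ends up under ${CK8S_CONFIG_PATH}/user/):

./bin/ck8s kubeconfig user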

OpenSearch Dashboards access for the user can be provided either by setting up OIDC or using the internal user database in OpenSearch:

  • OIDC:

    • Set opensearch.sso.enabled=true in sc-config.yaml.

    • Configure extra role mappings under opensearch.extraRoleMappings to give the users the necessary roles.

      extraRoleMappings:
        - mapping_name: kibana_user
          definition:
            users:
              - "configurer"
              - "User Name"
        - mapping_name: kubernetes_log_reader
          definition:
            users:
              - "User Name"
  • Internal user database:

    • Log in to OpenSearch Dashboards using the admin account.
    • Create an account for the user.
    • Give the kibana_user and kubernetes_log_reader roles to the user.

Users will be able to log in to Grafana using dex, but they will have read-only access by default. To give them more privileges, you first need to ask them to log in (so that they show up in the users list) and then change their roles.

Harbor works in a multi-tenant way: each logged-in user will be able to create their own projects and manage them as an admin (including adding more users as members). However, users will not be able to see each other's (private) projects unless explicitly invited, and they won't have global admin access in Harbor. This also means that container images uploaded to these private registries cannot automatically be pulled into the Kubernetes cluster. The user first needs to add pull secrets that give some ServiceAccount access to them before they can be used.
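
As a hedged sketch (the registry URL, credentials, namespace and secret name below are placeholders, not values defined by this repository), giving the default ServiceAccount in a user namespace access to a private Harbor project could look like this:

# Create a pull secret for the private Harbor project.
kubectl create secret docker-registry harbor-pull-secret \
  --docker-server=harbor.example.com \
  --docker-username=<harbor-user> \
  --docker-password=<harbor-password> \
  --namespace demo1

# Let the default ServiceAccount in the namespace use the secret when pulling images.
kubectl patch serviceaccount default --namespace demo1 \
  -p '{"imagePullSecrets": [{"name": "harbor-pull-secret"}]}'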

For more details and a list of available services see the user guide.

Harbor HA - work in progress

It is possible to run Harbor in HA mode. This section describes the configuration needed to set up Harbor in HA mode. More information about Harbor HA can be found here.

Both Postgres and Redis need to be external, as Harbor does not handle HA deployments of Postgres and Redis. It is up to the operator to set these up in HA mode.

Postgres requirements

The following are requirements on the external Postgres database.

Config:

Harbor backup is not designed to work with an external database. You will have to provide your own backup solution.

In $CK8S_CONFIG_PATH/sc-config.yaml set the following configs

harbor:
  ...
  backup:
    enabled: false
  database:
    type: external
    external:
      host: "set-me"
      port: "5432"
      username: "set-me"
      # "disable" - No SSL
      # "require" - Always SSL (skip verification)
      # "verify-ca" - Always SSL (verify that the certificate presented by the
      # server was signed by a trusted CA)
      # "verify-full" - Always SSL (verify that the certification presented by the
      # server was signed by a trusted CA and the server host name matches the one
      # in the certificate)
      sslmode: "disable"

In $CK8S_CONFIG_PATH/secrets.yaml add the postgres user password

harbor:
  external:
    databasePassword: set-me

Also configure network policies to allow access to the database:

networkPolicies:
    database:
      internal:
        ingress:
          peers: []
      externalEgress:
        peers:
          - namespaceSelectorLabels:
              kubernetes.io/metadata.name: postgres-system
            podSelectorLabels:
              cluster-name: harbor-cluster
        ports:
          - 5432

Redis

Config:

In $CK8S_CONFIG_PATH/sc-config.yaml set the following configs

harbor:
  redis:
    type: external
    external:
      addr: "rfs-redis-harbor.redis-system:26379"
      sentinelMasterSet: "mymaster"

Also configure network policies to allow access to Redis:

networkPolicies:
    redis:
      internalIngress:
        peers:
          - namespaceSelectorLabels:
              kubernetes.io/metadata.name: redis-system
            podSelectorLabels:
              app.kubernetes.io/name: redis-harbor
        ports:
          - 26379
          - 6379

Capacity Management

For capacity management, compliantkubernetes-apps comes with some Prometheus alerts and a Grafana dashboard, which facilitate monitoring on a per-Node as well as per-Node-Group basis. The Node Group is meant to represent a logical grouping of Nodes, e.g., worker and control-plane. To make use of these, you first have to label your Nodes with elastisys.io/node-group=<node-group>, for example:

kubectl label node <node-name> elastisys.io/node-group=<node-group>
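
To verify the labels you can list the Nodes together with their group (the -L flag adds the label value as a column):

kubectl get nodes -L elastisys.io/node-group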

Management of the clusters

The bin/ck8s script provides an entry point to the clusters. As an operator, you should use it instead of, for example, kubectl or helmfile directly. To use the script, set CK8S_CONFIG_PATH to the environment you want to access:

export CK8S_CONFIG_PATH=${HOME}/.ck8s/my-ck8s-cluster

Run the script to see what options are available.

Examples

  • Deploy apps to the workload cluster:

    ./bin/ck8s apply wc
  • Run tests on the service cluster:

    ./bin/ck8s test sc
  • Port-forward to a Service in the workload cluster:

    ./bin/ck8s ops kubectl wc port-forward svc/<service> --namespace <namespace> <port>
  • Run helmfile diff on a helm release:

    ./bin/ck8s ops helmfile sc -l <label=selector> diff

Autocompletion for ck8s in bash

Add this to ~/.bashrc:

CK8S_APPS_PATH= # fill this in
source <($CK8S_APPS_PATH/bin/ck8s completion bash)

Upgrading compliantkubernetes-apps

The bin/ck8s script also provides commands to upgrade an environment in two steps: prepare and apply. The former runs scripted configuration steps that do not change the state of the environment, while the latter runs scripted upgrade steps that modify the state of the environment. On unexpected failures the command will try to perform a rollback when possible, to ensure that the environment continues to function.

./bin/ck8s upgrade both vX.Y prepare
./bin/ck8s upgrade both vX.Y apply

Note

It is possible to upgrade wc and sc clusters separately by replacing both when running the upgrade command, e.g. the following will only upgrade the workload cluster:

./bin/ck8s upgrade wc vX.Y prepare
./bin/ck8s upgrade wc vX.Y apply

It is possible to upgrade from one minor version to the next regardless of patch versions (vX.Y -> vX.Y+1), and from one patch version to any later patch version (vX.Y.Z -> vX.Y.Z+N). Version validation requires that you are on a release tag matching the version specified in the command, and that your environment is at most one minor version behind. When on a specific commit, add the commit hash under global.ck8sVersion to pass validation; for development, set it to any to circumvent version validation completely.
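
As a sketch (assuming the value lives in your common override config; the exact file may differ in your setup), pinning the version for a non-tagged checkout could look like:

# common-config.yaml
global:
  ck8sVersion: <commit-hash>  # or: any (development only, skips version validation)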

Removing compliantkubernetes-apps from your cluster

There are two simple scripts that can be used to clean up your clusters.

To clean up the service cluster run:

./scripts/clean-sc.sh

To clean up the workload cluster run:

./scripts/clean-wc.sh

Operator manual

See https://compliantkubernetes.io/operator-manual/.

Setting up Google as identity provider for dex

  1. Go to the Google console and create a project.

  2. Go to the OAuth consent screen, name the application with the same name as your Google Cloud project, and add the top-level domain, e.g. elastisys.se, to Authorized domains.

  3. Go to Credentials, press Create credentials and select OAuth client ID. Select web application, give it a name, and add the URL to dex in the Authorized JavaScript origins field, e.g. dex.demo.elastisys.se. Add <dex url>/callback to the Authorized redirect URIs field, e.g. dex.demo.elastisys.se/callback.

  4. Configure the following options in CK8S_CONFIG_PATH/secrets.yaml

      dex:
        googleClientID:
        googleClientSecret:

Known issues

  • OpenSearch Dashboards Single Sign On (SSO) via OpenID/Dex requires LetsEncrypt Production.

For more, please check the public GitHub issues: https://github.com/elastisys/compliantkubernetes-apps/issues.

compliantkubernetes-apps's People

Contributors

aarnq, ajarmar, albinb97, anders-elastisys, ayoubeddafali, cristiklein, crssnd, danielharr, davidumea, elias-elastisys, eliastisys, ewnetu, geoff-elastisys, jakubkrzywda, linus-astrom, lucianvlad, lunkan93, mrombout, ollelarsson, pavan-gunda, pettersv, raviranjanelastisys, robinawallace, robinelastisys, salehsedghpour, simonklb, viktor-f, vomba, xartos, zash


compliantkubernetes-apps's Issues

Init overwrites config

Running init (again) will overwrite many config settings that you have set. It will more or less reset them to the defaults. There is no warning either.

Init should not overwrite existing config. If possible it should try to merge the defaults with already existing config and tell you when there are conflicts that would cause overwrites.

Opendistro prod flavor typo

Opendistro has an error in the production flavor that causes it to fail to install. The memory limit is set to 3GiMi, which should be changed to 3Gi.

User RBAC tests fail due to quotes

The following tests give false negatives (i.e. they fail even though everything is fine) if the admin user in the config is quoted:

  • get node not authorized ❌
  • get namespace not authorized ❌
  • create deployments in production not authorized ❌
  • delete deployments in production not authorized ❌
  • create deployments in staging not authorized ❌
  • delete deployments in staging not authorized ❌
  • patch configmaps/fluentd-extra-config in fluentd not authorized ❌
  • patch configmaps/fluentd-extra-plugins in fluentd not authorized ❌

The tests fail if the user is for example "[email protected]" but not for [email protected]. The issue is most likely due to the parsing of the namespaces and users in this script.

This may not be critical enough to require for v0.7.0, but if we decide to skip it, we should make sure that this known issue is documented!

Upgrade Harbor chart

In order to configure trust for Harbor to dex, this value is required.
The value allows a CA bundle to be injected into trust stores.

[3] Investigation: Move some compliantkubernetes-apps components to Kubernetes addons

kubespray advocates using addons for certain "system" Pods. See full list here.

This task consists of figuring out what is better to keep in compliantkubernetes-apps and what should be moved "down" into kubespray.

Acceptance criteria:

  • The pros and cons of Kubernetes addons vs. Helm Charts is known.
  • We have a list of candidate components to move as Kubernetes addons.

Config validation doesn't check what version is actually used

Validation of the config doesn't really check what version you use. Instead it just picks the latest tag, no matter how far back in history that is. This is very dangerous, since it makes it look like someone running the latest bleeding-edge non-release is actually on a supported tag in the config.

It would be better to clearly warn the user when not running a supported release. It must still be possible to run non-tagged versions, of course, for development. In these cases the version could be set to the current commit SHA or simply dev.

Bonus: Don't just check what commit is checked out, but also whether the directory is "clean" (i.e. whether there are any non-committed changes). If there are, add +dirty to the version to indicate this.

Suggested fix: If no tag is available, set the version to dev and print a warning for the user.

Update script shebang to allow user to define a custom bash

When running the scripts, the binary /bin/bash is used in some places. If the user wants to use another installed bash version (e.g. if there are multiple installed versions), this should be allowed, so that the first bash executable in the user's $PATH is used and not a hard-coded one. This is a big problem if the user is running macOS, OpenBSD or some Linux distros.

The scripts are also not consistent since some of them use #!/bin/bash and some #!/usr/bin/env bash.

Solution:

  • Change all shebangs from #!/bin/bash to #!/usr/bin/env bash

Something like this would fix it: COMMIT
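
A possible one-liner for the change (assuming GNU sed and that the scripts are tracked in git):

git grep -l '^#!/bin/bash' -- '*.sh' | xargs sed -i '1s|^#!/bin/bash|#!/usr/bin/env bash|'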

[5] multi-tenancy: config path assuming exactly 2 clusters

Our current config/state setup assumes that we are always using exactly 2 clusters (service, workload).

If we are going to support multiple workload clusters per service cluster we need to restructure our scripts to handle files (config files, kubeconfigs, infra.json, etc) from multiple workload clusters in the same config path.

We must discuss how to best set this up.

Replace deprecated stable/elasticsearch-exporter helm chart

It seems as if the stable/elasticsearch-exporter helm chart does not have a new home.
I think it makes sense to tackle this issue by bundling it in the opendistro helm chart, and I am also slightly positive that this is something that we can upstream.

WIP - need to add action points

Rework version validation

./bin/ck8s performs version validation as follows:

  1. It retrieves the script's version by performing git describe --tags --abbrev=0 HEAD | sed 's/^v//'
  2. It retrieves the configuration version by looking at global.ck8sVersion
  3. It mandates that the two be equal.

This is problematic for two reasons:

  1. compliantkubernetes-apps might be checked out shallow or might be included as a submodule, hence its own version may be unreliable.
  2. Version mismatch should be allowed according to Semantic Versioning rules.

Multi-tenancy (multiple workload clusters of same user sharing service cluster)

Acceptance criteria:

  • Can create a new workload cluster and connect that with an existing service cluster
  • Can have more than one workload cluster sharing the same service cluster.
  • No hard multi-user multi-tenancy, but assume that the same identity provider is used for all clusters.

[x] #62

Foreseen updates (TODO: create more issues from this list):

Config
[x] #85

Service cluster:

Deprecated: [x] Cluster installation: elastisys/ck8s-cluster#102

Harbor
Some issues related to root vs. less privileged credentials
Some credential transfer issues (cannot pull from the Harbor instance by default?) ?
Multiple repos in a Harbor instance, and default config (of kubeadm)? - need some additional config here in the general case (nothing multi-tenant specific)

Elastic
Some prefixing of all indices, etc. to distinguish clusters
Alternatively, add cluster-name in all variables (push logs to same index, but label … )

Grafana
Using pre-baked dashboards from operator (not sure if these support multi-cluster out of the box)
We did something similar before, rather straightforward, using one data-source per cluster??
Prefixing?
Number of workload clusters hardcoded to 1?

InfluxDB (or other Grafana backing storage...)
Splitting per Prometheus instance, likely through prefixing in the same way as done for Elastic

Dex
No changes needed?

Workload cluster:
Fluentd
What endpoint to forward logs to?
Add some tags (include cluster-id as label)
Some credentials and endpoints

[2] Upgrade to kubernetes v1.19

With 1.20 coming up we should support 1.19

Maybe we already do. So this task will be to investigate what needs to be changed to support 1.19 and create new issues if there are any bigger changes that are needed.

Relevant info:
https://kubernetes.io/docs/setup/release/notes/
(might be this link when 1.20 is released) https://v1-19.docs.kubernetes.io/docs/setup/release/notes/

Acceptance criteria

  • Test to deploy onto a kubernetes 1.19 cluster
  • Fix the things that don't work
  • Create issues if there are any bigger changes needed

Lists in default config can cause validation errors

We do some validation of the user's config by checking that all config options are present in the user config files.

However, if we have lists in the default config, then the validation will fail if the user has fewer items in their list.
E.g. if we have this default config:

dex:
  allowedDomains:
    - example.com
    - elastisys.com

And the user has this config:

dex:
  allowedDomains:
    - elastisys.com

Then they will get this error [ck8s] ERROR: dex.allowedDomains.[1] is not set in <config-path>/sc-config.yaml, which is not something we want.

I'm not sure how to best solve this, and it becomes more complex if we ever want lists with more complex data that we actually want to validate. A first step would perhaps be to skip validation of anything inside a list, but check that the list key exists (allowedDomains in the example above).

Opsgenie heartbeat configuration inconsistency

In sc-config.yaml under alerts.opsGenieHeartbeat there are both enable and enabled options. When running with the production flavor one is set to true and the other to false.

There should only be one option for enabling the heartbeat and it should be enabled by default for production.

Broken links and missing documentation

There are broken links to the user and operator access documentation here. These should probably be replaced with links to the relevant sections of https://compliantkubernetes.io.

The old on-boarding documentation is also missing which means that there is no guide for how to finish an environment and hand over user credentials. This includes at least the following:

  • Create a kibana user with proper permissions
  • Generate a user kubeconfig
  • Instructions for how to give the user more privileges in grafana (ask them to log in first, then change the role once they appear in the user list)

[2] Upgrade to kubernetes v1.18

With 1.20 coming up we should support 1.18

Maybe we already do. So this task will be to investigate what needs to be changed to support 1.18 and create new issues if there are any bigger changes that are needed.

Relevant info:
https://v1-18.docs.kubernetes.io/docs/setup/release/notes/

Acceptance criteria

  • Test to deploy onto a kubernetes 1.18 cluster
  • Fix the things that don't work
  • Create issues if there are any bigger changes needed

Make config validation smarter

The current validation requires all configuration parameters to be present in the configuration file.
Consider the scenario where a user has set opa.enabled: false and harbor.enabled: false; our current implementation still requires all available variables for harbor and opa to be in the user's configuration file.

I would suggest that the validation is made smarter, so that it skips additional checks if the enabled flags for applications are set to false.

[5] Expose InfluxDB and write to it from the workload cluster

To remove the service cluster's dependency on the workload cluster, one proposed solution is to push metrics from the workload cluster directly to InfluxDB.

  • Expose InfluxDB.
  • Create write-only user for the workload cluster's prometheus.
  • Update prometheus-wc-scraper to only read from InfluxDB, remove federation config.

Relax configuration validation

To allow for better configuration options we need to temporarily relax our current configuration validation until we have time to improve it.
One solution could be to warn the user that there might be something wrong in their configuration and give them the option to abort and double-check their configuration.

Investigation: Speed up ./bin/ck8s apply wc|sc

Helmfile is currently configured to pull Helm repositories, which:
(a) takes a lot of time
(b) makes the script less robust (relies on external Helm chart repos)

This task consists of investigating whether it is possible to include the Helm charts as git submodules, or alternatively copy them into compliantkubernetes-apps.

Acceptance criteria:

  • We have a solution for including Helm charts in compliantkubernetes-apps
  • We can demonstrate a time reduction (e.g., ./bin/ck8s apply takes less time after the change than before)

Investigate: Separate service cluster from workload cluster

In order to achieve workload multi-tenancy (i.e., several workload clusters, one service cluster) and to simplify setting up a Compliant Kubernetes cluster, it should be possible to completely set up a service cluster without knowing anything about the workload cluster. Currently, the service cluster pulls metrics from the workload cluster.

This task consists of identifying all blockers to making the service cluster independent from the workload cluster.

Acceptance criteria:

  • We know all references to the workload cluster from the service cluster.
  • We identified what needs to be done to make the service cluster independent.
  • We maintain the status of "tamper-proof logging environment" of the service cluster.

Standalone cluster for development

It would be very nice to be able to create a cluster for development purposes with all the correct policies and restrictions that come with a full-blown ck8s environment. This would be useful not just for internal development but also for customers that want development or test environments that mimic ck8s. Additionally, it would be a nice proof that ck8s-cluster works as a standalone project/product!

I have come up with the following things that would be needed for this:

  • Get rid of DNS, it is not useful for ck8s-cluster on its own. elastisys/ck8s-cluster-old#76
  • Get rid of S3 buckets, they are not useful for a bare cluster. elastisys/ck8s-cluster-old#75
  • Fixed by kubespray: Move Gatekeeper to ck8s-cluster, or create instructions for how to set it up here manually.
  • Configure a "customer user" with limited privileges and create a kubeconfig for this user?
  • Fixed by kubespray: Move cert-manager to ck8s-cluster? It is a pretty fundamental application, maybe it could be considered part of the cluster?
  • Prometheus? It may be needed to test and develop prometheus metrics and alerts, how can we support this?

Remove ops grafana oidc configuration

At the moment, the ops grafana instance is configured to enable login via oidc.
However, this is not configured properly in dex.

Until we have a decision/solution for how to handle oidc for the ops grafana instance, the broken oidc configuration in grafana should be removed.

Support GCS as object storage provider

Part of elastisys/ck8s-cluster#100

To be able to support GCP, we need to be able to use a different object storage provider than S3, namely GCS.

Acceptance criteria

  • Make it possible to choose which object storage provider to use
  • Make sure that all apps support GCS
  • Configure the apps correctly depending on which object storage provider you choose

user-alertmanager helm chart is not found

When user.alertmanager.enabled: true it fails to install the alertmanager helm chart.

STDERR:
  Error: Failed to render chart: exit status 1: Error: failed to download "charts/examples/user-alertmanager" at version "0.1.0" (hint: running `helm repo update` may help)
  Error: plugin "diff" exited with error

COMBINED OUTPUT:
  ********************
  	Release was not present in Helm.  Diff will show entire contents as new.
  ********************
  Error: Failed to render chart: exit status 1: Error: failed to download "charts/examples/user-alertmanager" at version "0.1.0" (hint: running `helm repo update` may help)
  Error: plugin "diff" exited with error

Reason: it tries to find charts/examples/user-alertmanager but it is actually called charts/examples/customer-alertmanager

Create sample roles: developer & admin

Create two sample role bindings, corresponding to typical access for a developer (may deploy to the workload cluster) and an admin (full access to the workload cluster and service cluster, including Harbor, Kibana, etc.). These roles are for illustration purposes only, and are not prescriptive for installation and configuration of Compliant Kubernetes.

Depends on https://github.com/elastisys/ck8s/issues/444

Upgrade cert-manager

We are currently running v0.14.1 of cert-manager while v1.1.0 is already available at the time of writing this. Needless to say, it would be great for us to be on the GA v1 release and of course to have the latest and greatest.

Cert-manager has excellent upgrade documentation, so please check it for details. We will probably need to upgrade one minor version at a time unless we want breaking changes, so this issue could be split into one issue per upgrade if we want to.
