Bottlerocket Update Operator

The Bottlerocket update operator (or, for short, Brupop) is a Kubernetes operator that coordinates Bottlerocket updates on hosts in a cluster. When installed, the Bottlerocket update operator starts a controller deployment on one node, an agent daemon set on every Bottlerocket node, and an Update Operator API Server deployment. The controller orchestrates updates across your cluster, while the agent is responsible for periodically querying for Bottlerocket updates, draining the node, and performing the update when asked by the controller. The agent performs all cluster object mutation operations via the API Server, which performs additional authorization using the Kubernetes TokenReview API -- ensuring that any request associated with a node is being made by the agent pod running on that node. Further, cert-manager is required in order for the API server to use a CA certificate to communicate over SSL with the agents. Updates to Bottlerocket are rolled out in waves to reduce the impact of issues; the nodes in your cluster may not all see updates at the same time.
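
Once installed (see the instructions below), you can list these components with a single command; this assumes the default brupop-bottlerocket-aws namespace used throughout this README:

kubectl get deployments,daemonsets --namespace brupop-bottlerocket-aws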

For a deep dive on installing Brupop, how it works, and its integration with Bottlerocket, check out this design deep dive document!

Getting Started

Version Support Policy

Brupop follows semantic versioning (semver): updates within a minor version do not introduce breaking or backward-incompatible changes. However, we only provide security patches for the latest minor version, so we highly recommend keeping your Brupop installation up to date with the latest available version.

For example: if v1.3.0 is the latest Brupop release, then v1.3 (the latest minor version) is supported and v1.3.0 (the latest available version) is the recommended version to install. When v1.3.1 is released, that version becomes the recommended one.

Installation

  1. Brupop uses cert-manager to manage self-signed certificates used by the Bottlerocket update operator. To install cert-manager:
kubectl apply -f \
  https://github.com/cert-manager/cert-manager/releases/download/v1.8.2/cert-manager.yaml

Or if you're using helm:

# Add the cert-manager helm chart
helm repo add jetstack https://charts.jetstack.io

# Update your local chart cache with the latest
helm repo update

# Install the cert-manager (including its CRDs)
helm install \
  cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --create-namespace \
  --version v1.8.2 \
  --set installCRDs=true
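
Optionally, you can confirm that the cert-manager pods are running before continuing:

kubectl get pods --namespace cert-manager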
  2. The Bottlerocket update operator can then be installed using helm:
# Add the bottlerocket-update-operator chart
helm repo add brupop https://bottlerocket-os.github.io/bottlerocket-update-operator

# Update your local chart cache with the latest updates
helm repo update

# Create a namespace
kubectl create namespace brupop-bottlerocket-aws

# Install the brupop CRD
helm install brupop-crd brupop/bottlerocket-shadow

# Install the brupop operator
helm install brupop-operator brupop/bottlerocket-update-operator

This will create the custom resource definition, roles, deployments, etc., and use the latest update operator image available in Amazon ECR Public.

Alternatively, you can use the pre-baked manifest, with all the default values, found at the root of this repository (named bottlerocket-update-operator.yaml). This YAML manifest file is also attached to each release found in the GitHub releases page.

Note: The generated manifest points to the latest version of the Update Operator, v1.1.0. Be sure to use the manifest from the Update Operator release that you plan to deploy.

  3. Label nodes with bottlerocket.aws/updater-interface-version=2.0.0 to indicate they should be automatically updated. Only Bottlerocket nodes with this label will be updated. For more information on labeling, refer to the Label nodes section of this readme.
kubectl label node MY_NODE_NAME bottlerocket.aws/updater-interface-version=2.0.0
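
After labeling a node, you can verify that an agent pod has been scheduled onto it (assuming the default namespace):

kubectl get pods --selector=brupop.bottlerocket.aws/component=agent -o wide --namespace brupop-bottlerocket-aws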

Configuration

Configure via Helm values yaml file

You can use a values file when installing brupop with helm (via the --values / -f flag) to configure how brupop functions:

# Default values for bottlerocket-update-operator.

# The namespace to deploy the update operator into
namespace: "brupop-bottlerocket-aws"

# The image to use for brupop
# This defaults to the image built alongside a particular helm chart, but you can override it by uncommenting this line:
# image: "public.ecr.aws/bottlerocket/bottlerocket-update-operator:v1.4.0"

# Placement controls
# See the Kubernetes documentation about placement controls for more details:
# * https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/
# * https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#nodeselector
# * https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#affinity-and-anti-affinity
placement:
  agent:
    # The agent is a daemonset, so the only controls that apply to it are tolerations.
    tolerations: []

  controller:
    tolerations: []
    nodeSelector: {}
    podAffinity: {}
    podAntiAffinity: {}

  apiserver:
    tolerations: []
    nodeSelector: {}
    podAffinity: {}
    # By default, apiserver pods prefer not to be scheduled to the same node.
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 1
          podAffinityTerm:
            labelSelector:
              matchExpressions:
              - key: brupop.bottlerocket.aws/component
                operator: In
                values:
                - apiserver
            topologyKey: kubernetes.io/hostname

# If testing against a private image registry, you can set the pull secret to fetch images.
# This can likely remain as `brupop` so long as you run something like the following:
# kubectl create secret docker-registry brupop \
#  --docker-server 109276217309.dkr.ecr.us-west-2.amazonaws.com \
#  --docker-username=AWS \
#  --docker-password=$(aws --region us-west-2 ecr get-login-password) \
#  --namespace=brupop-bottlerocket-aws
#image_pull_secrets: |-
#  - name: "brupop"

# External load balancer setting.
# When `exclude_from_lb_wait_time_in_sec` is set to a positive value
# brupop will exclude the node from load balancing
# and will wait for `exclude_from_lb_wait_time_in_sec` seconds before draining nodes.
# Under the hood, this uses the `node.kubernetes.io/exclude-from-external-load-balancers` label
# to exclude those nodes from load balancing.
exclude_from_lb_wait_time_in_sec: "0"

# Concurrent update nodes setting.
# When `max_concurrent_updates` is set to a positive integer value,
# brupop will concurrently update max `max_concurrent_updates` nodes.
# When `max_concurrent_updates` is set to "unlimited",
# brupop will concurrently update all nodes while respecting `PodDisruptionBudgets`.
# Note: the "unlimited" option does not work well with the `exclude_from_lb_wait_time_in_sec`
# option, which could potentially exclude all nodes from the load balancer at the same time.
max_concurrent_updates: "1"

# DEPRECATED: use the scheduler settings
# Start and stop times for update window
# Brupop will only operate node updates within the update time window.
# When you set the time window start and stop times, use UTC (24-hour time notation).
update_window_start: "0:0:0"
update_window_stop: "0:0:0"

# Scheduler setting
# Brupop will operate node updates on scheduled maintenance windows by using cron expressions.
# When you set up the scheduler, you should follow cron expression rules.
# ┌───────────── seconds (0 - 59)
# │ ┌───────────── minute (0 - 59)
# │ │ ┌───────────── hour (0 - 23)
# │ │ │ ┌───────────── day of the month (1 - 31)
# │ │ │ │ ┌───────────── month (Jan, Feb, Mar, Apr, May, Jun, Jul, Aug, Sep, Oct, Nov, Dec)
# │ │ │ │ │ ┌───────────── day of the week (Mon, Tue, Wed, Thu, Fri, Sat, Sun)
# │ │ │ │ │ │ ┌───────────── year (formatted as YYYY)
# │ │ │ │ │ │ │
# │ │ │ │ │ │ │
# * * * * * * *
scheduler_cron_expression: "* * * * * * *"

# API server ports
# The port the API server uses for its own operations. This is accessed by the controller,
# the bottlerocket-shadow daemonset, etc.
apiserver_internal_port: "8443"
# The port of the API server's Kubernetes Service, where the CRD version conversion webhook is served
apiserver_service_port: "443"

logging:
  # Formatter for the logs emitted by brupop.
  # Options are:
  # * full - Human-readable, single-line logs
  # * compact - A variant of full optimized for shorter line lengths
  # * pretty - "Excessively pretty" logs optimized for human-readable terminal output.
  # * json - Newline-delimited JSON-formatted logs.
  formatter: "pretty"
  # Whether or not to enable ANSI colors on log messages.
  # Makes the output "pretty" in terminals, but may add noise to web-based log utilities.
  ansi_enabled: "true"

  # Controls the filter for tracing/log messages.
  # This can be as simple as a log-level (e.g. "info", "debug", "error"), but also supports more complex directives.
  # See https://docs.rs/tracing-subscriber/0.3.17/tracing_subscriber/filter/struct.EnvFilter.html#directives
  controller:
    tracing_filter: "info"
  agent:
    tracing_filter: "info"
  apiserver:
    tracing_filter: "info"

# Provide pod level labels for the brupop resources.
podLabels: {}
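
For instance, assuming you have saved your overrides to a file named my-values.yaml (the file name is just an example), you could apply them with helm:

helm upgrade --install brupop-operator brupop/bottlerocket-update-operator \
  --namespace brupop-bottlerocket-aws \
  --values my-values.yaml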

Configuration via Kubernetes yaml

Configure API server ports

If you'd like to configure what ports the API server uses, adjust the value that is consumed in the container environment:

      ...
      containers:
        - command:
            - "./api-server"
          env:
            - name: APISERVER_INTERNAL_PORT
              value: "999"

You'll then also need to adjust the various "port" entries in the YAML manifest to correctly reflect what port the API server starts on and expects for its service port:

    ...
    webhook:
      clientConfig:
        service:
          name: brupop-apiserver
          namespace: brupop-bottlerocket-aws
          path: /crdconvert
          port: 123

The default values are generated from the default values yaml file:

  • apiserver_internal_port: "8443" - This is the container port the API server starts on. If this environment variable is not found, the Brupop API server will fail to start.
  • apiserver_service_port: "443" - This is the port Brupop's Kubernetes Service uses to target the internal API Server port. It is used by the node agents to access the API server. If this environment variable is not found, the Brupop agents will fail to start.
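
If you change these ports, you can check that the service and container ports line up by inspecting the API server service (the service name below assumes the default manifest):

kubectl get service brupop-apiserver --namespace brupop-bottlerocket-aws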

Resource Requests & Limits

The bottlerocket-update-operator.yaml manifest makes several default recommendations for Kubernetes resource requests and limits. In general, the update operator and its components are lightweight and shouldn't consume more than 10m CPU (roughly 1/100th of a CPU core) and 50Mi of memory (roughly 0.05 GB). If the memory limit is breached, Kubernetes will restart the offending container.

Note that your mileage with these resource requests and limits may vary. Any number of factors may contribute to varying results in resource utilization (different compute instance types, workload utilization, API ingress/egress, etc). The Kubernetes documentation for Resource Management of Pods and Containers is an excellent resource for understanding how various compute resources are utilized and how Kubernetes manages these resources.

If resource utilization by the brupop components is not a concern, removing the resources fields in the manifest will not affect the functionality of any components.
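
For reference, a container resources stanza matching the defaults described above would look roughly like this (a sketch only; tune the values for your own cluster):

          resources:
            requests:
              cpu: 10m
              memory: 50Mi
            limits:
              memory: 50Mi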

Exclude Nodes from Load Balancers Before Draining

This configuration uses the Kubernetes ServiceNodeExclusion feature. EXCLUDE_FROM_LB_WAIT_TIME_IN_SEC can be used to exclude a node from load balancers before draining it. When EXCLUDE_FROM_LB_WAIT_TIME_IN_SEC is 0 (the default), the feature is disabled. When it is set to a positive integer, the Bottlerocket update operator will exclude the node from load balancers and then wait EXCLUDE_FROM_LB_WAIT_TIME_IN_SEC seconds before draining the pods on the node.

To enable this feature, set the exclude_from_lb_wait_time_in_sec value in your helm values yaml file to a positive integer. For example, exclude_from_lb_wait_time_in_sec: "100".

Otherwise, go to bottlerocket-update-operator.yaml and change EXCLUDE_FROM_LB_WAIT_TIME_IN_SEC to a positive integer value. For example:

      ...
      containers:
        - command:
            - "./agent"
          env:
            - name: MY_NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            - name: EXCLUDE_FROM_LB_WAIT_TIME_IN_SEC
              value: "180"
      ...

Set Up Max Concurrent Update

MAX_CONCURRENT_UPDATE can be used to specify the maximum number of concurrent node updates. When MAX_CONCURRENT_UPDATE is a positive integer, the Bottlerocket update operator will concurrently update up to MAX_CONCURRENT_UPDATE nodes while respecting PodDisruptionBudgets. When MAX_CONCURRENT_UPDATE is set to unlimited, the Bottlerocket update operator will concurrently update all nodes while respecting PodDisruptionBudgets.

Note: The MAX_CONCURRENT_UPDATE configuration does not work well with the EXCLUDE_FROM_LB_WAIT_TIME_IN_SEC configuration. In particular, when MAX_CONCURRENT_UPDATE is set to unlimited, all nodes could be excluded from load balancers at the same time.

To enable this feature, set the max_concurrent_updates value in your helm values yaml file to a positive integer value or unlimited. For example, max_concurrent_updates: "1" or max_concurrent_updates: "unlimited".

Otherwise, go to bottlerocket-update-operator.yaml and change MAX_CONCURRENT_UPDATE to a positive integer value or unlimited. For example:

      containers:
        - command:
            - "./controller"
          env:
            - name: MY_NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            - name: MAX_CONCURRENT_UPDATE
              value: "1"

Set scheduler

SCHEDULER_CRON_EXPRESSION can be used to specify a schedule during which updates are permitted. When SCHEDULER_CRON_EXPRESSION is "* * * * * * *" (the default), the feature is disabled.

To enable this feature, set the scheduler_cron_expression value in your helm values yaml file to a valid cron expression. A cron expression can describe either a time window or a specific trigger time. When the expression describes a time window, the Bottlerocket update operator will perform node updates within that window. When the expression describes a specific trigger time, brupop will update all waitingForUpdate nodes in the cluster when that time occurs.

# ┌───────────── seconds (0 - 59)
# │ ┌───────────── minute (0 - 59)
# │ │ ┌───────────── hour (0 - 23)
# │ │ │ ┌───────────── day of the month (1 - 31)
# │ │ │ │ ┌───────────── month (Jan, Feb, Mar, Apr, May, Jun, Jul, Aug, Sep, Oct, Nov, Dec)
# │ │ │ │ │ ┌───────────── day of the week (Mon, Tue, Wed, Thu, Fri, Sat, Sun)
# │ │ │ │ │ │ ┌───────────── year (formatted as YYYY)
# │ │ │ │ │ │ │
# │ │ │ │ │ │ │
# * * * * * * *

Note: brupop uses Coordinated Universal Time (UTC), so please convert your local time to UTC. A tool such as Time Zone Converter can help you find your desired time window in UTC. For example (scheduling the update operator to run at 03:00 PM UTC on Mondays):

      containers:
        - command:
            - "./controller"
          env:
            - name: MY_NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            - name: SCHEDULER_CRON_EXPRESSION
              value: "* * * * * * *"

Set an Update Time Window - DEPRECATED

Note: these settings are deprecated and will be removed in a future release. Time window settings cannot be used in combination with the preferred cron expression format and will be ignored.

If you still decide to use these settings, please use "hour:00:00" format only instead of "HH:MM:SS".

UPDATE_WINDOW_START and UPDATE_WINDOW_STOP can be used to specify the time window in which updates are permitted.

To enable this feature, set the update_window_start and update_window_stop values in your helm values yaml file to an hour:minute:second formatted value (UTC, 24-hour time notation). For example: update_window_start: "08:0:0" and update_window_stop: "12:30:0".

Otherwise, go to bottlerocket-update-operator.yaml and change UPDATE_WINDOW_START and UPDATE_WINDOW_STOP to an hour:minute:second formatted value (UTC, 24-hour time notation).

Note that UPDATE_WINDOW_START is inclusive and UPDATE_WINDOW_STOP is exclusive.

Note: brupop uses UTC (24-hour time notation), so please convert your local time to UTC. For example:

      containers:
        - command:
            - "./controller"
          env:
            - name: MY_NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            - name: UPDATE_WINDOW_START
              value: "09:00:00"
            - name: UPDATE_WINDOW_STOP
              value: "21:00:00"

Label nodes

By default, each Workload resource constrains scheduling of the update operator by limiting pods to Bottlerocket nodes based on their labels. These labels are not applied to nodes automatically and need to be set on each desired node using kubectl. The agent relies on each node's updater components and schedules its pods based on the updater interface version those components support. A node indicates its updater interface version in a label called bottlerocket.aws/updater-interface-version. Agent deployments are scheduled using this label, and each deployment targets only a single interface version.

For versions > 0.2.0 of the Bottlerocket update operator, only updater-interface-version 2.0.0 is supported, which uses Bottlerocket's update API to dispatch updates. For this reason, only Bottlerocket OS versions > v0.4.1 are supported.

For the 2.0.0 updater-interface-version, this label looks like:

bottlerocket.aws/updater-interface-version=2.0.0

With kubectl configured for the desired cluster, you can use the below command to get all nodes:

kubectl get nodes

Make a note of all the node names that you would like the Bottlerocket update operator to manage.

Next, add the updater-interface-version label to the nodes. For each node, use this command, making sure to replace NODE_NAME with a name collected from the previous command:

kubectl label node NODE_NAME bottlerocket.aws/updater-interface-version=2.0.0

If all nodes in the cluster are running Bottlerocket and require the same updater-interface-version, you can label all at the same time by running this:

kubectl label node $(kubectl get nodes -o jsonpath='{.items[*].metadata.name}') bottlerocket.aws/updater-interface-version=2.0.0
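
You can confirm the labels were applied by listing nodes with the label shown as a column:

kubectl get nodes -L bottlerocket.aws/updater-interface-version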

Automatic labeling via Bottlerocket user-data

You can automatically add Kubernetes labels to your Bottlerocket nodes via the settings provided in user data when your nodes are provisioned:

# Configure the node-labels Bottlerocket setting
[settings.kubernetes.node-labels]
"bottlerocket.aws/updater-interface-version" = "2.0.0"

Automatic labeling via eksctl

If you're using eksctl, you can automatically add labels to nodegroups. For example:

---
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: bottlerocket-cluster
  region: us-west-2
  version: '1.17'

nodeGroups:
  - name: ng-bottlerocket
    labels: { bottlerocket.aws/updater-interface-version: 2.0.0 }
    instanceType: m5.large
    desiredCapacity: 3
    amiFamily: Bottlerocket

Uninstalling

If you remove the bottlerocket.aws/updater-interface-version=2.0.0 label from a node, the Brupop controller will remove the resources for that node (including the BottlerocketShadow custom resource associated with that node).

Deleting the custom resources, definitions, and deployments will then fully remove the Bottlerocket update operator from your cluster:

helm uninstall brupop-operator
helm uninstall brupop-crd
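
If you also want to remove the namespace created during installation, you can delete it afterwards:

kubectl delete namespace brupop-bottlerocket-aws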

Operation

Overview

The update operator controller and agent processes communicate using Kubernetes Custom Resources, with one being created for each node managed by the operator. The Custom Resource created by the update operator is called a BottlerocketShadow resource, or otherwise shortened to brs. The Custom Resource's Spec is configured by the controller to indicate a desired state, which guides the agent components. The update operator's agent component keeps the Custom Resource Status updated with the current state of the node. More about Spec and Status can be found in the Kubernetes documentation.
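
To inspect a node's full desired state (Spec) and current state (Status), you can fetch the resources as YAML (the namespace assumes the default installation):

kubectl get bottlerocketshadows --namespace brupop-bottlerocket-aws --output yaml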

Additionally, the update operator's controller and apiserver components expose metrics which can be configured to be collected by Prometheus.

Observing State

Monitoring Custom Resources

The current state of the cluster from the perspective of the update operator can be summarized by querying the Kubernetes API for BottlerocketShadow objects. This view will inform you of the current Bottlerocket version of each node managed by the update operator, as well as the current ongoing update status of any node with an update in-progress. The following command requires kubectl to be configured for the desired cluster to be monitored:

kubectl get bottlerocketshadows --namespace brupop-bottlerocket-aws 

You can shorten this with:

kubectl get brs --namespace brupop-bottlerocket-aws 

You should see output akin to the following:

$ kubectl get brs --namespace brupop-bottlerocket-aws 
NAME                                               STATE   VERSION   TARGET STATE   TARGET VERSION
brs-node-1                                         Idle    1.5.2     Idle           
brs-node-2                                         Idle    1.5.1     StagedUpdate   1.5.2

Monitoring Cluster History and Metrics with Prometheus

The update operator provides metrics endpoints which can be scraped by Prometheus. This allows you to monitor the history of update operations using popular metrics analysis and visualization tools.

We provide a sample configuration which demonstrates a Prometheus deployment into the cluster that is configured to gather metrics data from the update operator.

To deploy the sample configuration, you can use kubectl:

kubectl apply -f ./deploy/telemetry/prometheus-resources.yaml

Now that Prometheus is running in the cluster, you can use the UI provided to visualize the cluster's history. Get the Prometheus pod name (e.g. prometheus-deployment-5554fd6fb5-8rm25):

kubectl get pods --namespace brupop-bottlerocket-aws 

Set up port forwarding to access Prometheus on the cluster:

kubectl port-forward PROMETHEUS_POD_NAME 9090:9090 --namespace brupop-bottlerocket-aws

Point your browser to localhost:9090/graph to access the sample Prometheus UI.

Search for:

  • brupop_hosts_state to check how many hosts are in each state.
  • brupop_hosts_version to check how many hosts are in each Bottlerocket version.

Image Region

bottlerocket-update-operator.yaml pulls operator images from Amazon ECR Public. You may also choose to pull from regional Amazon ECR repositories such as the following.

  • 917644944286.dkr.ecr.af-south-1.amazonaws.com
  • 375569722642.dkr.ecr.ap-east-1.amazonaws.com
  • 328549459982.dkr.ecr.ap-northeast-1.amazonaws.com
  • 328549459982.dkr.ecr.ap-northeast-2.amazonaws.com
  • 328549459982.dkr.ecr.ap-northeast-3.amazonaws.com
  • 328549459982.dkr.ecr.ap-south-1.amazonaws.com
  • 328549459982.dkr.ecr.ap-southeast-1.amazonaws.com
  • 328549459982.dkr.ecr.ap-southeast-2.amazonaws.com
  • 386774335080.dkr.ecr.ap-southeast-3.amazonaws.com
  • 328549459982.dkr.ecr.ca-central-1.amazonaws.com
  • 328549459982.dkr.ecr.eu-central-1.amazonaws.com
  • 328549459982.dkr.ecr.eu-north-1.amazonaws.com
  • 586180183710.dkr.ecr.eu-south-1.amazonaws.com
  • 328549459982.dkr.ecr.eu-west-1.amazonaws.com
  • 328549459982.dkr.ecr.eu-west-2.amazonaws.com
  • 328549459982.dkr.ecr.eu-west-3.amazonaws.com
  • 553577323255.dkr.ecr.me-central-1.amazonaws.com
  • 509306038620.dkr.ecr.me-south-1.amazonaws.com
  • 328549459982.dkr.ecr.sa-east-1.amazonaws.com
  • 328549459982.dkr.ecr.us-east-1.amazonaws.com
  • 328549459982.dkr.ecr.us-east-2.amazonaws.com
  • 328549459982.dkr.ecr.us-west-1.amazonaws.com
  • 328549459982.dkr.ecr.us-west-2.amazonaws.com
  • 388230364387.dkr.ecr.us-gov-east-1.amazonaws.com
  • 347163068887.dkr.ecr.us-gov-west-1.amazonaws.com
  • 183470599484.dkr.ecr.cn-north-1.amazonaws.com.cn
  • 183901325759.dkr.ecr.cn-northwest-1.amazonaws.com.cn

Example regional image URI:

328549459982.dkr.ecr.us-west-2.amazonaws.com/bottlerocket-update-operator:v1.1.0

Current Limitations

Troubleshooting

When installed with the default deployment, the logs can be fetched through Kubernetes deployment logs. Because mutations to a node are orchestrated through the API server component, searching those deployment logs for a node ID can be useful. To get logs for the API server, run the following:

kubectl logs deployment/brupop-apiserver --namespace brupop-bottlerocket-aws 

The controller logs will usually not help troubleshoot issues about the state of updates in a cluster, but they can similarly be fetched:

kubectl logs deployment/brupop-controller-deployment --namespace brupop-bottlerocket-aws 

Why are updates stuck in my cluster?

By default, the Bottlerocket update operator only installs updates on one node at a time. If a node's update becomes stuck, it can prevent the operator from proceeding with updates across the cluster.

The update operator uses the Kubernetes Eviction API to safely drain pods from a node. The eviction API will respect PodDisruptionBudgets, refusing to remove a pod from a node if it would cause a PDB not to be satisfied. It is possible to mistakenly configure a Kubernetes cluster in such a way that a Pod can never be deleted while still maintaining the conditions of a PDB. In this case, the operator may become stuck waiting for the PDB to allow an eviction to proceed.
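
If you suspect a PodDisruptionBudget is blocking eviction, listing PDBs and their allowed disruptions can help narrow it down:

kubectl get poddisruptionbudgets --all-namespaces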

Similarly, if the node in question repeatedly encounters issues while updating, it may cause updates across the cluster to become stuck. Such issues can be investigated by requesting the update operator agent logs from the node. First, list the agent pods and select the pod residing on the node in question:

kubectl get pods --selector=brupop.bottlerocket.aws/component=agent -o wide --namespace brupop-bottlerocket-aws

Then fetch the logs for that agent:

kubectl logs brupop-agent-podname --namespace brupop-bottlerocket-aws 

Why are my bottlerocket nodes egressing to https://updates.bottlerocket.aws?

The Bottlerocket updater API utilizes TUF repository signed metadata served from a public URL to query and check for updates. The URL is https://updates.bottlerocket.aws and Bottlerocket's updater system requires access to this endpoint in order to perform updates.

Cluster nodes running in production environments with limited network access may not be able to reach this metadata endpoint, and you may see failures when updates are available or update statuses that appear "stuck".

To troubleshoot and validate this, access one of your Bottlerocket nodes in your cluster and manually execute apiclient update check. If you see errors that indicate Failed to check for updates along with network timeouts, ensure that the nodes in your cluster have access to https://updates.bottlerocket.aws.

If it is unacceptable to give your Bottlerocket nodes outside network access, you may consider creating a self-managed proxy within the cluster that periodically mirrors the TUF repository from https://updates.bottlerocket.aws (typically via tuftool download or tuftool clone). This also requires updating the settings.updates.metadata-base-url and settings.updates.targets-base-url settings to point to your proxy.
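
For example, if such a proxy were reachable inside your network, the relevant settings could be supplied via Bottlerocket user data like the snippet below (the proxy URL is purely illustrative):

[settings.updates]
metadata-base-url = "https://tuf-mirror.example.internal/metadata/"
targets-base-url = "https://tuf-mirror.example.internal/targets/"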

Users who are building their own Bottlerocket variants and TUF repositories will also need to update their Bottlerocket settings.updates settings to point to their custom TUF repository. Since Brupop simply interfaces with the node's apiclient via the daemonset, no further configuration or changes are required in Brupop. An in-depth discussion on building your own TUF repos can be found in the core Bottlerocket repo.

Why do only some of my Bottlerocket instances have an update available?

Updates to Bottlerocket are rolled out in waves to reduce the impact of issues; the container instances in your cluster may not all see updates at the same time. You can check whether an update is available on your instance by running the apiclient update check command from within the control or admin container.

Why do new container instances launch with older Bottlerocket versions?

The Bottlerocket update operator performs in-place updates for instances in your Kubernetes cluster. The operator does not influence how those instances are launched. If you use an auto-scaling group to launch your instances, you can update the AMI ID in your launch configuration or launch template to use a newer version of Bottlerocket.

How to Contribute and Develop Changes

Working on the update operator requires a fully configured and working Kubernetes cluster. Because the agent component relies on the Bottlerocket API to properly function, we suggest a cluster which is running Bottlerocket nodes. The integ crate can currently be used to launch Bottlerocket nodes into an Amazon EKS cluster to observe update-operator behavior.

Have a look at the design to learn more about how the update operator functions. Please feel free to open an issue with any questions!

Building and Deploying a Development Image

To regenerate the yaml manifest found at the root of this repository, simply run:

make manifest

Note: this requires helm to be installed.

Targets in the Makefile can assist in creating an image. The following command will build and tag an image using the local Docker daemon:

make brupop-image

Once this image is pushed to a container registry, you can set environment variables to regenerate a .yaml file suitable for deploying the image to Kubernetes. Firstly, modify the .env file to contain the desired image name, as well as a secret for pulling the image if necessary. Then run the following to regenerate the .yaml resource definitions:

cargo build -p deploy

These can of course be deployed using kubectl apply or the automatic integration testing tool integ.

Security

See CONTRIBUTING for more information.

License

This project is dual licensed under either the Apache-2.0 License or the MIT license, your choice.

bottlerocket-update-operator's Issues

0.2.0 controller: Create brupop cluster state machine

The brupop controller drives updates across nodes in the cluster by modifying the spec of BottlerocketNode objects associated with each node. To complete this issue, we must create the state machine structure, as well as brupop's logic for driving changes to the state machine based on the cluster's status.

Improve region-specific deployment story

Issue or Feature Request:

#43 adds the list of image repositories by region. The usage of these regional repositories should be improved for users. One way could be to eliminate the need to directly edit deployment data. There are methods to do this using the community's accepted tooling (e.g. helm charts). However, some fact finding and team deliberation is needed before committing to such tools.

Allow update operator to update ASG launch configuration after update

Image I'm using:
328549459982.dkr.ecr.us-west-2.amazonaws.com/bottlerocket-update-operator:v0.1.4

Issue or Feature Request:
If I deploy an ASG with bottlerocket 1.0.0 and the update operator updates instances to 1.1.0, I may run into problems as the ASG scales up.

New instances would have bottlerocket 1.0.0 and if tagged automatically will reboot from the operator once they join the cluster.
There should be a way to avoid needing to upgrade/reboot new instances by updating the launch template once a user-specified threshold of the ASG has been upgraded (e.g. 50% or 100%).

With in-place upgrades it may also be hard to track what version of the OS is deployed, because describing the instance via the AWS API will show an old AMI ID even though the running OS version is up to date.

0.2.0: Combine binaries into a single container image

The build for brupop 0.2.0 currently creates separate container images for each component (apiserver, controller, agent). We should create a single container image to simplify releases and deployment, either by compiling a single Rust binary or by including each binary in the same container image.

updater fails to upgrade to 1.0.8

I'm having trouble with the update operator not updating my nodes to 1.0.8. It's not noticing there is a new image to update.
I'm currently running: ami-05001739c6b8574a6: amazon/bottlerocket-aws-k8s-1.19-x86_64-v1.0.7-099d3398
but I can see ami-04269f2e9be10beeb: amazon/bottlerocket-aws-k8s-1.19-x86_64-v1.0.8-fee7e752 is available

all the nodes are showing:

"bottlerocket.aws/update-available": "false",
"bottlerocket.aws/updater-interface-version": "2.0.0"

neither the controller nor the agents are showing any errors.

kubectl version
Client Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.4", GitCommit:"e87da0bd6e03ec3fea7933c4b5263d151aafd07c", GitTreeState:"clean", BuildDate:"2021-02-21T20:21:49Z", GoVersion:"go1.15.8", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"19+", GitVersion:"v1.19.6-eks-49a6c0", GitCommit:"49a6c0bf091506e7bafcdb1b142351b69363355a", GitTreeState:"clean", BuildDate:"2020-12-23T22:10:21Z", GoVersion:"go1.15.5", Compiler:"gc", Platform:"linux/amd64"}

Set log format to plain

Image I'm using:

328549459982.dkr.ecr.us-west-2.amazonaws.com/bottlerocket-update-operator:v0.1.2

Issue or Feature Request:

The agent logs are written out with ansi control sequences because a PTY is attached for the agent pod (which is configured to have a TTY). This causes the logging library to select and use its richer user-oriented logging format - including, among other settings, colorized output by way of ansi control sequences. The output of the controller pod, which doesn't have a TTY configured, is plainly printed and friendlier to handle when debugging.

The agent should use the same plain output that the process in the controller's pod selects. Both processes should use an explicitly configured printing format to avoid subtle differences that may influence which format is chosen.

Add delay and smarter verification between node restarts

What I'd like:

Dogswatch should add some delay between the restart of Nodes in a cluster. During this time, the Controller should check in with the Node that has been updated to confirm that it has come up healthy and that Workloads have returned to it. After this, the Controller should have a configurable duration used to delay between each Node restart.

Increase observability

Issue or Feature Request:

I would love to have more observability into what the update operator is doing within our clusters.

The upgrade from 1.0.5 to 1.0.6 happened over the past week and I had little to no idea it was taking place.

Ideally the exposure of some prometheus metrics that provide the following information

  • The agent's current status (e.g. no updates pending, update pending, update complete)
  • The status of the wave execution (number of nodes completed, number remaining)

From this, a community-owned Grafana dashboard visualising this information could be created.

Node is not properly drained

Image I'm using:
328549459982.dkr.ecr.eu-west-1.amazonaws.com/bottlerocket-update-operator:v0.1.4

Deployment manifest:

apiVersion: v1
kind: Namespace
metadata:
  name: bottlerocket
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: bottlerocket-update-operator-controller
rules:
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get", "list", "watch", "update", "patch"]
  # Allow the controller to remove Pods running on the Nodes that are updating.
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: bottlerocket-update-operator-controller
subjects:
  - kind: ServiceAccount
    name: update-operator-controller
    namespace: bottlerocket
roleRef:
  kind: ClusterRole
  name: bottlerocket-update-operator-controller
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: bottlerocket-update-operator-agent
rules:
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get", "list", "watch", "update", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: bottlerocket-update-operator-agent
subjects:
  - kind: ServiceAccount
    name: update-operator-agent
    namespace: bottlerocket
roleRef:
  kind: ClusterRole
  name: bottlerocket-update-operator-agent
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: update-operator-controller
  namespace: bottlerocket
  annotations:
    kubernetes.io/service-account.name: update-operator-controller
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: update-operator-agent
  namespace: bottlerocket
  annotations:
    kubernetes.io/service-account.name: update-operator-agent
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: update-operator-controller
  namespace: bottlerocket
  labels:
    update-operator: controller
spec:
  replicas: 1
  strategy:
    rollingUpdate:
      maxUnavailable: 100%
  selector:
    matchLabels:
      update-operator: controller
  template:
    metadata:
      namespace: bottlerocket
      labels:
        update-operator: controller
        app: bottlerocket-update-operator
      annotations:
        log: "true"
    spec:
      serviceAccountName: update-operator-controller
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: bottlerocket.aws/updater-interface-version
                    operator: Exists
                  - key: "kubernetes.io/os"
                    operator: In
                    values:
                      - linux
                  - key: "kubernetes.io/arch"
                    operator: In
                    values:
                      - amd64
                      - arm64
        # Avoid update-operator's Agent Pods if possible.
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 10
              podAffinityTerm:
                topologyKey: bottlerocket.aws/updater-interface-version
                labelSelector:
                  matchExpressions:
                    - key: update-operator
                      operator: In
                      values: ["agent"]
      containers:
        - name: controller
          image: "328549459982.dkr.ecr.${AWS_REGION}.amazonaws.com/bottlerocket-update-operator:v0.1.4"
          imagePullPolicy: Always
          args:
            - -controller
            - -debug
            - -nodeName
            - $(NODE_NAME)
          env:
            - name: NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
---
# This DaemonSet is for Bottlerocket hosts that support updates through the Bottlerocket API (Bottlerocket OS versions >= v0.4.1)
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: update-operator-agent-update-api
  namespace: bottlerocket
  labels:
    update-operator: agent
spec:
  selector:
    matchLabels:
      update-operator: agent
  template:
    metadata:
      labels:
        update-operator: agent
        app: bottlerocket-update-operator
      annotations:
        log: "true"
    spec:
      serviceAccountName: update-operator-agent
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: "bottlerocket.aws/updater-interface-version"
                    operator: In
                    values:
                      - 2.0.0
                  - key: "kubernetes.io/os"
                    operator: In
                    values:
                      - linux
                  - key: "kubernetes.io/arch"
                    operator: In
                    values:
                      - amd64
                      - arm64
      hostPID: true
      containers:
        - name: agent
          image: "328549459982.dkr.ecr.${AWS_REGION}.amazonaws.com/bottlerocket-update-operator:v0.1.4"
          imagePullPolicy: Always
          args:
            - -agent
            - -debug
            - -nodeName
            - $(NODE_NAME)
          env:
            - name: NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
          resources:
            limits:
              memory: 50Mi
            requests:
              cpu: 10m
              memory: 50Mi
          volumeMounts:
            - name: bottlerocket-api-socket
              mountPath: /run/api.sock
          securityContext:
            seLinuxOptions:
              user: system_u
              role: system_r
              type: super_t
              level: s0
      volumes:
        - name: bottlerocket-api-socket
          hostPath:
            path: /run/api.sock
            type: Socket

Node info:

(screenshot of node info omitted)

Issue or Feature Request:

Hello, we have noticed that the node is not properly drained during an update. The update operator doesn't wait until all pods on the node are evicted and reboots the node immediately, which leads to service interruption. The eviction of pods is probably not started at all.

The operator logs could not drain with the error User \"system:serviceaccount:bottlerocket:update-operator-controller\" cannot get resource \"daemonsets\" in API group \"apps\" in the namespace \"kube-system\", followed by proceeding anyway; see more details below.

Update operator log during reboot:

2021-09-18T15:53:16.000Z bottlerocket-update-operator controller--b4c55546b-sp4br time="2021-09-18T15:53:15Z" level=error msg="could not drain" component=controller error="[cannot delete daemonsets.apps \"kube-proxy\" is forbidden: User \"system:serviceaccount:bottlerocket:update-operator-controller\" cannot get resource \"daemonsets\" in API group \"apps\" in the namespace \"kube-system\": kube-system/kube-proxy-n2hzd, cannot delete daemonsets.apps \"update-operator-agent-update-api\" is forbidden: User \"system:serviceaccount:bottlerocket:update-operator-controller\" cannot get resource \"daemonsets\" in API group \"apps\" in the namespace \"bottlerocket\": bottlerocket/update-operator-agent-update-api-pmv6k, cannot delete daemonsets.apps \"datadog-agent\" is forbidden: User \"system:serviceaccount:bottlerocket:update-operator-controller\" cannot get resource \"daemonsets\" in API group \"apps\" in the namespace \"datadog\": datadog/datadog-agent-cqx69, cannot delete daemonsets.apps \"calico-node\" is forbidden: User \"system:serviceaccount:bottlerocket:update-operator-controller\" cannot get resource \"daemonsets\" in API group \"apps\" in the namespace \"kube-system\": kube-system/calico-node-s8vrr, cannot delete daemonsets.apps \"fluentd-papertrail-containerd\" is forbidden: User \"system:serviceaccount:bottlerocket:update-operator-controller\" cannot get resource \"daemonsets\" in API group \"apps\" in the namespace \"kube-system\": kube-system/fluentd-papertrail-containerd-g29tm]" intent="reboot-update,perform-update,ready update:true" node=ip-10-233-157-101.eu-west-1.compute.internal worker=manager
2021-09-18T15:53:16.000Z bottlerocket-update-operator controller--b4c55546b-sp4br time="2021-09-18T15:53:15Z" level=warning msg="proceeding anyway" component=controller intent="reboot-update,perform-update,ready update:true" node=ip-10-233-157-101.eu-west-1.compute.internal worker=manager

If I add permissions to update controller:

- verbs:
      - get
      - list
    apiGroups:
      - apps
    resources:
      - daemonsets
      - replicasets

The following error is logged: cannot delete DaemonSet-managed Pods (use --ignore-daemonsets to ignore):

2021-09-18T16:17:37.000Z bottlerocket-update-operator controller--b4c55546b-sp4br time="2021-09-18T16:17:36Z" level=error msg="could not drain" component=controller error="cannot delete DaemonSet-managed Pods (use --ignore-daemonsets to ignore): bottlerocket/update-operator-agent-update-api-l5jvm, datadog/datadog-agent-tch22, kube-system/calico-node-dhn88, kube-system/fluentd-papertrail-containerd-p65zh, kube-system/kube-proxy-mvxcb" intent="reboot-update,perform-update,ready update:true" node=ip-10-233-156-93.eu-west-1.compute.internal worker=manager
2021-09-18T16:17:37.000Z bottlerocket-update-operator controller--b4c55546b-sp4br time="2021-09-18T16:17:36Z" level=warning msg="proceeding anyway" component=controller intent="reboot-update,perform-update,ready update:true" node=ip-10-233-156-93.eu-west-1.compute.internal worker=manager

Can I somehow configure the update operator to ignore daemonsets on drain?
thanks

Can we reduce the memory of 600Mi in the update-operator daemon sets configuration?

Feature Request:

The bottlerocket-update-operator is configured to request 600 MB of memory by default:

Can we reduce the memory to less than 600Mi for both the updog and the API DaemonSets in the file and deploy that?

I am sure it will work initially without any pods and load, but I am not sure whether we will face any issues in the future because of this modification.

Can we safely run it with less memory? I also want to know why 600 MB is recommended as the request for both of these DaemonSets.

0.2.0 agent: Implement `BottlerocketNode` Custom Resource Updates

The brupop agent must gather the current state of the system and then compare it to that stored in k8s as a custom resource. This task involves gathering the system state and comparing it to the existing resource, then making requests to the apiserver if any differences are detected.

Drop updog integration

Image I'm using:

n/a

Issue or Feature Request:

The updog platform integration needs to be deprecated and removed entirely. Nodes now should (and must) use the update-api to perform updates. Bottlerocket's SELinux policies prevent updog from functioning, and it is/was only suitable for use with images released during public preview.

  • log any remaining use of deprecated updog integration
  • remove updog integration code (in a later release)

Related

#47

Add more logs

Description
Currently, controller logs are very limited even with the debug flag on. Due to this, it's hard to understand what is happening between the labels being received and the labels being posted. Additional logs need to be added before starting each new step (update, cordon, drain, verify update, etc.) to better understand update progression and debug issues.

Current logs:

time="2021-07-26T15:55:30Z" level=debug msg="resource update event" component=controller worker=informer
time="2021-07-26T15:55:30Z" level=debug msg="handling event" component=controller node=ip-1-1-1-1.us-west-2.compute.internal worker=manager
time="2021-07-26T15:55:30Z" level=debug msg="not queuing duplicate intent" component=controller intent="reboot-update,reboot-update,ready update:true" node=ip-1-1-1-1.us-west-2.compute.internal worker=manager
time="2021-07-26T16:00:06Z" level=debug msg="resource update event" component=controller worker=informer
time="2021-07-26T16:00:06Z" level=debug msg="handling event" component=controller node=ip-2-2-2-2.us-west-2.compute.internal worker=manager
time="2021-07-26T16:00:06Z" level=debug msg="no action needed" component=controller intent="stabilize,stabilize,ready update:false" node=ip-2-2-2-2.us-west-2.compute.internal worker=manager
time="2021-07-26T16:00:26Z" level=debug msg="resource update event" component=controller worker=informer
time="2021-07-26T16:00:26Z" level=debug msg="handling event" component=controller node=ip-1-1-1-1.us-west-2.compute.internal worker=manager
time="2021-07-26T16:00:26Z" level=debug msg="queue intent" component=controller intent="reboot-update,reboot-update,ready update:true" node=ip-1-1-1-1.us-west-2.compute.internal worker=manager
time="2021-07-26T16:00:26Z" level=debug msg="checking with policy" component=controller intent="reboot-update,reboot-update,ready update:true" node=ip-1-1-1-1.us-west-2.compute.internal worker=manager
time="2021-07-26T16:00:26Z" level=debug msg="handling permitted intent" component=controller intent="reboot-update,reboot-update,ready update:true" node=ip-1-1-1-1.us-west-2.compute.internal worker=manager
time="2021-07-26T16:00:26Z" level=debug msg="handling successful update" component=controller intent="reboot-update,reboot-update,ready update:true" node=ip-1-1-1-1.us-west-2.compute.internal worker=manager
time="2021-07-26T16:00:26Z" level=debug msg="posted intent" component=controller intent="stabilize,unknown,unknown update:unknown" node=ip-1-1-1-1.us-west-2.compute.internal worker=manager
time="2021-07-26T16:00:26Z" level=debug msg="resource update event" component=controller worker=informer
time="2021-07-26T16:00:26Z" level=debug msg="handling event" component=controller node=ip-1-1-1-1.us-west-2.compute.internal worker=manager
time="2021-07-26T16:00:26Z" level=debug msg="intent is not yet realized" component=controller intent="stabilize,unknown,unknown update:unknown" node=ip-1-1-1-1.us-west-2.compute.internal worker=manager
time="2021-07-26T16:00:26Z" level=debug msg="resource update event" component=controller worker=informer
time="2021-07-26T16:00:26Z" level=debug msg="handling event" component=controller node=ip-1-1-1-1.us-west-2.compute.internal worker=manager
time="2021-07-26T16:00:26Z" level=debug msg="intent is not yet realized" component=controller intent="stabilize,stabilize,busy update:unknown" node=ip-1-1-1-1.us-west-2.compute.internal worker=manager
time="2021-07-26T16:00:36Z" level=debug msg="resource update event" component=controller worker=informer
time="2021-07-26T16:00:36Z" level=debug msg="handling event" component=controller node=ip-1-1-1-1.us-west-2.compute.internal worker=manager
time="2021-07-26T16:00:36Z" level=debug msg="no action needed" component=controller intent="stabilize,stabilize,ready update:false" node=ip-1-1-1-1.us-west-2.compute.internal worker=manager

Image I'm using:
0.1.4

Issue or Feature Request:
Issue

Consider combining license scan and build Dockerfiles

I'd like to see the separation in Dockerfiles go away later. Coordinating two separate image builds by way of a Makefile used to be a requirement prior to the advent of multi-stage Dockerfiles, but at this point multi-stage Dockerfiles are a more clear representation of the dependencies between build steps of container images. The fact that we're having this discussion and the need for extra documentation is an artifact of this being a non-standard pattern; I see this as a barrier to entry for new contributors who will need to read two Dockerfiles and a Makefile to understand how the image is built.

Originally posted by @samuelkarp in #6

Coordinate node readiness with pod workload checks and health reporting

What I'd like:

Dogswatch should check on the status of currently running Pod workloads on a Node before considering an update to be possible. The Controller should verify that the Pods that are about to be terminated are in a healthy state, so that the service (to be impacted) will remain available elsewhere in the cluster, prior to removing the workload from a Node.

Ideally, it would be configurable to conditionally handle the termination of transient Pods that are not controlled by higher-level schedulers (the likes of ReplicaSets or Deployments). I'm sure there are many other configurables that could easily come into play in this particular critical path - thought should be given to how it could be extended to handle additional considerations.

Add a chaotic (artificially busy & updating) mode

Add a mode to enable behavior testing of customer workloads when a cluster performs updates by constantly applying updates and committing them rapidly.

This needs discussion to determine the appropriate behavior but may be as simple as a noop update + reboot to "commit" the update with a constant delayed reporting of "update available".

upgrade left half of cluster cordoned

Image I'm using:
328549459982.dkr.ecr.us-west-2.amazonaws.com/bottlerocket-update-operator:v0.1.2

Issue or Feature Request:

After the recent release of Bottlerocket OS 0.3.1 my nodes started automatically upgrading as expected. After a couple of hours, all of the nodes were upgraded but half of them were still in the Ready,SchedulingDisabled state. I manually uncordoned the nodes and everything was fine after that.

I have attached the pod logs for the controller and agents.

pod.zip

Use SDK for building Go binaries

Issue or Feature Request:

The build process for the update operator should use the same Go toolchain used throughout Bottlerocket.

This can be accomplished by subbing in the SDK container where the golang:N.N image was being used before. The license scanner (#6) should also use this image when collecting the modules into the to-be-scanned vendor/ directory.

Suggested Solution

The "golang" image can be provided as an ARG to the build commands (as is proposed in #6 for other uses - see Dockerfile.licenses). I was able to make a set of changes to do this in short order (and still support docker build -t neio:latest . at the root). Here's a gist with the set of changes.

Provide a mechanism to prevent updates in certain time window

Description

Customers could have workloads that cannot be interrupted during some periods of the day. Therefore, a mechanism to prevent Bottlerocket node updates by the bottlerocket-update-operator during some time window is useful.

Workarounds

  1. Since the Bottlerocket update operator expects the Kubernetes label updater-interface-version on nodes before it starts the agent DaemonSet, customers can have an external mechanism to control the nodes' updater-interface-version label.
  2. Lock the version of all Bottlerocket hosts to the desired version by setting settings.updates.version-lock (details) and have a mechanism that changes the setting on nodes to latest whenever you would like to update your cluster.

Issue or Feature Request:
Feature Request

Remove mitigation code paths

Issue or Feature Request:

The "mitigations" functionality in the bottlerocket package, and its mitigations, are no longer needed. Remove its code and code paths that use it.

Use Bottlerocket SDK latest release

Issue or Feature Request:

The update operator is building with an older SDK; this should be updated to the latest version (SDK v0.14.0 at the time of writing).

This brings in a newer version of Go as well, so the fallback golang image and the GitHub action build should be updated to use 1.15.4 (in SDK v0.14.0).


  • Bump SDK container image ref
  • Run tests with builds (updated Go toolchain)
  • Reconfigure golang fallback image
  • Reconfigure GitHub Actions build to use matching version

Add getting started guide

Issue or Feature Request:

The repository should include directed, "getting started" documentation that walks through standing up the update operator in a cluster. This documentation should also provide pointers to some details (such as how the host influences updates and delays) as they're mentioned or relevant in the guide.

Project Status

Issue or Feature Request:

What is the status of this project? Is it still maintained and the recommended way to upgrade bottlerocket? There are a number of significant issues presented in the README and I'm wondering if there is a roadmap or timeline to address them.

In my view, the non-operator mechanisms for upgrading bottlerocket are quite convoluted so it would be nice to see this project get some attention.

If this project is not intended to continue, it would be nice to see efforts contributed to a project like System Upgrade Controller instead: https://github.com/rancher/system-upgrade-controller

v0.3.0 referenced broken deployment combination

Image I'm using:

328549459982.dkr.ecr.us-west-2.amazonaws.com/dogswatch:v0.1.2

Issue or Feature Request:

The agent is unable to perform any update and is not scheduled unless inappropriate labels are set (both thar.amazonaws.com/platform-version and bottlerocket.amazonaws.com have to be set).

This is due to the release including the updated (renamed / sed'd) deployment that references bottlerocket in place of thar. However, the container image - 328549459982.dkr.ecr.us-west-2.amazonaws.com/dogswatch:v0.1.2 - still has thar-named references. This must be addressed with another release that includes the changes from #3 and its documentation changes as well.

Idea: provide update metadata for agent usage

When performing updates, dogswatch is able to observe the cluster from the context of the orchestrated environment and on the individual hosts by way of the communicated annotations and Thar metadata. These details can be extended as needed to empower rich policy controls or to improve the stability of Thar hosts in any given cluster.

This is a list of ideas and interesting metadata that can be put to use immediately for a given update offered to/by the node (N.B. this list is not exhaustive, not prioritized, and not necessary today; a sketch of how such metadata might surface follows these lists):

  • version
  • variant
  • security brief
  • impacted keys
  • impacted versions
  • build brief
  • source
  • timestamp
  • migration brief
  • impacted keys
  • update phase group identifier

Aspirational data & niceties for feedback to improve stability/visibility:

  • update reporting URL

    Reporting for update decision making:

    • success
    • failure
    • withheld (for update policy)
    • withheld (for workload policy)
  • self reference URL (remote copy) to update metadata

    Helping to avoid overloading annotations with extensive data.
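Purely as an illustration of how some of the listed metadata might surface on a node: nothing below is an existing interface, and every key and value is hypothetical.

metadata:
  annotations:
    bottlerocket.aws/update-version: "0.3.2"        # hypothetical: version offered to the node
    bottlerocket.aws/update-wave: "2"               # hypothetical: update phase group identifier
    bottlerocket.aws/update-metadata-url: "<url>"   # hypothetical: self reference to the full metadata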

Remove `dogswatch-platform`

Issue or Feature Request:

The dogswatch-platform command is not used during build or runtime of Bottlerocket to determine the host's "platform-version" (#4) - as it was intended to be. This little cmd package should be removed in the meantime.

Clarify updater-interface-version

This is more of a documentation request. Would it be possible to clarify a few things regarding updater-interface-version in the README? It took some digging for me to figure these out. For example:

  1. 1.0.0 vs 2.0.0 (updog vs API). It's unclear what is version 1 and what is version 2, and why they are versioned. Simply using "updog" vs "api" instead would help; I see there was a comment about it in the pull request.
  2. When should one use Updog instead of API, and is there any reason to use "1.0.0" on Bottlerocket v1.0+? (See the example after this list.)
  3. In cases where API is fine, should the update-operator-agent-updog DaemonSet be removed from the K8s manifest? 600 MiB is a lot of memory for a container that runs on every node. And the manifest has two of those. That's 1.2 GiB, which is huge. See #47.
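As a point of reference, selecting the API-based interface on a node is done with the updater-interface-version label; a sketch, where the node name is a placeholder and 2.0.0 is (as I understand it) the value for the API-backed updater:

kubectl label node <node-name> bottlerocket.aws/updater-interface-version=2.0.0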

Change `context.TODO()` with better context handling

Image I'm using:
N/A

Issue or Feature Request:
Issue

Description

The Kubernetes Go client library was updated as part of PR-70, but the newer version requires a context in many of its APIs. The appropriate context was difficult to determine, so it was set to context.TODO() as part of PR-70. We should change it to more meaningful context handling.

brupop 0.2.0 apiserver: k8s token-based authorization

brupop agents will utilize token volume projection to receive a pod-unique token. The brupop apiserver should accept this data in a header for requests to modify any BottlerocketNode custom resources. It should use this token to determine the identity of the requester using the k8s Token APIs, and use that information to authorize the requests to modify said custom resources.
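A minimal sketch of how the agent pod could mount such a token with service account token volume projection; the volume name, audience, and expiry below are assumptions:

volumes:
  - name: agent-projected-token
    projected:
      sources:
        - serviceAccountToken:
            audience: brupop-apiserver    # audience the apiserver would be configured to expect
            expirationSeconds: 600
            path: token

The agent would then send the mounted token in a request header, and the apiserver would resolve it to the calling pod's identity before authorizing writes to that node's custom resource.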

Idea: transform intent into runnable behavior

Yeah, I think this is the best place to start improvements (alongside other coordination improvements). I really liked the idea you proposed of using Behavior Trees and really would like to see what an inspired implementation looks like. I think some hybrid that produced predefined Actions (with concrete types behind them) would be slick and enable some testing of responses other than of the subsequent mutations of the annotations.

I don't think we'll be able to get away from the 3 variables - for many reasons as we discussed: we need to be able to observe pending/active actions taken (for ACKs and monitoring) and also to be able to discern between the current (or reached) state and the desired next state. We may be able to collapse these into 2 annotations, though that may severely overload them; I'm open to the discussion and to changing these annotations. I'm interested to hear some alternative ideas we could put in place here.

Originally posted by @jahkeup in bottlerocket-os/bottlerocket#239

0.2.0 agent: Implement Fetching node-associated custom resource

This task is straightforward, but it also implies having k8s configured on the client. The apiserver is only needed for write operations to the BottlerocketNode custom resources. The node will need to implement fetching the status of its current resource in order to compare it to the state gathered on the node.
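As a sanity check outside of the agent code, the equivalent read could be done with kubectl; the resource name and namespace here are assumptions based on the BottlerocketNode kind mentioned above:

kubectl get bottlerocketnodes <node-name> -n <namespace> -o yaml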

Log no-update-available result from updater interface as non-error

Image I'm using:

328549459982.dkr.ecr.us-west-2.amazonaws.com/bottlerocket-update-operator:v0.1.2

Issue or Feature Request:

When the update operator agent checks for an update, the logged messages make the result look like an error - though it's an expected behavior/response from the call to the updater.

Instead of logging this as an error, it should be logged as a "no update available" result based on the returned status code and the updater's message.

pod-based constraints (blocking-pod-selector)

Issue or Feature Request:
Hello, is it possible to block the reboot of a node when a workload with certain labels is running on it?

A good example is https://github.com/weaveworks/kured, this tool has a flag --blocking-pod-selector that postpones reboot of a node until it has no pods with the specified labels scheduled.
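For comparison, kured's flag is used roughly like this (the selector itself is just an example):

# reboot of the node is postponed while any pod matching the selector is scheduled on it
kured --blocking-pod-selector=runtime=long,cost=expensive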

This feature would be great to prevent restarting of expensive tasks.

Thanks in advance!

Version update control with GitOps

Issue or Feature Request:

Is it possible to upgrade only when prompted by a change to a kubernetes manifest? Automated upgrades are great, but it would be nice to track and trigger these through some auditable process. Ideally we could have a Bottlerocket CRD and change only the version number, flavor or kubernetes version to trigger an upgrade.

My current understanding is that this is not possible?
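No such resource exists in the operator today, but purely as a sketch of the idea (the API group, kind, and every field below are hypothetical):

apiVersion: bottlerocket.aws/v1alpha1      # hypothetical API group/version
kind: BottlerocketUpdatePolicy             # hypothetical kind
metadata:
  name: cluster-update-policy
spec:
  targetVersion: "1.0.0"                   # bumping this value in Git would trigger the rollout
  variant: aws-k8s-1.16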

Automatically add label to Bottlerocket nodes for agent scheduling

What I'd like:

The update operator should automatically be eligible for scheduling onto Bottlerocket hosts in a Kubernetes cluster.

The suggested deployment uses a label to identify Bottlerocket hosts and schedule on them (i.e., the bottlerocket.aws/platform-version label; the name may change: #4). Instead of requiring the label to be set by administrators, the label could be set (or determined) automatically for Bottlerocket nodes to eliminate the manual step.
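The manual step this would eliminate is roughly the following (the node name and value are placeholders):

kubectl label node <node-name> bottlerocket.aws/platform-version=1.0.0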

Bottlerocket update-operator-controller, Status "Pending", nodes didn't match node selector

Image I'm using:
Cluster config derived from https://github.com/bottlerocket-os/bottlerocket/blob/develop/sample-eksctl.yaml


apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: clustername
  region: eu-central-1
  version: "1.16"

vpc:
  id: vpc-asdf
  clusterEndpoints:
    publicAccess:  true
    privateAccess: true
  securityGroup: sg-asdf
  sharedNodeSecurityGroup: sg-asdf
  subnets:
    private:
      eu-central-1a: { id: subnet-asdf}
      eu-central-1b: { id: subnet-asfd}
      eu-central-1c: { id: subnet-asdf}
    public:
      eu-central-1a: { id: subnet-asdf}
      eu-central-1b: { id: subnet-asdf}
      eu-central-1c: { id: subnet-asdf}

nodeGroups:
  - name: spot-group-bottlerocket
    labels:
      role: spot-worker
      spotfleet: "yes"
    minSize: 1
    desiredCapacity: 1
    amiFamily: Bottlerocket
    maxSize: 10
    availabilityZones:
    - eu-central-1a
    privateNetworking: true
    volumeEncrypted: true
    volumeType: gp2
    volumeSize: 50
    instancesDistribution:
      instanceTypes:
      - m5a.large
      - m5.large
      - m5d.large
      onDemandBaseCapacity: 0
      onDemandPercentageAboveBaseCapacity: 0
      spotInstancePools: 2
    securityGroups:
      attachIDs:
      - sg-asdf
      withLocal: false
      withShared: true
    iam:
      attachPolicyARNs:
      - arn:aws:iam::asdf
    bottlerocket:
      settings:
        motd: "Hello from eksctl!"
    tags:
      Projekt: EKS
      k8s.io/cluster-autoscaler/enabled: "true"
      k8s.io/cluster-autoscaler/clustername: owned
      bottlerocket.aws/updater-interface-version: 1

Flux custom template derived from https://github.com/bottlerocket-os/bottlerocket-update-operator/blob/develop/update-operator.yaml

apiVersion: v1
kind: Namespace
metadata:
  name: bottlerocket
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: bottlerocket-update-operator-controller
rules:
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get", "list", "watch", "update", "patch"]
  # Allow the controller to remove Pods running on the Nodes that are updating.
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: bottlerocket-update-operator-controller
subjects:
  - kind: ServiceAccount
    name: update-operator-controller
    namespace: bottlerocket
roleRef:
  kind: ClusterRole
  name: bottlerocket-update-operator-controller
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: bottlerocket-update-operator-agent
rules:
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get", "list", "watch", "update", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: bottlerocket-update-operator-agent
subjects:
  - kind: ServiceAccount
    name: update-operator-agent
    namespace: bottlerocket
roleRef:
  kind: ClusterRole
  name: bottlerocket-update-operator-agent
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: update-operator-controller
  namespace: bottlerocket
  annotations:
    kubernetes.io/service-account.name: update-operator-controller
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: update-operator-agent
  namespace: bottlerocket
  annotations:
    kubernetes.io/service-account.name: update-operator-agent
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: update-operator-controller
  namespace: bottlerocket
  labels:
    update-operator: controller
spec:
  replicas: 1
  strategy:
    rollingUpdate:
      maxUnavailable: 100%
  selector:
    matchLabels:
      update-operator: controller
  template:
    metadata:
      namespace: bottlerocket
      labels:
        update-operator: controller
    spec:
      serviceAccountName: update-operator-controller
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: bottlerocket.aws/updater-interface-version
                    operator: Exists
                  - key: "kubernetes.io/os"
                    operator: In
                    values:
                      - linux
                  - key: "kubernetes.io/arch"
                    operator: In
                    values:
                      - amd64
        # Avoid update-operator's Agent Pods if possible.
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 10
              podAffinityTerm:
                topologyKey: bottlerocket.aws/updater-interface-version
                labelSelector:
                  matchExpressions:
                    - key: update-operator
                      operator: In
                      values: ["agent"]
      containers:
      - name: controller
        image: "asdf.amazonaws.com/bottlerocket-update-operator:v0.1.3"
        imagePullPolicy: Always
        args:
          - -controller
          - -debug
          - -nodeName
          - $(NODE_NAME)
        env:
          - name: NODE_NAME
            valueFrom:
              fieldRef:
                fieldPath: spec.nodeName
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: update-operator-agent
  namespace: bottlerocket
  labels:
    update-operator: agent
spec:
  selector:
    matchLabels:
      update-operator: agent
  template:
    metadata:
      labels:
        update-operator: agent
    spec:
      serviceAccountName: update-operator-agent
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: bottlerocket.aws/updater-interface-version
                    operator: Exists
                  - key: "kubernetes.io/os"
                    operator: In
                    values:
                      - linux
                  - key: "kubernetes.io/arch"
                    operator: In
                    values:
                      - amd64
      hostPID: true
      containers:
        - name: agent
          image: "asdf.amazonaws.com/bottlerocket-update-operator:v0.1.3"
          imagePullPolicy: Always
          # XXX: tty required to exec binaries that use `simplelog` until https://github.com/bottlerocket-os/bottlerocket/issues/576 is resolved.
          tty: true
          args:
            - -agent
            - -debug
            - -nodeName
            - $(NODE_NAME)
          env:
            - name: NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
          securityContext:
            # Required for executing OS update operations.
            privileged: true
          resources:
            limits:
              memory: 600Mi
            requests:
              cpu: 100m
              memory: 600Mi
          volumeMounts:
            - name: rootfs
              mountPath: /.bottlerocket/rootfs
      volumes:
        - name: rootfs
          hostPath:
            path: /
            type: Directory

What I expected to happen:
update operator controller should be added to cluster and should be running

What actually happened:
update operator controller was added to cluster and never scheduled.

Pod
   Type     Reason             Age                   From                Message
   ----     ------             ----                  ----                -------
   Normal   NotTriggerScaleUp  3m37s (x61 over 13m)  cluster-autoscaler  pod didn't trigger scale-up (it wouldn't fit if a new node is added): 3 node(s) didn't match node selector
   Warning  FailedScheduling   44s (x11 over 13m)    default-scheduler   0/3 nodes are available: 3 node(s) didn't match node selector.

Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s

Node
Taints:             <none>

How to reproduce the problem:
Set up an EKS cluster v1.16 and use files like those above. The Pod is not scheduled.
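A likely cause, based on the config above (this is an assumption, not a confirmed diagnosis): the updater-interface-version key is set under tags:, which become EC2 instance tags, while the controller and agent node affinity requires a Kubernetes node label. Moving it under the node group's labels: should let the selector match, for example:

nodeGroups:
  - name: spot-group-bottlerocket
    labels:
      bottlerocket.aws/updater-interface-version: "1.0.0"   # value assumed; the affinity only checks that the key exists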

Change `platform-version` key name

Issue or Feature Request:

The platform-version used in the Label bottlerocket.aws/platform-version for picking compatible hosts to schedule an Operator's Agent Pod stands to be better named. The current name does not convey the Label's meaning and is overloaded in use (ex: used by EKS and Fargate). Whatever name platform-version is replaced with should make it clear to Cluster operators that it's a host integration compatibility version.

Some suggested keys:

Name                             Full Key
updater-interface-version        bottlerocket.aws/updater-interface-version
host-update-interface-version    bottlerocket.aws/host-update-interface-version
host-update-integration-version  bottlerocket.aws/host-update-integration-version

My preference: updater-interface-version

With updater-interface-version: bottlerocket.aws/updater-interface-version=1.0.0.

This naming would also nicely relate other keys, if needed. For example, if this were the selector for a chosen backing "updater" implementation: bottlerocket.aws/updater-implementation=noop. I don't think we'll need anything of the sort soon, but it's a nice thought and reads as a very straightforward relationship to me.

Native builder required for targeting different ARCH

Image I'm using:

n/a

Issue and Feature Request:

The build process is unable to produce an arm64v8 architecture specific image. The build's Dockerfile uses a scratch based container image and, on my machine at least, the requested architecture is not set for this image (with ARCH=arm64).

docker build uses the native architecture for scratch images and does not support setting the platform for the builder. Using Docker's buildx works when setting the platform at the top of the build:

docker buildx build --platform linux/arm64 ... # existing arguments

However, this would require users to install the buildx plugin and isn't a reasonable ask.

This issue should define a solution for correctly building an image that targets a non-native architecture, OR take steps to set the image's correct architecture at/following build.

brupop 0.2.0 apiserver: Create/update node endpoints

Create the web endpoints that agents use to modify the custom resources associated with their nodes. The APIServer should accept JSON-encoded requests containing the BottlerocketNode object state and update the correct custom resource so that its status reflects that state.

0.2.0: Remove openssl build from brupop's Dockerfile

We should integrate our openssl compiled against musl into the Bottlerocket SDK image, or otherwise utilize a build process which will allow us to properly dynamically link against openssl at runtime.

One stopgap option could be to dynamically link against the openssl currently in the SDK image, and do away with our scratch-based resulting image for the time being.
